summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Target
Commit message (Collapse)AuthorAgeFilesLines
...
* Mips: Don't create copy of nothingMatt Arsenault2019-03-211-2/+0
| | | | | | | | | This was creating a copy of the register the pseudo itself was def'ing, leaving a copy of an undefined register. I'm not sure how the verifier is not catching this, but this avoids asserting in a future change to RegAllocFast llvm-svn: 356716
* [AArch64] Update for ExynosEvandro Menezes2019-03-211-1/+1
| | | | | | Fix the feature set for Exynos M4 by removing support for `+fp16fml` and fix test case. llvm-svn: 356698
* [X86] canonicalizeBitSelect - don't attempt to canonicalize mask registersSimon Pilgrim2019-03-211-1/+1
| | | | | | | | We don't use X86ISD::ANDNP for mask registers. Test case from @craig.topper (Craig Topper) llvm-svn: 356696
* [X86] Don't avoid folding multiple use sign extended 8-bit immediate into ↵Craig Topper2019-03-213-18/+5
| | | | | | | | | | | | instructions under optsize. Under optsize we try to avoid folding immediates into instructions under optsize. But if the immediate is 16-bits or 32 bits, but can be encoded as an 8-bit immediate we don't save enough from disabling the folding unless the immediate has enough uses to make up for the size of the move which is either 3 bytes or 5 bytes since there are no sign extended 8-bit moves. We would also save something if the immediate was a live out of the basic block and thus a move was unavoidable, but that would require a more advanced heuristic than just counting uses. Note we only avoid folding multiple use immediates into the patterns that use X86ISD::ADD/SUB/XOR/OR/AND/CMP/ADC/SBB nodes and not the more common ISD::ADD/SUB/XOR/OR/AND nodes. Differential Revision: https://reviews.llvm.org/D59522 llvm-svn: 356688
* [ScalarizeMaskedMemIntrin] Add support for scalarizing expandload and ↵Craig Topper2019-03-212-0/+30
| | | | | | | | | | | | | | compressstore intrinsics. This adds support for scalarizing these intrinsics as well the X86TargetTransformInfo support to avoid scalarizing them in the cases X86 can handle. I've omitted handling special cases for constant masks for this first pass. Though CodeGenPrepare can constant fold the branch conditions and remove some of the control flow anyway. Fixes PR40994 and is covers most of PR3666. Might want to implement constant masks to close that. Differential Revision: https://reviews.llvm.org/D59180 llvm-svn: 356687
* [Thumb] Fix infinite loop in ABS expansion (PR41160)Simon Pilgrim2019-03-211-1/+4
| | | | | | Don't expand ISD::ABS node if its legal. llvm-svn: 356661
* [AMDGPU] Support for v3i32/v3f32Tim Renouf2019-03-2112-52/+279
| | | | | | | | | | | | | | | Added support for dwordx3 for most load/store types, but not DS, and not intrinsics yet. SI (gfx6) does not have dwordx3 instructions, so they are not enabled there. Some of this patch is from Matt Arsenault, also of AMD. Differential Revision: https://reviews.llvm.org/D58902 Change-Id: I913ef54f1433a7149da8d72f4af54dbb13436bd9 llvm-svn: 356659
* [AArch64] Allow -mattr=tpidr-el[1|2|3]Oliver Stannard2019-03-213-0/+21
| | | | | | | | | | | Added subtarget features for AArch64 to use TPIDR_EL[1|2|3] as the TLS base register, rather than the default TPIDR_EL0. Patch by Philip Derrin! Differential revision: https://reviews.llvm.org/D54685 llvm-svn: 356657
* [X86] Add CMPXCHG8B feature flag. Set it for all CPUs except i386/i486 ↵Craig Topper2019-03-205-48/+92
| | | | | | | | | | | | including 'generic'. Disable use of CMPXCHG8B when this flag isn't set. CMPXCHG8B was introduced on i586/pentium generation. If its not enabled, limit the atomic width to 32 bits so the AtomicExpandPass will expand to lib calls. Unclear if we should be using a different limit for other configs. The default is 1024 and experimentation shows that using an i256 atomic will cause a crash in SelectionDAG. Differential Revision: https://reviews.llvm.org/D59576 llvm-svn: 356631
* [WebAssembly][NFC] Fix formatting error from rL356610Thomas Lively2019-03-201-2/+3
| | | | llvm-svn: 356622
* [AMDGPU] Do not generate spurious PAL metadataTim Renouf2019-03-202-6/+10
| | | | | | | | | | | | My previous fix rL356591 "[AMDGPU] Added MsgPack format PAL metadata" accidentally caused a spurious PAL metadata .note record to be emitted for any AMDGPU output. That caused failures in the lld test amdgpu-relocs.s. Fixed. Differential Revision: https://reviews.llvm.org/D59613 Change-Id: Ie04a2aaae890dcd490f22c89edf9913a77ce070e llvm-svn: 356621
* [X86] Call lowerShuffleAsBitMask for 512-bit vectors in lowerShuffleAsBlend.Craig Topper2019-03-201-11/+43
| | | | | | | | | | | | | | | | | This patch enables the use of lowerShuffleAsBitMask for 512-bit blends before falling back to move immedate, GPR to k-register, and masked op. I had to make some changes to support v8i64 when i64 is not a legal type. And to support floating point types. This trades a load for the move immediate and GPR move which is higher latency. But its probably better for register pressure not having to hop through other register classes. The load+and should play better with LICM and rematerialization I think. Differential Revision: https://reviews.llvm.org/D59479 llvm-svn: 356618
* [AMDGPU] Fix dependency on `BinaryFormat`Michael Liao2019-03-201-1/+1
| | | | | | | | | | | | Summary: - The linking is broken when this library is built as shared one. Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D59610 llvm-svn: 356617
* AMDGPU: Don't look for constant in insert/extract_vector_elt regbankselectMatt Arsenault2019-03-201-44/+19
| | | | | | | | | The constantness shouldn't change the register bank choice. We also don't need to restrict this to only indexing VGPRs, since it's possible to index SGPRs (but SelectionDAG made using this difficult). Allow directly indexing SGPRs when appropriate. llvm-svn: 356611
* [WebAssembly] Target features sectionThomas Lively2019-03-202-0/+59
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Implements a new target features section in assembly and object files that records what features are used, required, and disallowed in WebAssembly objects. The linker uses this information to ensure that all objects participating in a link are feature-compatible and records the set of used features in the output binary for use by optimizers and other tools later in the toolchain. The "atomics" feature is always required or disallowed to prevent linking code with stripped atomics into multithreaded binaries. Other features are marked used if they are enabled globally or on any function in a module. Future CLs will add linker flags for ignoring feature compatibility checks and for specifying the set of allowed features, implement using the presence of the "atomics" feature to control the type of memory and segments in the linked binary, and add front-end flags for relaxing the linkage policy for atomics. Reviewers: aheejin, sbc100, dschuff Subscribers: jgravelle-google, hiraditya, sunfish, mgrang, jfb, jdoerfert, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D59173 llvm-svn: 356610
* [AMDGPU] Fix clamp bit DAG operandMichael Liao2019-03-201-5/+8
| | | | | | | | | | | | | | | | | | Summary: - Should use `targetconstant` instead of `constant` operand for clamp bit, which is expected as an immediate operand. Under certain conditions, such as a common `i1 false` constant is used in other place and selected before the instruction with clamp bit, register operand may be added instead of immediate one. Use `targetcosntant` to enforce that. Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D59608 llvm-svn: 356608
* [ARC] Add ARCOptAddrMode pass to generate postincrement loads/stores.Pete Couperus2019-03-205-2/+514
| | | | | | | | | | Build on newly introduced ARC postincrement loads/stores from r356200. Patch By Denis Antrushin! <denis@synopsys.com> Differential Revision: https://reviews.llvm.org/D59409 llvm-svn: 356606
* AMDHSA: Fix COMPUTE_PGM_RSRC2.USER_SGPR calculation when parsing ISA assemblyKonstantin Zhuravlyov2019-03-201-7/+7
| | | | | | | | It must match https://llvm.org/docs/AMDGPUUsage.html#initial-kernel-execution-state Differential Revision: https://reviews.llvm.org/D59570 llvm-svn: 356603
* [ARM] Eliminate redundant "mov rN, sp" instructions in Thumb1.Eli Friedman2019-03-201-13/+47
| | | | | | | | | | | | | | | | | | | | | | | | | | | | This takes sequences like "mov r4, sp; str r0, [r4]", and optimizes them to something like "str r0, [sp]". For regular stack variables, this optimization was already implemented: we lower loads and stores using frame indexes, which are expanded later. However, when constructing a call frame for a call with more than four arguments, the existing optimization doesn't apply. We need to use stores which are actually relative to the current value of sp, and don't have an associated frame index. This patch adds a special case to handle that construct. At the DAG level, this is an ISD::STORE where the address is a CopyFromReg from SP (plus a small constant offset). This applies only to Thumb1: in Thumb2 or ARM mode, a regular store instruction can access SP directly, so the COPY gets eliminated by existing code. The change to ARMDAGToDAGISel::SelectThumbAddrModeSP is a related cleanup: we shouldn't pretend that it can select anything other than frame indexes. Differential Revision: https://reviews.llvm.org/D59568 llvm-svn: 356601
* [AMDGPU] Added MsgPack format PAL metadataTim Renouf2019-03-205-55/+636
| | | | | | | | | | | | | | Summary: PAL metadata now supports both the old linear reg=val pairs format and the new MsgPack format. The MsgPack format uses YAML as its textual representation. On output to YAML, a mnemonic name is provided for some hardware registers. Differential Revision: https://reviews.llvm.org/D57028 Change-Id: I2bbaabaaca4b3574f7e03b80fbef7c7a69d06a94 llvm-svn: 356591
* [AMDGPU] Factored PAL metadata handling out into its own classTim Renouf2019-03-208-129/+345
| | | | | | | | | | | | | | | | | | | | | | Summary: This commit introduces a new AMDGPUPALMetadata class that: * is inside the AMDGPU target; * keeps an in-memory representation of PAL metadata; * provides a method to read the frontend-supplied metadata from LLVM IR; * provides methods for the asm printer to set metadata items; * provides methods to write the metadata as a binary blob to put in a .note record or as an asm directive; * provides a method to read the metadata as a binary blob from a .note record. Because llvm-readobj cannot call directly into a target, I had to remove llvm-readobj's ability to dump PAL metadata, pending a resolution to https://reviews.llvm.org/D52821 Differential Revision: https://reviews.llvm.org/D57027 Change-Id: I756dc830894fcb6850324cdcfa87c0120eb2cf64 llvm-svn: 356582
* [AMDGPU][MC] Corrected checks for DS offset0 rangeDmitry Preobrazhensky2019-03-201-1/+1
| | | | | | | | | | See bug 40889: https://bugs.llvm.org/show_bug.cgi?id=40889 Reviewers: artem.tamazov, arsenm Differential Revision: https://reviews.llvm.org/D59313 llvm-svn: 356576
* [AMDGPU][MC][GFX9] Added support of operands shared_base, shared_limit, ↵Dmitry Preobrazhensky2019-03-206-8/+76
| | | | | | | | | | | | private_base, private_limit, pops_exiting_wave_id See bug 39297: https://bugs.llvm.org/show_bug.cgi?id=39297 Reviewers: artem.tamazov, arsenm, rampitec Differential Revision: https://reviews.llvm.org/D59290 llvm-svn: 356561
* [X86] Use getConstantOperandAPInt to detect out-of-range shifts.Simon Pilgrim2019-03-201-5/+5
| | | | llvm-svn: 356549
* [X86] Remove X86 specific dag nodes for RDTSC/RDTSCP/RDPMC. NFCIAndrea Di Biagio2019-03-205-144/+66
| | | | | | | | | | | | | | | | | | | | | | | | This patch removes the following dag node opcodes from namespace X86ISD: RDTSC_DAG, RDTSCP_DAG, RDPMC_DAG The logic that expands RDTSC/RDPMC/XGETBV intrinsics is basically the same. The only differences are: RDTSC/RDTSCP don't implicitly read ECX. RDTSCP also implicitly writes ECX. I moved the common expansion logic into a helper function with the goal to get rid of code repetition. That helper is now used for the expansion of RDTSC/RDTSCP/RDPMC/XGETBV intrinsics. No functional change intended. Differential Revision: https://reviews.llvm.org/D59547 llvm-svn: 356546
* [AMDGPU] Allow MIMG with no uses in adjustWritemask in iselDavid Stuttard2019-03-201-0/+4
| | | | | | | | | | | | | | | | | | | Summary: If an MIMG instruction has managed to get through to adjustWritemask in isel but has no uses (and doesn't enable TFC) then prevent an assertion by not attempting to adjust the writemask. The instruction will be removed anyway. Change-Id: I9a5dba6bafe1f35ac99c1b73df390936e2ac27a7 Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, tpr, t-tye, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D58964 llvm-svn: 356540
* [X86] Re-disable cmpxchg16b for 32-bit mode assembly parsing.Craig Topper2019-03-192-3/+4
| | | | | | This was broken recently when I factored the 64 bit mode check into hasCmpxchg16 without thinking about the AssemblerPredicate. llvm-svn: 356531
* [ARM] Make sure to save/restore LR when we use tBfar.Eli Friedman2019-03-193-3/+19
| | | | | | | | | | | | | | | | This change does two things. One, it ensures compilation will abort instead of miscompiling if ARMFrameLowering::determineCalleeSaves chooses not to save LR in a case where it's necessary. Two, it changes the way we estimate the size of a function to be more conservative in the presence of constant pool entries and jump tables. EstimateFunctionSizeInBytes probably still isn't really conservative enough, but I'm not sure how we can come up with a reliable estimate before constant islands runs. Differential Revision: https://reviews.llvm.org/D59439 llvm-svn: 356527
* [AArch64][GlobalISel] Add an optimization to select vector DUP instructions.Amara Emerson2019-03-191-0/+105
| | | | | | | | | This adds pattern matching for the insert+shufflevector sequence so we can generate dup instructions instead of the current TBL sequence. Differential Revision: https://reviews.llvm.org/D59558 llvm-svn: 356526
* [AArch64][GlobalISel] Make v4s32 G_IMPLICIT_DEF legal.Amara Emerson2019-03-191-1/+1
| | | | llvm-svn: 356525
* CodeGen: Refactor regallocator command line and target selectionMatt Arsenault2019-03-193-13/+24
| | | | | | | | | | This will allow targets more flexibility to replace the register allocator core passes. In a future commit, AMDGPU will run the core register assignment passes twice, and will also want to disallow using the standard -regalloc option. llvm-svn: 356506
* Fix for ABS legalization on PPC buildbot.Simon Pilgrim2019-03-191-2/+3
| | | | llvm-svn: 356498
* [X86][SSE] SimplifyDemandedVectorEltsForTargetNode - handle repeated shift ↵Simon Pilgrim2019-03-191-1/+11
| | | | | | | | amounts If a value with multiple uses is only ever used for SSE shift amounts then we know that only the bottom 64-bits are needed. llvm-svn: 356483
* [MIPS][microMIPS] Enable dynamic stack realignmentSimon Atanasyan2019-03-191-2/+2
| | | | | | | | | | | | Dynamic stack realignment was disabled on micromips by checking if target has standard encoding. We simply change the condition to skip Mips16 only. Patch by Mirko Brkusanin. Differential Revision: http://reviews.llvm.org/D59499 llvm-svn: 356478
* [NFC] Fix unused variable in release buildsJordan Rupprecht2019-03-191-2/+2
| | | | | | This was introduced in rL356468. llvm-svn: 356477
* Fix unused variable warning. NFCI.Simon Pilgrim2019-03-191-2/+1
| | | | llvm-svn: 356474
* [SelectionDAG] Handle unary SelectPatternFlavor for ABS case in ↵Simon Pilgrim2019-03-194-66/+97
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | SelectionDAGBuilder::visitSelect These changes are related to PR37743 and include: SelectionDAGBuilder::visitSelect handles the unary SelectPatternFlavor::SPF_ABS case to build ABS node. Delete the redundant recognizer of the integer ABS pattern from the DAGCombiner. Add promoting the integer ABS node in the LegalizeIntegerType. Expand-based legalization of integer result for the ABS nodes. Expand-based legalization of ABS vector operations. Add some integer abs testcases for different typesizes for Thumb arch Add the custom ABS expanding and change the SAD pattern recognizer for X86 arch: The i64 result of the ABS is expanded to: tmp = (SRA, Hi, 31) Lo = (UADDO tmp, Lo) Hi = (XOR tmp, (ADDCARRY tmp, hi, Lo:1)) Lo = (XOR tmp, Lo) The "detectZextAbsDiff" function is changed for the recognition of pattern with the ABS node. Given a ABS node, detect the following pattern: (ABS (SUB (ZERO_EXTEND a), (ZERO_EXTEND b))). Change integer abs testcases for codegen with the ABS node support for AArch64. Indicate that the ABS is legal for the i64 type when the NEON is supported. Change the integer abs testcases to show changing of codegen. Add combine and legalization of ABS nodes for Thumb arch. Extend 'matchSelectPattern' to recognize the ABS patterns with ICMP_SGE condition. For discussion, see https://bugs.llvm.org/show_bug.cgi?id=37743 Patch by: @ikulagin (Ivan Kulagin) Differential Revision: https://reviews.llvm.org/D49837 llvm-svn: 356468
* [AMDGPU] Add buffer/load 8/16 bit overloaded intrinsicsRyan Taylor2019-03-196-1/+170
| | | | | | | | | | | | | | | Summary: Add buffer store/load 8/16 overloaded intrinsics for buffer, raw_buffer and struct_buffer Change-Id: I166a29f071b2ff4e4683fb0392564b1f223ac61d Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D59265 llvm-svn: 356465
* [AMDGPU] Ban i8 min3 promotion.Neil Henning2019-03-191-3/+3
| | | | | | | | | | | | | | | | I found this really weird WWM-related case whereby through the WWM transformations our isel lowering was trying to promote 2 min's into a min3 for the i8 type, which our hardware doesn't support. The new min3_i8.ll test case would previously spew the error: PromoteIntegerResult #0: t69: i8 = SMIN3 t70, Constant:i8<0>, t68 Before the simple fix to our isel lowering to not do it for i8 MVT's. Differential Revision: https://reviews.llvm.org/D59543 llvm-svn: 356464
* [mips] Fix crash on recursive using of .setSimon Atanasyan2019-03-191-10/+9
| | | | | | | | | | | | | Switch to the `MCParserUtils::parseAssignmentExpression` for parsing assignment expressions in the `.set` directive reduces code and allows to print an error message instead of crashing in case of incorrect recursive using of the `.set`. Fix for the bug https://bugs.llvm.org/show_bug.cgi?id=41053. Differential Revision: http://reviews.llvm.org/D59452 llvm-svn: 356461
* [WebAssembly] Small improvements in FixIrreducibleControlFlow (NFC)Heejin Ahn2019-03-191-24/+16
| | | | | | | | | | | | | | | | | Summary: - Make some class member methods const - Delete unnecessary includes - Use a simpler form of `BuildMI` Reviewers: kripken Subscribers: dschuff, sbc100, jgravelle-google, sunfish, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D59454 llvm-svn: 356440
* [WebAssembly] Rename methods according to instruction name changes (NFC)Heejin Ahn2019-03-191-7/+7
| | | | | | | | | | | | Reviewers: tlively, sbc100 Subscribers: dschuff, jgravelle-google, sunfish, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D59469 llvm-svn: 356438
* [WebAssembly] Lower SIMD nnan setcc nodesThomas Lively2019-03-191-10/+12
| | | | | | | | | | | | | | | | | | Summary: Adds patterns to lower all the remaining setcc modes: lt, gt, le, and ge. Fixes PR40912. Reviewers: aheejin, sbc100, dschuff Reviewed By: dschuff Subscribers: jgravelle-google, hiraditya, sunfish, jdoerfert, llvm-commits, srj Tags: #llvm Differential Revision: https://reviews.llvm.org/D59519 llvm-svn: 356431
* [X86] Disable CQTO and CLTQ instructions in the assembly parser outside ↵Craig Topper2019-03-181-2/+2
| | | | | | 64-bit mode. llvm-svn: 356419
* [X86] Allow any 8-bit immediate to be used with BT/BTC/BTR/BTS not just sign ↵Craig Topper2019-03-181-30/+46
| | | | | | | | extended 8-bit immediates. We need to allow [128,255] in addition to [-128, 127] to match gas. llvm-svn: 356413
* [WebAssembly] Don't override default implementation of isOffsetFoldingLegal. ↵Sam Clegg2019-03-183-8/+1
| | | | | | | | | | | | | | | NFC. The default implementation does we want and is going to more compatible with dynamic linking (-fPIC) support that is planned. This is NFC because currently we only build wasm with `-relocation-model=static` which in turn means that the default `isOffsetFoldingLegal` always returns true today. Differential Revision: https://reviews.llvm.org/D54661 llvm-svn: 356410
* [X86] Use relocImm in the ROL8ri/ROL16ri/ROL32ri/ROL64ri patterns to be ↵Craig Topper2019-03-181-4/+6
| | | | | | consistent with the ROR patterns. llvm-svn: 356407
* [X86] Replace uses of i64immSExt32_su with i64relocImmSExt32_su.Craig Topper2019-03-183-7/+2
| | | | | | For the i8, i16, and i32 instructions we were using a relocImm. Presumably we should for i64 as well. llvm-svn: 356406
* [AMDGPU] Enable code selection using `s_mul_hi_u32`/`s_mul_hi_i32`.Michael Liao2019-03-182-2/+10
| | | | | | | | | | Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D59501 llvm-svn: 356405
* [AMDGPU] Asm/disasm clamp modifier on vop3 int arithmeticTim Renouf2019-03-186-33/+69
| | | | | | | | | | | | | | Allow the clamp modifier on vop3 int arithmetic instructions in assembly and disassembly. This involved adding a clamp operand to the affected instructions in MIR and MC, and thus having to fix up several places in codegen and MIR tests. Differential Revision: https://reviews.llvm.org/D59267 Change-Id: Ic7775105f02a985b668fa658a0cd7837846a534e llvm-svn: 356399
OpenPOWER on IntegriCloud