path: root/llvm/lib/Target
Commit message | Author | Age | Files | Lines
...
* [AMDGPU] gfx908 hazard recognizer | Stanislav Mekhanoshin | 2019-07-11 | 2 | -1/+233
    Differential Revision: https://reviews.llvm.org/D64593
    llvm-svn: 365829
* [AMDGPU] gfx908 scheduling | Stanislav Mekhanoshin | 2019-07-11 | 3 | -0/+163
    Differential Revision: https://reviews.llvm.org/D64590
    llvm-svn: 365826
* [AMDGPU] gfx908 mfma support | Stanislav Mekhanoshin | 2019-07-11 | 16 | -62/+548
    Differential Revision: https://reviews.llvm.org/D64584
    llvm-svn: 365824
* [WebAssembly] Assembler: support negative float constants. | Wouter van Oortmerssen | 2019-07-11 | 1 | -12/+27
    Reviewers: dschuff
    Subscribers: sbc100, jgravelle-google, aheejin, sunfish, llvm-commits
    Tags: #llvm
    Differential Revision: https://reviews.llvm.org/D64367
    llvm-svn: 365802
* [NVPTX] Use atomicrmw fadd instead of intrinsics | Benjamin Kramer | 2019-07-11 | 3 | -21/+12
    AutoUpgrade the old intrinsics to atomicrmw fadd.
    llvm-svn: 365796
* [X86] Merge negated ISD::SUB nodes into X86ISD::SUB equivalent (PR40483) | Sanjay Patel | 2019-07-11 | 1 | -7/+7
    Follow up to D58597, where it was noted that the commuted ISD::SUB variant was having problems with lack of combines.
    See also D63958 where we untangled setcc/sub pairs.
    Differential Revision: https://reviews.llvm.org/D58875
    llvm-svn: 365791
* AMDGPU/GlobalISel: Move kernel argument handling to separate function | Matt Arsenault | 2019-07-11 | 2 | -42/+61
    llvm-svn: 365782
* OpaquePtr: switch to GlobalValue::getValueType in a few places. NFC. | Tim Northover | 2019-07-11 | 1 | -5/+2
    llvm-svn: 365770
* [X86] -fno-plt: use GOT __tls_get_addr only if GOTPCRELX is enabled | Fangrui Song | 2019-07-11 | 1 | -1/+8
    Summary:
    As of binutils 2.32, ld has a bogus TLS relaxation error when the GD/LD code sequence using R_X86_64_GOTPCREL (instead of R_X86_64_GOTPCRELX) is attempted to be relaxed to IE/LE (binutils PR24784). gold and lld are good.

    In gcc/config/i386/i386.md, there is a configure-time check of as/ld support and the GOT relaxation will not be used if as/ld doesn't support it:

        if (flag_plt || !HAVE_AS_IX86_TLS_GET_ADDR_GOT)
          return "call\t%P2";
        return "call\t{*%p2@GOT(%1)|[DWORD PTR %p2@GOT[%1]]}";

    In clang, -DENABLE_X86_RELAX_RELOCATIONS=OFF is the default. The ld.bfd bogus error can be reproduced with:

        thread_local int a;
        int main() { return a; }

        clang -fno-plt -fpic a.cc -fuse-ld=bfd

    GOTPCRELX gained relatively good support in 2016, which is considered relatively new. It is even difficult to conditionally default to -DENABLE_X86_RELAX_RELOCATIONS=ON due to cross compilation reasons. So work around the ld.bfd bug by only using GOT when GOTPCRELX is enabled.

    Reviewers: dalias, hjl.tools, nikic, rnk
    Reviewed By: nikic
    Subscribers: llvm-commits
    Tags: #llvm
    Differential Revision: https://reviews.llvm.org/D64304
    llvm-svn: 365752
* [ARM][LowOverheadLoops] Correct offset checking | Sam Parker | 2019-07-11 | 3 | -11/+29
    This patch addresses a couple of problems:
    1) The maximum supported offset of LE is -4094.
    2) The offset of WLS also needs to be checked, this uses a maximum positive offset of 4094.
    The use of BasicBlockUtils has been changed because the block offsets weren't being initialised, but the isBBInRange checks both positive and negative offsets.
    ARMISelLowering has been tweaked because the test case presented another pattern that we weren't supporting.
    llvm-svn: 365749
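The two limits named in the ARM LowOverheadLoops entry above can be read as a pair of range checks. The following is only a sketch under stated assumptions: the function and constant names are hypothetical, the offset is taken as (target address - branch address) in bytes, and treating the bounds as inclusive is an assumption rather than something the commit message states.

```cpp
#include <cassert>
#include <cstdint>

// Limits quoted in the commit message; inclusive bounds are an assumption.
constexpr int64_t MaxBackwardOffsetLE = -4094; // LE branches backwards
constexpr int64_t MaxForwardOffsetWLS = 4094;  // WLS branches forwards

// Hypothetical helper: LE may only branch backwards, by at most 4094 bytes.
bool isLEOffsetInRange(int64_t Offset) {
  return Offset <= 0 && Offset >= MaxBackwardOffsetLE;
}

// Hypothetical helper: WLS may only branch forwards, by at most 4094 bytes.
bool isWLSOffsetInRange(int64_t Offset) {
  return Offset >= 0 && Offset <= MaxForwardOffsetWLS;
}
```

Keeping the two directions in separate predicates mirrors the point of the patch: a single symmetric range check would accept offsets that one of the two instructions cannot encode.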
* [ARM] Remove nonexistent unsigned forms of MVE VQDMLAH. | Simon Tatham | 2019-07-11 | 1 | -3/+0
    The VQDMLAH.U8, VQDMLAH.U16 and VQDMLAH.U32 instructions don't actually exist: the Armv8.1-M architecture spec only lists signed forms of that instruction. The unsigned ones were added in error: they existed in an early draft of the spec, but they were removed before the public version, and we missed that particular spec change.
    Also affects the variant forms VQDMLASH, VQRDMLAH and VQRDMLASH.
    Reviewers: miyuki
    Subscribers: javed.absar, kristof.beyls, hiraditya, dmgreen, llvm-commits
    Tags: #llvm
    Differential Revision: https://reviews.llvm.org/D64502
    llvm-svn: 365747
* [MIPS GlobalISel] Skip copies in addUseDef and addDefUses | Petar Avramovic | 2019-07-11 | 2 | -11/+48
    Skip copies between virtual registers during the search for UseDefs and DefUses. Since each operand has one def, the search for UseDefs is straightforward. But since an operand can have many uses, we have to check all uses of each copy we traverse during the search for DefUses.
    Differential Revision: https://reviews.llvm.org/D64486
    llvm-svn: 365744
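The asymmetry described in the entry above (one def per register, many uses) can be illustrated on a toy instruction list. This is a sketch only: `Instr`, `Function`, `skipCopiesToDef` and `collectUsesSkippingCopies` are hypothetical stand-ins, not LLVM types or the functions touched by the patch, and a copy is assumed to read exactly one virtual register.

```cpp
#include <cassert>
#include <vector>

// Toy stand-ins for machine instructions; these are not LLVM types.
enum class Opcode { Copy, Load, Add, Store };

struct Instr {
  Opcode Op;
  int Def;               // virtual register defined, or -1 if none
  std::vector<int> Uses; // virtual registers read
};

struct Function {
  std::vector<Instr> Instrs;

  // Each virtual register has exactly one defining instruction.
  const Instr *getDef(int Reg) const {
    for (const Instr &I : Instrs)
      if (I.Def == Reg)
        return &I;
    return nullptr;
  }
};

// UseDef direction: one def per register, so skipping copies is a loop.
const Instr *skipCopiesToDef(const Function &F, int Reg) {
  const Instr *Def = F.getDef(Reg);
  while (Def && Def->Op == Opcode::Copy)
    Def = F.getDef(Def->Uses[0]);
  return Def;
}

// DefUses direction: a register can have many uses, so every use of
// every copy traversed has to be visited.
void collectUsesSkippingCopies(const Function &F, int Reg,
                               std::vector<const Instr *> &Out) {
  for (const Instr &I : F.Instrs) {
    bool UsesReg = false;
    for (int U : I.Uses)
      UsesReg |= (U == Reg);
    if (!UsesReg)
      continue;
    if (I.Op == Opcode::Copy)
      collectUsesSkippingCopies(F, I.Def, Out); // look through the copy
    else
      Out.push_back(&I);
  }
}
```

For a chain such as `%0 = Load; %1 = Copy %0; %2 = Copy %1; %3 = Add %2; Store %1`, the upward walk from `%2` is a straight line to the load, while the downward walk from `%0` fans out through both copies to reach the add and the store.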
* [MIPS GlobalISel] RegBankSelect for chains of ambiguous instructions | Petar Avramovic | 2019-07-11 | 2 | -14/+77
    When one of the uses/defs of an ambiguous instruction is also ambiguous, visit it recursively and search its uses/defs for an instruction with only one mapping available.
    When all instructions in a chain are ambiguous, an arbitrary mapping can be selected. For s64 operands in an ambiguous chain fprb is selected, since it results in fewer instructions than having to narrow scalar s64 to s32. For s32 both gprb and fprb result in the same number of instructions, and gprb is selected as the general-purpose option.
    At the moment we always avoid cross register bank copies.
    TODO: Implement a model for cost calculations of different mappings on the same instruction and cross bank copies. Allow cross bank copies when appropriate according to the cost model.
    Differential Revision: https://reviews.llvm.org/D64485
    llvm-svn: 365743
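The fallback rule for a fully ambiguous chain, as stated in the entry above, reduces to a choice on operand size. The sketch below uses hypothetical names (`RegBank`, `pickBankForAmbiguousChain`) and covers only the s32 and s64 cases the commit message mentions; it is not the code from the patch.

```cpp
#include <cassert>

// Hypothetical stand-in for the two MIPS register banks named above.
enum class RegBank { GPRB, FPRB };

// Applies only when every instruction in the chain is ambiguous:
// s64 goes to fprb (avoids narrowing s64 to s32), s32 goes to gprb.
RegBank pickBankForAmbiguousChain(unsigned SizeInBits) {
  assert((SizeInBits == 32 || SizeInBits == 64) &&
         "only s32 and s64 are described in the commit message");
  return SizeInBits == 64 ? RegBank::FPRB : RegBank::GPRB;
}
```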
* Remove some redundant code from r290372 and improve a comment. | Jay Foad | 2019-07-11 | 1 | -5/+3
    llvm-svn: 365741
* [ARM][ParallelDSP] Change the search for smlads | Sam Parker | 2019-07-11 | 1 | -252/+316
    Two functional changes have been made here:
    - Now search up from any add instruction to find the chains of operations that we may turn into a smlad. This allows the generation of a smlad which doesn't accumulate into a phi.
    - The search function has been corrected to stop it falsely searching up through an invalid path.
    The bulk of the changes have been making the Reduction struct a class and making it more C++y with getters and setters.
    Differential Revision: https://reviews.llvm.org/D61780
    llvm-svn: 365740
* [WebAssembly] Print error message for llvm.clear_cache intrinsic | Heejin Ahn | 2019-07-11 | 1 | -0/+4
    Summary:
    Wasm does not currently support the `llvm.clear_cache` intrinsic, and this prints a proper error message instead of segfaulting.
    Reviewers: dschuff, sbc100, sunfish
    Subscribers: jgravelle-google, llvm-commits
    Tags: #llvm
    Differential Revision: https://reviews.llvm.org/D64322
    llvm-svn: 365731
* [X86] Don't convert 8 or 16 bit ADDs to LEAs on Atom in FixupLEAPass. | Craig Topper | 2019-07-11 | 1 | -12/+15
    We use the functions that convert to three address to do the conversion, but converting an 8 or 16 bit ADD will cause it to create a virtual register. This can't be done after register allocation, where this pass runs.
    I've switched the pass completely to a white list of instructions that can be converted to LEA instead of a blacklist that was incorrect. This will avoid surprises if we enhance the three address conversion function to include additional instructions in the future.
    Fixes PR42565.
    llvm-svn: 365720
* [AMDGPU] gfx908 atomic fadd and atomic pk_fadd | Stanislav Mekhanoshin | 2019-07-11 | 7 | -4/+195
    Differential Revision: https://reviews.llvm.org/D64435
    llvm-svn: 365717
* [AMDGPU] gfx908 dot instruction support | Stanislav Mekhanoshin | 2019-07-11 | 1 | -0/+30
    Differential Revision: https://reviews.llvm.org/D64431
    llvm-svn: 365715
* [X86] Add patterns with and_flag_nocf for BLSI and TBM instructions. | Craig Topper | 2019-07-10 | 1 | -6/+19
    Fixes similar issues to r352306.
    llvm-svn: 365705
* [X86] Add BLSR and BLSMSK to isUseDefConvertible. | Craig Topper | 2019-07-10 | 1 | -1/+6
    Unfortunately subo formation in CGP prevents obvious ways of testing this. But we already have BLSI in here and the flag behavior is well understood.
    Might become more useful if we improve PR42571.
    llvm-svn: 365702
* [NFC] Fix IR/MC dependency issue for function descriptor SDAG implementation | David Tenty | 2019-07-10 | 1 | -44/+35
    Summary:
    llvm/IR/GlobalValue.h can't be included in MC, as that creates a circular dependency between the MC and IR libraries. This circular dependency is causing an issue for build systems that enforce layering.
    Author: Xiangling_L
    Reviewers: sfertile, jasonliu, hubert.reinterpretcast, gribozavr
    Reviewed By: gribozavr
    Subscribers: wuzish, nemanjai, hiraditya, kbarton, MaskRay, jsji, llvm-commits
    Tags: #llvm
    Differential Revision: https://reviews.llvm.org/D64445
    llvm-svn: 365701
* [X86] Remove unused variable. NFC | Craig Topper | 2019-07-10 | 1 | -1/+0
    llvm-svn: 365697
* [AArch64][GlobalISel] Optimize compare and branch cases with G_INTTOPTR and unknown values. | Amara Emerson | 2019-07-10 | 1 | -4/+15
    Since we have distinct types for pointers and scalars, G_INTTOPTRs can sometimes obstruct attempts to find constant source values. These usually come about when we try to do some kind of null pointer check. Teaching getConstantVRegValWithLookThrough about this operation allows the CBZ/CBNZ optimization to catch more cases.
    This change also improves the case where we can't find a constant source at all. Previously we would emit a cmp, cset and tbnz for that. Now we try to just emit a cmp and conditional branch, saving an instruction.
    The cumulative code size improvement of this change plus D64354 is 5.5% geomean on arm64 CTMark -O0.
    Differential Revision: https://reviews.llvm.org/D64377
    llvm-svn: 365690
* [GlobalISel][AArch64] Use getOpcodeDef instead of findMIFromReg | Jessica Paquette | 2019-07-10 | 1 | -14/+3
    Some minor cleanup.
    This function in Utils does the same thing as `findMIFromReg`. It also looks through copies, which `findMIFromReg` didn't.
    Delete `findMIFromReg` and use `getOpcodeDef` instead. This only happens in `tryOptVectorDup` right now.
    Update opt-shuffle-splat to show that we can look through the copies now, too.
    Differential Revision: https://reviews.llvm.org/D64520
    llvm-svn: 365684
* [GlobalISel][AArch64][NFC] Use getDefIgnoringCopies from Utils where we can | Jessica Paquette | 2019-07-10 | 1 | -22/+5
    There are a few places where we walk over copies throughout AArch64InstructionSelector.cpp. In Utils, there's a function that does exactly this which we can use instead.
    Note that the utility function works with the case where we run into a COPY from a physical register. We've run into bugs with this a couple times, so using it should defend us from similar future bugs.
    Also update opt-fold-compare.mir to show that we still handle physical registers properly.
    Differential Revision: https://reviews.llvm.org/D64513
    llvm-svn: 365683
* Revert "[System Model] [TTI] Update cache and prefetch TTI interfaces" | David Greene | 2019-07-10 | 7 | -13/+37
    This broke some PPC prefetching tests.
    This reverts commit 9fdfb045ae8bb643ab0d0455dcf9ecaea3b1eb3c.
    llvm-svn: 365680
* [System Model] [TTI] Update cache and prefetch TTI interfaces | David Greene | 2019-07-10 | 7 | -37/+13
    Rework the TTI cache and software prefetching APIs to prepare for the introduction of a general system model. Changes include:
    - Marking existing interfaces const and/or override as appropriate
    - Adding comments
    - Adding BasicTTIImpl interfaces that delegate to a subtarget implementation
    - Adding a default "no information" subtarget implementation
    Only a handful of targets use these interfaces currently: AArch64, Hexagon, PPC and SystemZ. AArch64 already has a custom subtarget implementation, so its custom TTI implementation is migrated to use the new facilities in BasicTTIImpl to invoke its custom subtarget implementation. The custom TTI implementations continue to exist for the other targets with this change. They are not moved over to subtarget-based implementations.
    The end goal is to have the default subtarget implementation defer to the system model defined by the target. With this change, the default subtarget implementation essentially returns "no information" for these interfaces. None of the existing users of TTI will hit that implementation because they define their own custom TTI implementations and won't use the BasicTTIImpl implementations.
    Once system models are in place for the targets that use these interfaces, their custom TTI implementations can be removed.
    Differential Revision: https://reviews.llvm.org/D63614
    llvm-svn: 365676
* [X86] EltsFromConsecutiveLoads - clean up element size calcs. NFCI. | Simon Pilgrim | 2019-07-10 | 1 | -14/+12
    Determine the element/load size calculations earlier and assert that they are whole bytes in size.
    llvm-svn: 365674
* MC: AArch64: Add support for pg_hi21_nc relocation specifier. | Peter Collingbourne | 2019-07-10 | 1 | -0/+2
    Differential Revision: https://reviews.llvm.org/D64455
    llvm-svn: 365661
* GlobalISel: Legalization for G_FMINNUM/G_FMAXNUM | Matt Arsenault | 2019-07-10 | 2 | -1/+57
    llvm-svn: 365658
* [X86] EltsFromConsecutiveLoads - remove duplicate check for element size. NFCI. | Simon Pilgrim | 2019-07-10 | 1 | -6/+0
    We've already checked that each element is the correct contributory size for VT when we inspect the elements for Undef/Zero/Load.
    llvm-svn: 365656
* [X86] EltsFromConsecutiveLoads - ensure element reg/store sizes are the same size. NFCI. | Simon Pilgrim | 2019-07-10 | 1 | -3/+5
    This renames the type so it doesn't sound like it's based off the load size, as we're moving towards supporting combining loads of different sizes.
    llvm-svn: 365655
* AMDGPU: Serialize mode from MachineFunctionInfo | Matt Arsenault | 2019-07-10 | 3 | -1/+32
    llvm-svn: 365653
* [AMDGPU] Allow abs/neg source modifiers on v_cndmask_b32 | Jay Foad | 2019-07-10 | 1 | -7/+8
    Summary:
    D59191 added support for these modifiers in the assembler and disassembler. This patch just teaches instruction selection that it can use them.
    Reviewers: arsenm, tstellar
    Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, llvm-commits
    Tags: #llvm
    Differential Revision: https://reviews.llvm.org/D64497
    llvm-svn: 365640
* [X86] EltsFromConsecutiveLoads - cleanup Zero/Undef/Load element collection. NFCI. | Simon Pilgrim | 2019-07-10 | 1 | -12/+17
    llvm-svn: 365628
* [MIPS GlobalISel] Select float and double phi | Petar Avramovic | 2019-07-10 | 1 | -4/+25
    Select float and double phi for MIPS32.
    Differential Revision: https://reviews.llvm.org/D64420
    llvm-svn: 365627
* [MIPS GlobalISel] Select float and double load and store | Petar Avramovic | 2019-07-10 | 1 | -22/+44
    Select float and double load and store for MIPS32.
    Differential Revision: https://reviews.llvm.org/D64419
    llvm-svn: 365626
* [NFC][ARM] Convert lambdas to static helpers | Sam Parker | 2019-07-10 | 1 | -57/+73
    Break up and convert some of the lambdas in ARMLowOverheadLoops into static functions.
    llvm-svn: 365623
* [X86] EltsFromConsecutiveLoads - LDBase is non-null. NFCI. | Simon Pilgrim | 2019-07-10 | 1 | -6/+4
    Don't bother checking for LDBase != null - it should be (and we assert that it is).
    llvm-svn: 365622
* [X86] EltsFromConsecutiveLoads - store Loads on a per-element basis. NFCI. | Simon Pilgrim | 2019-07-10 | 1 | -9/+9
    Cache the LoadSDNode nodes so we can easily map to/from the element index instead of packing them together - this will be useful for future patches for PR16739 etc.
    llvm-svn: 365620
* [X86][SSE] EltsFromConsecutiveLoads - add basic dereferenceable support | Simon Pilgrim | 2019-07-10 | 1 | -7/+15
    This patch checks to see if the vector element loads are based off a dereferenceable pointer that covers the entire vector width, in which case we don't need to have element loads at both extremes of the vector width - just the start (base pointer) of it.
    Another step towards partial vector loads...
    Differential Revision: https://reviews.llvm.org/D64205
    llvm-svn: 365614
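The condition described in the entry above can be pictured as a single size comparison. This is a minimal sketch, assuming the number of bytes known to be dereferenceable from the base pointer is already available; `needsLoadAtLastElement` is a hypothetical name and not a function from the patch.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: if the base pointer is known to be dereferenceable
// for at least the full vector width, a load at the far end of the vector
// is not needed to prove that the wide load is safe.
bool needsLoadAtLastElement(uint64_t DerefBytesFromBase,
                            uint64_t VectorBytes) {
  assert(VectorBytes > 0 && "vector width must be non-zero");
  return DerefBytesFromBase < VectorBytes;
}
```

With a 16-byte vector, a base pointer dereferenceable for 16 bytes or more needs only the load at the start, while one dereferenceable for 8 bytes still needs a load at the last element.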
* Fix "result of 32-bit shift implicitly converted to 64 bits" warning. NFCI. | Simon Pilgrim | 2019-07-10 | 1 | -1/+1
    llvm-svn: 365612
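The warning named in the entry above is typically raised by a shift whose left operand is a 32-bit `int` while the result is stored in a 64-bit variable. The sketch below shows the general pattern and the usual remedy; `bitMask64` is a hypothetical name, and the actual one-line change in the commit is not shown in this log.

```cpp
#include <cassert>
#include <cstdint>

// The pattern that triggers the warning looks like:
//   uint64_t Mask = 1 << Shift; // shift done in 32-bit int, then widened
// Widening the left operand first makes the shift itself 64 bits wide.
uint64_t bitMask64(unsigned Shift) {
  assert(Shift < 64 && "shift amount out of range");
  return uint64_t(1) << Shift;
}
```

Widening before the shift matters for correctness as well as for silencing the warning, since shifting a 32-bit `int` by 32 or more is undefined behavior in C++.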
* [ARM] Enable VPUSH/VPOP aliases when either MVE or VFP is present | Mikhail Maltsev | 2019-07-10 | 2 | -5/+5
    Summary:
    Use the same predicates as VSTMDB/VLDMIA since VPUSH/VPOP alias to these.
    Patch by Momchil Velikov.
    Reviewers: ostannard, simon_tatham, SjoerdMeijer, samparker, t.p.northover, dmgreen
    Reviewed By: dmgreen
    Subscribers: javed.absar, kristof.beyls, hiraditya, dmgreen, llvm-commits
    Tags: #llvm
    Differential Revision: https://reviews.llvm.org/D64413
    llvm-svn: 365604
* [X86] Limit getTargetConstantFromNode to only work on NormalLoads not extending loads. | Craig Topper | 2019-07-10 | 1 | -1/+1
    This seems to fix a failure reported by Jordan Rupprecht, but we don't have a reduced test case yet.
    llvm-svn: 365589
* AMDGPU/GlobalISel: Add support for wide loads >= 256-bits | Tom Stellard | 2019-07-10 | 4 | -37/+219
    Summary:
    This adds support for the most commonly used wide load types: <8xi32>, <16xi32>, <4xi64>, and <8xi64>
    Reviewers: arsenm
    Reviewed By: arsenm
    Subscribers: hiraditya, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, dstuttard, tpr, t-tye, volkan, Petar.Avramovic, llvm-commits
    Tags: #llvm
    Differential Revision: https://reviews.llvm.org/D57399
    llvm-svn: 365586
* GlobalISel: Implement lower for G_FCOPYSIGN | Matt Arsenault | 2019-07-09 | 1 | -3/+2
    In SelectionDAG AMDGPU treated these as legal, but this was mostly because the bitcasts required for FP types were painful. Theoretically the bitpattern should eventually match to bfi, so don't bother trying to get the patterns to import.
    llvm-svn: 365583
* [X86] Don't form extloads in combineExtInVec unless the load extension is legal. | Craig Topper | 2019-07-09 | 1 | -7/+9
    This should prevent doing this on pre-sse4.1 targets or for 256 bit vectors without avx2.
    I don't know of a failure from this. Op legalization will probably take care of it, but it seemed better to be safe.
    llvm-svn: 365577
* AMDGPU/GlobalISel: Fix legality for G_BUILD_VECTOR | Matt Arsenault | 2019-07-09 | 1 | -7/+4
    llvm-svn: 365575
* [AMDGPU] gfx908 v_pk_fmac_f16 support | Stanislav Mekhanoshin | 2019-07-09 | 2 | -4/+10
    Differential Revision: https://reviews.llvm.org/D64433
    llvm-svn: 365573