summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Target
Commit message (Collapse)AuthorAgeFilesLines
* AMDGPU: Expand setcc for v2f32 and v4f32Konstantin Zhuravlyov2017-10-031-0/+1
| | | | llvm-svn: 314853
* AMDGPU: Expand setcc for v2i32 and v4i32Konstantin Zhuravlyov2017-10-031-0/+1
| | | | llvm-svn: 314852
* [X86] Remove dead declaration convertArgMovsToPushes, NFCReid Kleckner2017-10-031-9/+0
| | | | | | | | This was dead when it landed in r252578. We have this functionality, if not for stack probe calls, but for regular calls in X86CallFrameOptimization.cpp. llvm-svn: 314845
* [PowerPC] Revert P9 scheduling model to incompleteStefan Pintilie2017-10-031-1/+1
| | | | | | | Partially revert a previous change from commit: https://llvm.org/svn/llvm-project/llvm/trunk@314026 The previous change caused regressions on Power 9. llvm-svn: 314835
* [AMDGPU] implemented pal metadataTim Renouf2017-10-036-3/+193
| | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: For the amdpal OS type: We write an AMDGPU_PAL_METADATA record in the .note section in the ELF (or as an assembler directive). It contains key=value pairs of 32 bit ints. It is a merge of metadata from codegen of the shaders, and metadata provided by the frontend as _amdgpu_pal_metadata IR metadata. Where both sources have a key=value with the same key, the two values are ORed together. This .note record is part of the amdpal ABI and will be documented in docs/AMDGPUUsage.rst in a future commit. Eventually the amdpal OS type will stop generating the .AMDGPU.config section once the frontend has safely moved over to using the .note records above instead of .AMDGPU.config. Reviewers: arsenm, nhaehnle, dstuttard Subscribers: kzhuravl, wdng, yaxunl, llvm-commits, t-tye Differential Revision: https://reviews.llvm.org/D37753 llvm-svn: 314829
* [AMDGPU] Avoid predicated execution of the basic blocks containing scalarAlexander Timofeev2017-10-031-0/+10
| | | | | | | | instructions. Differential revision: https://reviews.llvm.org/D38293 llvm-svn: 314828
* CodeView: Provide a .def file with the register idsHans Wennborg2017-10-031-45/+122
| | | | | | | | | | | | | | The list of register ids was previously written out in a couple of dirrent places. This puts it in a .def file and also adds a few more registers (e.g. the x87 regs) which should lead to more readable dumps, but I didn't include the whole list since that seems unnecessary. X86_MC::initLLVMToSEHAndCVRegMapping is pretty ugly, but at least it's not relying on magic constants anymore. The TODO of using tablegen still stands. Differential revision: https://reviews.llvm.org/D38480 llvm-svn: 314821
* [ARM] Use table-gen'd assembly operand diags in ARM asm parserOliver Stannard2017-10-033-95/+22
| | | | | | | | | | | | This switches the ARM AsmParser to use assembly operand diagnostics from tablegen, rather than a switch statement on the ARMMatchResultTy. It moves the existing diagnostic strings to tablegen, but adds no new ones, so this is NFC except for one diagnostic string that had an off-by-1 error in the hand-written switch statement. Differential revision: https://reviews.llvm.org/D31607 llvm-svn: 314804
* [ARM, Asm] Use correct source location for register tokensOliver Stannard2017-10-031-3/+3
| | | | | | | | | | | | tryParseRegister advances the lexer, so we need to take copies of the start and end locations of the register operand before calling it. Previously, the caret in the diagnostic pointer to the comma after the r0 operand in the test, rather than the start of the operand. Differential revision: https://reviews.llvm.org/D31537 llvm-svn: 314799
* [mips] Enable spilling and reloading of the dsp register set.Simon Dardis2017-10-033-0/+17
| | | | | | | | | | | The dsp register class is an alias of the gpr register class, so we have to define instructions for spilling and reloading. Reviewers: atanasyan Differential Revision: https://reviews.llvm.org/D38038 llvm-svn: 314798
* [ARM, Asm] Fix ubsan failure caused by out-of-range enum valueOliver Stannard2017-10-031-2/+2
| | | | | | | | In this code, we use ~0U as a sentinel value for any operand class that doesn't have a user-friendly error message, but this value isn't in range of the MatchClassKind enum, so we need to ensure it does not get passed to isSubclass. llvm-svn: 314793
* [X86][SSE] Add support for decoding PACKSS/PACKUS shuffles masks with UNDEFSimon Pilgrim2017-10-031-4/+12
| | | | llvm-svn: 314792
* [ARM, Asm] Remove dead code causing MSan failure.Oliver Stannard2017-10-031-7/+0
| | | | | | | r314779 caused ErrorInfo to be red uninitialised, but also made this code dead, so it can just be removed. llvm-svn: 314791
* [X86][SSE] Add support for lowering shuffles to PACKSS/PACKUSSimon Pilgrim2017-10-031-0/+53
| | | | | | | | | | If the upper bits of a truncation shuffle patterns have at least the minimum number of sign/zero bits on their inputs then we can safely use PACKSS/PACKUS as shuffles. Partial fix for https://bugs.llvm.org/show_bug.cgi?id=34773 Differential Revision: https://reviews.llvm.org/D38472 llvm-svn: 314788
* [ARM] Use new assembler diags for ARMOliver Stannard2017-10-032-133/+289
| | | | | | | | | | | | | | | This converts the ARM AsmParser to use the new assembly matcher error reporting mechanism, which allows errors to be reported for multiple instruction encodings when it is ambiguous which one the user intended to use. By itself this doesn't improve many error messages, because we don't have diagnostic text for most operand types, but as we add that then this will allow more of those diagnostic strings to be used when they are relevant. Differential revision: https://reviews.llvm.org/D31530 llvm-svn: 314779
* Remove unused variable. NFCI.Simon Pilgrim2017-10-031-1/+0
| | | | llvm-svn: 314778
* [X86][SSE] Add support for shuffle combining from PACKSS/PACKUSSimon Pilgrim2017-10-031-0/+4
| | | | | | Mentioned in D38472 llvm-svn: 314777
* [X86][SSE] Add support for PACKSS/PACKUS constant foldingSimon Pilgrim2017-10-031-0/+85
| | | | | | Pulled out of D38472 llvm-svn: 314776
* ISel type legalization: add debug messages. NFCI.Sjoerd Meijer2017-10-031-0/+1
| | | | | | | | | | This adds some more debug messages to the type legalizer and functions like PromoteNode, ExpandNode, ExpandLibCall in an attempt to make the debug messages a little bit more informative and useful. Differential Revision: https://reviews.llvm.org/D38450 llvm-svn: 314773
* [trivial] fix format, NFCHiroshi Inoue2017-10-031-1/+1
| | | | llvm-svn: 314769
* [X86] Provide the LSDA pointer with RIP relative addressing if necessaryMartin Storsjo2017-10-032-5/+7
| | | | | | | | | | | | | This makes sure the LSDA pointer isn't truncated to 32 bit. Make LowerINTRINSIC_WO_CHAIN a member function instead of a static function, so that it can use the getGlobalWrapperKind method. This solves the second half of the issues mentioned in PR34720. Differential Revision: https://reviews.llvm.org/D38343 llvm-svn: 314767
* AMDGPU: Remove global isGCN predicatesMatt Arsenault2017-10-0319-442/+466
| | | | | | | | | | | | | | These are problematic because they apply to everything, and can easily clobber whatever more specific predicate you are trying to add to a function. Currently instructions use SubtargetPredicate/PredicateControl to apply this to patterns applied to an instruction definition, but not to free standing Pats. Add a wrapper around Pat so the special PredicateControls requirements can be appended to the final predicate list like how Mips does it. llvm-svn: 314742
* [X86][NFC] Add X86CmovConverterPass to the pass registry.Amjad Aboud2017-10-022-5/+20
| | | | | | Differential Revision: https://reviews.llvm.org/D38355 llvm-svn: 314726
* Remove dead file.Michael Liao2017-10-021-1479/+0
| | | | llvm-svn: 314720
* AMDGPU: Fix typosMatt Arsenault2017-10-021-2/+2
| | | | llvm-svn: 314715
* Add support for Myriad ma2x8x series of CPUsWalter Lee2017-10-021-0/+9
| | | | | | | | | | | | Summary: Also add support for some older Myriad CPUs that were missing. Reviewers: jyknight Subscribers: fedor.sergeev Differential Revision: https://reviews.llvm.org/D37552 llvm-svn: 314705
* [X86][SSE] Fix -Wsign-compare problems introduced in r314658Bjorn Pettersson2017-10-021-4/+4
| | | | | | | | | | | The refactoring in "[X86][SSE] Add createPackShuffleMask helper function. NFCI." resulted in warning when compiling the code (seen in build bots). This patch restores some types from int to unsigned to avoid those warnings. llvm-svn: 314667
* [X86][SSE] Add createPackShuffleMask helper function. NFCI.Simon Pilgrim2017-10-021-10/+19
| | | | llvm-svn: 314658
* [X86][SSE] matchBinaryVectorShuffle - add support for different src/dst ↵Simon Pilgrim2017-10-021-12/+12
| | | | | | | | value shuffle types Preparation for support for combining to PACKSS/PACKUS llvm-svn: 314656
* [PowerPC] support ZERO_EXTEND in tryBitPermutationHiroshi Inoue2017-10-021-17/+64
| | | | | | | | | | | | | | | | | | | This patch add a support of ISD::ZERO_EXTEND in PPCDAGToDAGISel::tryBitPermutation to increase the opportunity to use rotate-and-mask by reordering ZEXT and ANDI. Since tryBitPermutation stops analyzing nodes if it hits a ZEXT node while traversing SDNodes, we want to avoid ZEXT between two nodes that can be folded into a rotate-and-mask instruction. For example, we allow these nodes t9: i32 = add t7, Constant:i32<1> t11: i32 = and t9, Constant:i32<255> t12: i64 = zero_extend t11 t14: i64 = shl t12, Constant:i64<2> to be folded into a rotate-and-mask instruction. Such case often happens in array accesses with logical AND operation in the index, e.g. array[i & 0xFF]; Differential Revision: https://reviews.llvm.org/D37514 llvm-svn: 314655
* Fix typo in comment. NFCI.Simon Pilgrim2017-10-021-1/+1
| | | | llvm-svn: 314653
* [X86] Cleanup uses of computeKnownBits by using MaskedValueIsZero helper ↵Simon Pilgrim2017-10-021-6/+3
| | | | | | instead. NFCI. llvm-svn: 314652
* [X86][LLVM]Expanding Supports lowerInterleaved{store|load}() in ↵Michael Zuckerman2017-10-021-109/+169
| | | | | | | | | | | | | | | | | | | X86InterleavedAccess (VF64 stride 3-4) I continue to support different VF interleaved and in this pass for this patch, I added the vf64 stride3 support for both load and store. I also added support fot the stride4 store. Reviewers: 1. zvi 2. dorit 3. igorb 4. guyblank Differential Revision: https://reviews.llvm.org/D37687 Change-Id: I3d238efedf217d1768b348d710de1efa2f19d27b llvm-svn: 314651
* [X86] Fix copy pasto in X86FastISel::fastEmitInst_rrrr.Craig Topper2017-10-021-1/+1
| | | | | | The 4th operand was not being constrained and the third operand was being constrained twice. llvm-svn: 314648
* [X86] Use a bool flag instead of assigning an unsigned to two different ↵Craig Topper2017-10-021-9/+8
| | | | | | values that we only use in an equality comparison. llvm-svn: 314647
* [X86] Use _NOREX MOVZX instructions for some patterns even in 32-bit mode.Craig Topper2017-10-021-32/+6
| | | | | | This unifies the patterns between both modes. This should be effectively NFC since all the available registers in 32-bit mode statisfy this constraint. llvm-svn: 314643
* [Hexagon] Check vector elements for equivalence in the ↵Ron Lieberman2017-10-021-1/+16
| | | | | | | | | | | | | HexagonVectorLoopCarriedReuse pass If the two instructions being compared for equivalence have corresponding operands that are integer constants, then check their values to determine equivalence. Patch by Suyog Sarda! llvm-svn: 314642
* [Hexagon] Patch to Extract i1 element from vector of i1Ron Lieberman2017-10-021-1/+7
| | | | | | | This patch extracts 1 element from vector consisting of elements of size 1 bit at given index. llvm-svn: 314641
* [X86] Change register&memory TEST instructions from MRMSrcMem to MRMDstMemCraig Topper2017-10-018-34/+33
| | | | | | | | | | | | | | | | | | | Summary: Intel documentation shows the memory operand as the first operand. But we currently treat it as the second operand. Conceptually the order doesn't matter since it doesn't write memory. We have aliases to parse with the operands in either order and the isel matching is commutable. For the register&register form order does matter for the assembly parser. PR22995 was previously filed and fixed by changing the register&register form from MRMSrcReg to MRMDestReg to match gas. Ideally the memory form should match by using MRMDestMem. I believe this supercedes D38025 which was trying to switch the register&register form back to pre-PR22995. Reviewers: aymanmus, RKSimon, zvi Reviewed By: aymanmus Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D38120 llvm-svn: 314639
* [X86] Remove a couple unnecessary COPY_TO_REGCLASS from some output patterns ↵Craig Topper2017-10-011-9/+7
| | | | | | where the instruction already produces the correct register class. llvm-svn: 314638
* [X86][SSE] Add faux shuffle combining support for PACKUSSimon Pilgrim2017-10-011-4/+15
| | | | llvm-svn: 314631
* [X86][SSE] Improve shuffle combining of PACKSS instructions.Simon Pilgrim2017-10-011-6/+24
| | | | | | Support unary packing and fix the faux shuffle mask for vectors larger than 128 bits. llvm-svn: 314629
* [x86] formatting; NFCSanjay Patel2017-10-011-4/+2
| | | | llvm-svn: 314627
* [X86][SSE] Fold (VSRAI (VSHLI X, C1), C1) --> X iff NumSignBits(X) > C1Simon Pilgrim2017-09-301-0/+9
| | | | | | Remove sign extend in register style pattern if the sign is already extended enough llvm-svn: 314599
* [AVX-512] Add patterns to make fp compare instructions commutable during isel.Craig Topper2017-09-302-1/+90
| | | | llvm-svn: 314598
* Code refactoring for the interleaved code <NFC>Michael Zuckerman2017-09-301-28/+18
| | | | | Change-Id: I7831c9febad8e14278a5bc87584a0053dc837be1 llvm-svn: 314596
* [X86] Support v64i8 mulhu/mulhsCraig Topper2017-09-301-1/+9
| | | | | | | | Implemented by splitting into two v32i8 mulhu/mulhs and concatenating the results. Differential Revision: https://reviews.llvm.org/D38307 llvm-svn: 314584
* [AMDGPU] Set fast-math flags on functions given the optionsStanislav Mekhanoshin2017-09-293-7/+36
| | | | | | | | | | | | | | | | We have a single library build without relaxation options. When inlined library functions remove fast math attributes from the functions they are integrated into. This patch sets relaxation attributes on the functions after linking provided corresponding relaxation options are given. Math instructions inside the inlined functions remain to have no fast flags, but inlining does not prevent fast math transformations of a surrounding caller code anymore. Differential Revision: https://reviews.llvm.org/D38325 llvm-svn: 314568
* AMDGPU: VALU carry-in and v_cndmask condition cannot be EXECNicolai Haehnle2017-09-295-11/+28
| | | | | | | | | | | | | | | | | | | | | | | | | | The hardware will only forward EXEC_LO; the high 32 bits will be zero. Additionally, inline constants do not work. At least, v_addc_u32_e64 v0, vcc, v0, v1, -1 which could conceivably be used to combine (v0 + v1 + 1) into a single instruction, acts as if all carry-in bits are zero. The llvm.amdgcn.ps.live test is adjusted; it would be nice to combine s_mov_b64 s[0:1], exec v_cndmask_b32_e64 v0, v1, v2, s[0:1] into v_mov_b32 v0, v3 but it's not particularly high priority. Fixes dEQP-GLES31.functional.shaders.helper_invocation.value.* llvm-svn: 314522
* [SystemZ] implement shouldCoalesce()Jonas Paulsson2017-09-296-4/+90
| | | | | | | | | | | | | | | | | | | Implement shouldCoalesce() to help regalloc avoid running out of GR128 registers. If a COPY involving a subreg of a GR128 is coalesced, the live range of the GR128 virtual register will be extended. If this happens where there are enough phys-reg clobbers present, regalloc will run out of registers (if there is not a single GR128 allocatable register available). This patch tries to allow coalescing only when it can prove that this will be safe by checking the (local) interval in question. Review: Ulrich Weigand, Quentin Colombet https://reviews.llvm.org/D37899 https://bugs.llvm.org/show_bug.cgi?id=34610 llvm-svn: 314516
OpenPOWER on IntegriCloud