path: root/llvm/lib/Target
Commit message | Author | Age | Files | Lines
* [X86] Redefine MOVSS/MOVSD instructions to take VR128 regclass as input instead of FR32/FR64 | Craig Topper | 2017-10-04 | 3 | -77/+73
| This patch redefines the MOVSS/MOVSD instructions to take VR128 as their second input. This allows the MOVSS/SD->BLEND commute to work without requiring a COPY to be inserted.
|
| This should fix PR33079.
|
| Overall this looks to be an improvement in the generated code. I haven't checked the EXPENSIVE_CHECKS build but I'll do that and update with results.
|
| Differential Revision: https://reviews.llvm.org/D38449
| llvm-svn: 314914
* bpf: fix an insn encoding issue for neg insn | Yonghong Song | 2017-10-04 | 1 | -2/+0
| Signed-off-by: Yonghong Song <yhs@fb.com>
| llvm-svn: 314911
* [X86][SSE] Early out from ComputeNumSignBitsForTargetNode. NFCI. | Simon Pilgrim | 2017-10-04 | 1 | -2/+6
| Early out from vector shift by immediates that will exceed eltsize - don't bother making an unnecessary ComputeNumSignBits recursive call.
| llvm-svn: 314903
* [X86][SSE] Add support for lowering unary shuffles to PACKSS/PACKUS | Simon Pilgrim | 2017-10-04 | 1 | -12/+24
| Extension to D38472
| llvm-svn: 314901
* [AVR] Implement LPMWRdZ pseudo-instruction's expansion. | Dylan McKay | 2017-10-04 | 1 | -1/+44
| FIXME: implementation is mostly copy-pasted from LDWRdPtr, so we should refactor a bit and unify the two
|
| Patch by Gergo Erdi.
| llvm-svn: 314898
* [AVR] Factor out mayLoad in tablegen patterns | Dylan McKay | 2017-10-04 | 1 | -2/+2
| Patch by Gergo Erdi.
| llvm-svn: 314897
* [AVR] Elaborate LDWRdPtr into `ld r, X++; ld r+1, X` | Dylan McKay | 2017-10-04 | 2 | -7/+7
| Patch by Gergo Erdi.
| llvm-svn: 314896
* [AVR] Insert JMP for long branches | Dylan McKay | 2017-10-04 | 2 | -2/+22
| Previously, on long branches (relative jumps of >4 kB), an assertion failure was hit, as AVRInstrInfo::insertIndirectBranch was not implemented. Despite its name, it is called by the branch relaxator for *all* unconditional jumps.
|
| Patch by Thomas Backman.
| llvm-svn: 314891
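The range limit behind this commit can be sketched as follows. This is only an illustration, not the LLVM code: it assumes the AVR RJMP instruction's 12-bit signed word offset (about +/-4 kB), ignores the PC+1 bias for simplicity, and the helper names `fits_rjmp` and `select_jump` are hypothetical.

```python
# Simplified model: RJMP holds a 12-bit signed offset counted in 16-bit words.
RJMP_OFFSET_BITS = 12


def fits_rjmp(offset_words):
    """Return True if a word offset is representable in RJMP's offset field."""
    lo = -(1 << (RJMP_OFFSET_BITS - 1))      # -2048 words
    hi = (1 << (RJMP_OFFSET_BITS - 1)) - 1   # +2047 words
    return lo <= offset_words <= hi


def select_jump(offset_bytes):
    """Pick the short relative jump when it reaches, else the absolute JMP."""
    if offset_bytes % 2 != 0:
        raise ValueError("AVR instructions are 2-byte aligned")
    return "rjmp" if fits_rjmp(offset_bytes // 2) else "jmp"
```

For example, `select_jump(4094)` stays with the short form, while `select_jump(4096)` falls outside the relative range and needs the long form.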
* [AVR] Fix displacement overflow for LDDW/STDW | Dylan McKay | 2017-10-04 | 2 | -5/+13
| In some cases, the code generator attempts to generate instructions such as:
|     lddw r24, Y+63
| which expands to:
|     ldd r24, Y+63
|     ldd r25, Y+64 # Oops! This is actually ld r25, Y in the binary
|
| This commit limits the first offset to 62, and thus the second to 63. It also updates some asserts in AVRExpandPseudoInsts.cpp, including for INW and OUTW, which appear to be unused.
|
| Patch by Thomas Backman.
| llvm-svn: 314890
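A small sketch of why Y+64 turns into Y+0. It assumes the LDD/STD displacement is a 6-bit field (0..63) and that an out-of-range value is silently truncated by masking, which matches the behaviour the commit describes; the function names are hypothetical and not taken from AVRExpandPseudoInsts.cpp.

```python
DISP_BITS = 6                     # width of the LDD/STD displacement field
MAX_DISP = (1 << DISP_BITS) - 1   # 63


def encoded_displacement(q):
    """Displacement that ends up in the binary if q is masked to 6 bits."""
    return q & MAX_DISP


def expand_lddw(offset):
    """Offsets of the two byte loads a 16-bit LDDW pseudo expands to."""
    return [offset, offset + 1]


def lddw_offset_is_safe(offset):
    """Both byte offsets must fit in the field, so the first is at most 62."""
    return offset >= 0 and offset + 1 <= MAX_DISP
```

With these definitions, `expand_lddw(63)` yields offsets 63 and 64, and `encoded_displacement(64)` is 0, which is the silent wrap the commit guards against by capping the first offset at 62.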
* [ARM] Add diag string for movw/movt immediates in assembly | Oliver Stannard | 2017-10-04 | 1 | -0/+1
| This adds diagnostics for invalid immediate operands to the MOVW and MOVT instructions (ARM and Thumb).
|
| Differential revision: https://reviews.llvm.org/D31879
| llvm-svn: 314888
* [ARM, Asm] Change grammar of immediate operand diagnostics | Oliver Stannard | 2017-10-04 | 1 | -3/+3
| Currently, our diagnostics for assembly operands are not consistent. Some start with (for example) "immediate operand must be ...", and some with "operand must be an immediate ...". I think the latter form is preferable for a few reasons:
| * It's unambiguous that it is referring to the expected type of operand, not the type the user provided. For example, the user could provide a register operand, and get a message talking about an operand as if it is already an immediate, just not in the accepted range.
| * It allows us to have a consistent style once we add diagnostics for operands that could take two forms, for example a label or pc-relative memory operand.
|
| Differential revision: https://reviews.llvm.org/D36689
| llvm-svn: 314887
* [X86] Improvement in CodeGen instruction selection for LEAs (re-applying post required revision changes.) | Jatin Bhateja | 2017-10-04 | 2 | -10/+495
| Summary:
| 1/ Operand folding during complex pattern matching for LEAs has been extended, such that it promotes Scale to accommodate similar operands appearing in the DAG. e.g.
|     T1 = A + B
|     T2 = T1 + 10
|     T3 = T2 + A
| For the above DAG rooted at T3, X86AddressMode will now look like
|     Base = B, Index = A, Scale = 2, Disp = 10
|
| 2/ During OptimizeLEAPass down the pipeline, factorization is now performed over LEAs so that if there is an opportunity then complex LEAs (having 3 operands) could be factored out. e.g.
|     leal 1(%rax,%rcx,1), %rdx
|     leal 1(%rax,%rcx,2), %rcx
| will be factored as follows
|     leal 1(%rax,%rcx,1), %rdx
|     leal (%rdx,%rcx), %edx
|
| 3/ Aggressive operand folding for AM based selection for LEAs is sensitive to loops, thus avoiding creation of any complex LEAs within a loop.
|
| Reviewers: lsaba, RKSimon, craig.topper, qcolombet, jmolloy
| Reviewed By: lsaba
| Subscribers: jmolloy, spatel, igorb, llvm-commits
| Differential Revision: https://reviews.llvm.org/D35014
| llvm-svn: 314886
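The folding in point 1/ can be illustrated with a toy model. This is a sketch under simplifying assumptions, not the X86 DAG matcher: the addition DAG is given as a flat list of operands, and `fold_address` and `effective_address` are hypothetical helpers.

```python
from collections import Counter

VALID_SCALES = {1, 2, 4, 8}   # scales an x86 address can encode


def fold_address(terms):
    """Fold a flat sum of symbols (str) and constants (int) into
    base + index*scale + disp, or return None if it does not fit."""
    disp = sum(t for t in terms if isinstance(t, int))
    counts = Counter(t for t in terms if isinstance(t, str))
    if len(counts) > 2:
        return None
    ordered = counts.most_common()          # most repeated symbol first
    index, scale = ordered[0] if ordered else (None, 1)
    base = None
    if len(ordered) == 2:
        base, base_count = ordered[1]
        if base_count != 1:
            return None
    if scale not in VALID_SCALES:
        return None
    return {"base": base, "index": index, "scale": scale, "disp": disp}


def effective_address(mode, env):
    """Evaluate an address mode given concrete register values."""
    base = env[mode["base"]] if mode["base"] is not None else 0
    index = env[mode["index"]] if mode["index"] is not None else 0
    return base + index * mode["scale"] + mode["disp"]
```

Feeding the commit's example, `fold_address(["A", "B", 10, "A"])` gives Base = B, Index = A, Scale = 2, Disp = 10, and evaluating it matches ((A + B) + 10) + A for any values of A and B.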
* [X86] Fix using the SJLJ jump table on x86_64 | Martin Storsjo | 2017-10-04 | 1 | -10/+65
| The previous version didn't work if the jump table base address didn't fit in 32 bit, since it was encoded as an immediate offset. And in case the jump table is encoded as 32 bit label differences, we need to load and add them to the table base first.
|
| This solves the first half of the issues mentioned in PR34720.
|
| Also fix some of the errors pointed out by -verify-machineinstrs, by using GR32_NOSPRegClass.
|
| Differential Revision: https://reviews.llvm.org/D38333
| llvm-svn: 314876
* [AArch64] Use LateSimplifyCFG after expanding atomic operations. | Balaram Makam | 2017-10-03 | 1 | -1/+1
| Summary:
| After r308422 we defer optimizations that can destroy loop canonical forms to LateSimplifyCFG. Running LateSimplifyCFG after expanding atomic operations can exploit more control-flow opportunities.
|
| Reviewers: mcrosier, t.p.northover, efriedma
| Reviewed By: efriedma
| Subscribers: aemerson, rengolin, javed.absar, llvm-commits, kristof.beyls
| Differential Revision: https://reviews.llvm.org/D38262
| llvm-svn: 314857
* AMDGPU: Expand setcc for v2f32 and v4f32 | Konstantin Zhuravlyov | 2017-10-03 | 1 | -0/+1
| llvm-svn: 314853
* AMDGPU: Expand setcc for v2i32 and v4i32 | Konstantin Zhuravlyov | 2017-10-03 | 1 | -0/+1
| llvm-svn: 314852
* [X86] Remove dead declaration convertArgMovsToPushes, NFC | Reid Kleckner | 2017-10-03 | 1 | -9/+0
| This was dead when it landed in r252578. We have this functionality, if not for stack probe calls, but for regular calls in X86CallFrameOptimization.cpp.
| llvm-svn: 314845
* [PowerPC] Revert P9 scheduling model to incomplete | Stefan Pintilie | 2017-10-03 | 1 | -1/+1
| Partially revert a previous change from commit: https://llvm.org/svn/llvm-project/llvm/trunk@314026
| The previous change caused regressions on Power 9.
| llvm-svn: 314835
* [AMDGPU] implemented pal metadata | Tim Renouf | 2017-10-03 | 6 | -3/+193
| Summary:
| For the amdpal OS type:
|
| We write an AMDGPU_PAL_METADATA record in the .note section in the ELF (or as an assembler directive). It contains key=value pairs of 32 bit ints. It is a merge of metadata from codegen of the shaders, and metadata provided by the frontend as _amdgpu_pal_metadata IR metadata. Where both sources have a key=value with the same key, the two values are ORed together.
|
| This .note record is part of the amdpal ABI and will be documented in docs/AMDGPUUsage.rst in a future commit.
|
| Eventually the amdpal OS type will stop generating the .AMDGPU.config section once the frontend has safely moved over to using the .note records above instead of .AMDGPU.config.
|
| Reviewers: arsenm, nhaehnle, dstuttard
| Subscribers: kzhuravl, wdng, yaxunl, llvm-commits, t-tye
| Differential Revision: https://reviews.llvm.org/D37753
| llvm-svn: 314829
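The merge rule described above (same key from both sources, values ORed together) might look like this in miniature. It is a sketch only: plain dicts stand in for the two metadata sources, the key values in the usage note are made up, and `merge_pal_metadata` is a hypothetical name, not the function in the AMDGPU backend.

```python
U32_MASK = 0xFFFFFFFF   # the record holds 32-bit integer keys and values


def merge_pal_metadata(codegen, frontend):
    """Merge two key=value maps; values sharing a key are ORed together."""
    merged = {key: value & U32_MASK for key, value in codegen.items()}
    for key, value in frontend.items():
        merged[key] = (merged.get(key, 0) | value) & U32_MASK
    return merged
```

As an illustration with invented keys, merging `{0x2C0A: 0x1, 0x2C0B: 0x10}` from codegen with `{0x2C0A: 0x4}` from the frontend keeps 0x2C0B unchanged and combines the shared key into 0x5.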
* [AMDGPU] Avoid predicated execution of the basic blocks containing scalar instructions. | Alexander Timofeev | 2017-10-03 | 1 | -0/+10
| Differential revision: https://reviews.llvm.org/D38293
| llvm-svn: 314828
* CodeView: Provide a .def file with the register ids | Hans Wennborg | 2017-10-03 | 1 | -45/+122
| The list of register ids was previously written out in a couple of different places. This puts it in a .def file and also adds a few more registers (e.g. the x87 regs) which should lead to more readable dumps, but I didn't include the whole list since that seems unnecessary.
|
| X86_MC::initLLVMToSEHAndCVRegMapping is pretty ugly, but at least it's not relying on magic constants anymore. The TODO of using tablegen still stands.
|
| Differential revision: https://reviews.llvm.org/D38480
| llvm-svn: 314821
* [ARM] Use table-gen'd assembly operand diags in ARM asm parser | Oliver Stannard | 2017-10-03 | 3 | -95/+22
| This switches the ARM AsmParser to use assembly operand diagnostics from tablegen, rather than a switch statement on the ARMMatchResultTy. It moves the existing diagnostic strings to tablegen, but adds no new ones, so this is NFC except for one diagnostic string that had an off-by-1 error in the hand-written switch statement.
|
| Differential revision: https://reviews.llvm.org/D31607
| llvm-svn: 314804
* [ARM, Asm] Use correct source location for register tokens | Oliver Stannard | 2017-10-03 | 1 | -3/+3
| tryParseRegister advances the lexer, so we need to take copies of the start and end locations of the register operand before calling it.
|
| Previously, the caret in the diagnostic pointed to the comma after the r0 operand in the test, rather than the start of the operand.
|
| Differential revision: https://reviews.llvm.org/D31537
| llvm-svn: 314799
* [mips] Enable spilling and reloading of the dsp register set. | Simon Dardis | 2017-10-03 | 3 | -0/+17
| The dsp register class is an alias of the gpr register class, so we have to define instructions for spilling and reloading.
|
| Reviewers: atanasyan
| Differential Revision: https://reviews.llvm.org/D38038
| llvm-svn: 314798
* [ARM, Asm] Fix ubsan failure caused by out-of-range enum value | Oliver Stannard | 2017-10-03 | 1 | -2/+2
| In this code, we use ~0U as a sentinel value for any operand class that doesn't have a user-friendly error message, but this value isn't in range of the MatchClassKind enum, so we need to ensure it does not get passed to isSubclass.
| llvm-svn: 314793
* [X86][SSE] Add support for decoding PACKSS/PACKUS shuffles masks with UNDEF | Simon Pilgrim | 2017-10-03 | 1 | -4/+12
| llvm-svn: 314792
* [ARM, Asm] Remove dead code causing MSan failure. | Oliver Stannard | 2017-10-03 | 1 | -7/+0
| r314779 caused ErrorInfo to be read uninitialised, but also made this code dead, so it can just be removed.
| llvm-svn: 314791
* [X86][SSE] Add support for lowering shuffles to PACKSS/PACKUS | Simon Pilgrim | 2017-10-03 | 1 | -0/+53
| If the upper bits of truncation shuffle patterns have at least the minimum number of sign/zero bits on their inputs then we can safely use PACKSS/PACKUS as shuffles.
|
| Partial fix for https://bugs.llvm.org/show_bug.cgi?id=34773
|
| Differential Revision: https://reviews.llvm.org/D38472
| llvm-svn: 314788
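The condition in this commit can be checked on a scalar model of PACKSS. The sketch below assumes the signed 16-bit to 8-bit variant and models vectors as Python lists; `packss`, `truncate` and `num_sign_bits` are hypothetical helpers, not LLVM functions.

```python
def saturate_signed(value, bits):
    """Clamp value to the signed range of the given bit width."""
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    return max(lo, min(hi, value))


def packss(a, b, dst_bits=8):
    """Model of PACKSS: saturate every element of a, then of b."""
    return [saturate_signed(x, dst_bits) for x in a + b]


def truncate(value, bits):
    """Keep the low bits of value and reinterpret them as signed."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >= 1 << (bits - 1) else value


def num_sign_bits(value, bits):
    """Count the leading bits that equal the sign bit (sign bit included)."""
    u = value & ((1 << bits) - 1)
    sign = u >> (bits - 1)
    count = 0
    for i in range(bits - 1, -1, -1):
        if (u >> i) & 1 != sign:
            break
        count += 1
    return count
```

When every 16-bit input has at least 9 sign bits, saturation never triggers and the pack behaves exactly like a truncating shuffle. An input such as 200 has only 8 sign bits, so the pack saturates it to 127 while truncation would give -56, which is why the sign-bit check is needed.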
* [ARM] Use new assembler diags for ARM | Oliver Stannard | 2017-10-03 | 2 | -133/+289
| This converts the ARM AsmParser to use the new assembly matcher error reporting mechanism, which allows errors to be reported for multiple instruction encodings when it is ambiguous which one the user intended to use.
|
| By itself this doesn't improve many error messages, because we don't have diagnostic text for most operand types, but as we add that then this will allow more of those diagnostic strings to be used when they are relevant.
|
| Differential revision: https://reviews.llvm.org/D31530
| llvm-svn: 314779
* Remove unused variable. NFCI. | Simon Pilgrim | 2017-10-03 | 1 | -1/+0
| llvm-svn: 314778
* [X86][SSE] Add support for shuffle combining from PACKSS/PACKUS | Simon Pilgrim | 2017-10-03 | 1 | -0/+4
| Mentioned in D38472
| llvm-svn: 314777
* [X86][SSE] Add support for PACKSS/PACKUS constant folding | Simon Pilgrim | 2017-10-03 | 1 | -0/+85
| Pulled out of D38472
| llvm-svn: 314776
* ISel type legalization: add debug messages. NFCI. | Sjoerd Meijer | 2017-10-03 | 1 | -0/+1
| This adds some more debug messages to the type legalizer and functions like PromoteNode, ExpandNode, ExpandLibCall in an attempt to make the debug messages a little bit more informative and useful.
|
| Differential Revision: https://reviews.llvm.org/D38450
| llvm-svn: 314773
* [trivial] fix format, NFC | Hiroshi Inoue | 2017-10-03 | 1 | -1/+1
| llvm-svn: 314769
* [X86] Provide the LSDA pointer with RIP relative addressing if necessary | Martin Storsjo | 2017-10-03 | 2 | -5/+7
| This makes sure the LSDA pointer isn't truncated to 32 bit.
|
| Make LowerINTRINSIC_WO_CHAIN a member function instead of a static function, so that it can use the getGlobalWrapperKind method.
|
| This solves the second half of the issues mentioned in PR34720.
|
| Differential Revision: https://reviews.llvm.org/D38343
| llvm-svn: 314767
* AMDGPU: Remove global isGCN predicates | Matt Arsenault | 2017-10-03 | 19 | -442/+466
| These are problematic because they apply to everything, and can easily clobber whatever more specific predicate you are trying to add to a function.
|
| Currently instructions use SubtargetPredicate/PredicateControl to apply this to patterns applied to an instruction definition, but not to free standing Pats. Add a wrapper around Pat so the special PredicateControls requirements can be appended to the final predicate list like how Mips does it.
| llvm-svn: 314742
* [X86][NFC] Add X86CmovConverterPass to the pass registry. | Amjad Aboud | 2017-10-02 | 2 | -5/+20
| Differential Revision: https://reviews.llvm.org/D38355
| llvm-svn: 314726
* Remove dead file. | Michael Liao | 2017-10-02 | 1 | -1479/+0
| llvm-svn: 314720
* AMDGPU: Fix typos | Matt Arsenault | 2017-10-02 | 1 | -2/+2
| llvm-svn: 314715
* Add support for Myriad ma2x8x series of CPUs | Walter Lee | 2017-10-02 | 1 | -0/+9
| Summary: Also add support for some older Myriad CPUs that were missing.
|
| Reviewers: jyknight
| Subscribers: fedor.sergeev
| Differential Revision: https://reviews.llvm.org/D37552
| llvm-svn: 314705
* [X86][SSE] Fix -Wsign-compare problems introduced in r314658 | Bjorn Pettersson | 2017-10-02 | 1 | -4/+4
| The refactoring in "[X86][SSE] Add createPackShuffleMask helper function. NFCI." resulted in warnings when compiling the code (seen in build bots).
|
| This patch restores some types from int to unsigned to avoid those warnings.
| llvm-svn: 314667
* [X86][SSE] Add createPackShuffleMask helper function. NFCI. | Simon Pilgrim | 2017-10-02 | 1 | -10/+19
| llvm-svn: 314658
* [X86][SSE] matchBinaryVectorShuffle - add support for different src/dst value shuffle types | Simon Pilgrim | 2017-10-02 | 1 | -12/+12
| Preparation for support for combining to PACKSS/PACKUS
| llvm-svn: 314656
* [PowerPC] support ZERO_EXTEND in tryBitPermutation | Hiroshi Inoue | 2017-10-02 | 1 | -17/+64
| This patch adds support for ISD::ZERO_EXTEND in PPCDAGToDAGISel::tryBitPermutation to increase the opportunity to use rotate-and-mask by reordering ZEXT and ANDI. Since tryBitPermutation stops analyzing nodes if it hits a ZEXT node while traversing SDNodes, we want to avoid ZEXT between two nodes that can be folded into a rotate-and-mask instruction.
|
| For example, we allow these nodes
|     t9: i32 = add t7, Constant:i32<1>
|     t11: i32 = and t9, Constant:i32<255>
|     t12: i64 = zero_extend t11
|     t14: i64 = shl t12, Constant:i64<2>
| to be folded into a rotate-and-mask instruction. Such cases often happen in array accesses with a logical AND operation in the index, e.g. array[i & 0xFF];
|
| Differential Revision: https://reviews.llvm.org/D37514
| llvm-svn: 314655
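The equivalence behind this fold can be checked numerically. This is an illustrative sketch, not the tryBitPermutation logic: it models the four DAG nodes on Python integers and compares them with a single rotate-left-then-mask step; `dag_sequence` and `rotate_and_mask` are hypothetical names.

```python
MASK32 = (1 << 32) - 1
MASK64 = (1 << 64) - 1


def rotl64(value, amount):
    """Rotate a 64-bit value left by amount bits."""
    value &= MASK64
    amount %= 64
    return ((value << amount) | (value >> (64 - amount))) & MASK64


def dag_sequence(t7):
    """add, and, zero_extend, shl exactly as listed in the commit message."""
    t9 = (t7 + 1) & MASK32      # i32 add
    t11 = t9 & 255              # i32 and
    t12 = t11                   # zero_extend i32 -> i64 adds only zero bits
    return (t12 << 2) & MASK64  # i64 shl


def rotate_and_mask(t7):
    """Same result from one rotate by 2 followed by a mask of 0xFF << 2."""
    t9 = (t7 + 1) & MASK32
    return rotl64(t9, 2) & (255 << 2)
```

Both functions agree for any 32-bit input, which is the property that lets the zero-extend sit between the AND and the shift without blocking the fold.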
* Fix typo in comment. NFCI. | Simon Pilgrim | 2017-10-02 | 1 | -1/+1
| llvm-svn: 314653
* [X86] Cleanup uses of computeKnownBits by using MaskedValueIsZero helper instead. NFCI. | Simon Pilgrim | 2017-10-02 | 1 | -6/+3
| llvm-svn: 314652
* [X86][LLVM]Expanding Supports lowerInterleaved{store|load}() in X86InterleavedAccess (VF64 stride 3-4) | Michael Zuckerman | 2017-10-02 | 1 | -109/+169
| I continue to support different VF interleaved and in this pass for this patch, I added the vf64 stride3 support for both load and store. I also added support for the stride4 store.
|
| Reviewers:
| 1. zvi
| 2. dorit
| 3. igorb
| 4. guyblank
|
| Differential Revision: https://reviews.llvm.org/D37687
| Change-Id: I3d238efedf217d1768b348d710de1efa2f19d27b
| llvm-svn: 314651
* [X86] Fix copy pasto in X86FastISel::fastEmitInst_rrrr. | Craig Topper | 2017-10-02 | 1 | -1/+1
| The 4th operand was not being constrained and the third operand was being constrained twice.
| llvm-svn: 314648
* [X86] Use a bool flag instead of assigning an unsigned to two different values that we only use in an equality comparison. | Craig Topper | 2017-10-02 | 1 | -9/+8
| llvm-svn: 314647
* [X86] Use _NOREX MOVZX instructions for some patterns even in 32-bit mode. | Craig Topper | 2017-10-02 | 1 | -32/+6
| This unifies the patterns between both modes. This should be effectively NFC since all the available registers in 32-bit mode satisfy this constraint.
| llvm-svn: 314643