path: root/llvm/lib/Target
Commit message / Author / Age / Files / Lines
* [AMDGPU] Assembler: Support DPP instructions.Sam Kolton2016-03-098-46/+350
| | | | | | | | | | | | | | | | | | | | Support DPP syntax as used in SP3 (except several operands syntax). Added dpp-specific operands in td-files. Added DPP flag to TSFlags to determine if instruction is dpp in InstPrinter. Support for VOP2 DPP instructions in td-files. Some tests for DPP instructions. ToDo: - VOP2bInst: - vcc is considered as operand - AsmMatcher doesn't apply mnemonic aliases when parsing operands - v_mac_f32 - v_nop - disable instructions with 64-bit operands - change dpp_ctrl assembler representation to conform to sp3 Review: http://reviews.llvm.org/D17804 llvm-svn: 263008
* [AMDGPU] Assembler: Support abs() syntax.Nikolay Haustov2016-03-091-2/+19
| | | | | | | | | Support legacy SP3 abs(v1) syntax. InstPrinter still uses |v1|. Add tests. Differential Revision: http://reviews.llvm.org/D17887 llvm-svn: 263006
* [AMDGPU] Assembler: Fix s_setpc_b64Nikolay Haustov2016-03-091-1/+1
| | | | | | | | s_setpc_b64 has just one 64-bit source which is the address of instruction to jump to. Differential Revision: http://reviews.llvm.org/D17888 llvm-svn: 263005
* [WebAssembly] Update comments about irreducible control flow.Dan Gohman2016-03-092-8/+13
| | | | llvm-svn: 262995
* [WebAssembly] Implement irreducible control flow.Dan Gohman2016-03-095-35/+297
| | | | | | | | This implements a very simple conservative transformation that doesn't require more than linear code size growth. There's room for much more optimization in this space. llvm-svn: 262982
* Revert r262759 and r262760.Quentin Colombet2016-03-081-9/+0
| | | | | | | | The fix, which uses the library call for atomic compare and swap when the instruction is not safe to use, may be incorrect: the library call may not exist on all platforms. In other words, we need a better fix! llvm-svn: 262943
* [AArch64] Add MMOs to unscaled pairs.Chad Rosier2016-03-081-3/+2
| | | | | | | Test to be committed in follow up commit, per discussion in D17097. http://reviews.llvm.org/D17097 llvm-svn: 262942
* [ARM] Simplify ARMInstr*.td by getting rid of identity PatFrags (NFC)Artyom Skrobov2016-03-083-107/+74
| | | | | | | | | | Reviewers: t.p.northover, grosbach, resistor Subscribers: aemerson, rengolin, llvm-commits Differential Revision: http://reviews.llvm.org/D17636 llvm-svn: 262936
* Revert r262599 "[X86][SSE] Improve vector ZERO_EXTEND by combining to ↵Hans Wennborg2016-03-081-24/+9
| | | | | | | | ZERO_EXTEND_VECTOR_INREG" This caused PR26870. llvm-svn: 262935
* AVX512: Add extract_subvector patterns v8i1->v4i1 , v4i1->v2i1.Igor Breger2016-03-081-0/+8
| | | | | | Differential Revision: http://reviews.llvm.org/D17953 llvm-svn: 262929
* [Power9] Implement new vsx instructions: load, store instructions for vector ↵Kit Barton2016-03-087-0/+214
| | | | | | | | | | | | | | | | | | | | and scalar We follow the comments mentioned in http://reviews.llvm.org/D16842#344378 to implement this new patch. This patch implements the following vsx instructions: Vector load/store: lxv lxvx lxvb16x lxvl lxvll lxvh8x lxvwsx stxv stxvb16x stxvh8x stxvl stxvll stxvx Scalar load/store: lxsd lxssp lxsibzx lxsihzx stxsd stxssp stxsibx stxsihx 21 instructions Phabricator: http://reviews.llvm.org/D16919 llvm-svn: 262906
* [WebAssembly] Update for spec change from tableswitch to br_table.Dan Gohman2016-03-085-18/+18
| | | | | | | Also note that the operand order changed; the default label is now listed after the regular labels. llvm-svn: 262903
* [AArch64] Initialize GlobalISel as part of the target initialization.Quentin Colombet2016-03-081-0/+2
| | | | llvm-svn: 262897
* A couple more UB fixes for C++14 sized deallocation.Richard Smith2016-03-081-0/+4
| | | | llvm-svn: 262891
* AMDGPU: Match more med3 integer patternsMatt Arsenault2016-03-072-0/+33
| | | | llvm-svn: 262864
* AMDGPU: Remove a fixme for ptrtoint handlingMatt Arsenault2016-03-071-1/+0
| | | | llvm-svn: 262854
* AMDGPU: Move function only used by R600Matt Arsenault2016-03-074-18/+17
| | | | llvm-svn: 262853
* [ms-inline-asm][AVX512] Add ability to use k registers in MS inline asm + ↵Marina Yatsina2016-03-071-1/+11
| | | | | | | | | | | | | | | | | | | | | | fix bug with curly braces Until now curly braces could only be used in MS inline assembly to mark block start/end. All curly braces were removed completely at a very early stage. This approach caused bugs like: "m{o}v eax, ebx" turned into "mov eax, ebx" without any error. In addition, AVX-512 added special operands (e.g., k registers), which are also surrounded by curly braces that mark them as such. Now, we need to keep the curly braces and identify at a later stage if they are marking block start/end (if so, ignore them), or surrounding special AVX-512 operands (if so, parse them as such). This patch fixes the bug described above and enables the use of AVX-512 special operands. This commit is the llvm part of the patch. The clang part of the review is: http://reviews.llvm.org/D17766 The llvm part of the review is: http://reviews.llvm.org/D17767 Differential Revision: http://reviews.llvm.org/D17767 llvm-svn: 262843
* [X86][AVX512] Fixed VPERMT2* shuffle mask decoding and enabled target ↵Simon Pilgrim2016-03-063-11/+9
| | | | | | | | | | | | | | shuffle combining. Patch to add support for target shuffle combining of X86ISD::VPERMV3 nodes, including support for detecting unary shuffles. This uncovered several issues with the X86ISD::VPERMV3 shuffle mask decoding of non-64 bit shuffle mask elements - the bit masking wasn't being correctly computed. Removed non-constant pool mask decode path as we have no way of testing it right now. Differential Revision: http://reviews.llvm.org/D17916 llvm-svn: 262809
* [AMDGPU] Using table-driven amd_kernel_code_t field parser in assembler.Valery Pykhtin2016-03-062-157/+8
| | | | | | | | Engages code from r262804. Differential Revision: http://reviews.llvm.org/D17151 llvm-svn: 262808
* fix sanitizer-ppc64be-linux failure for r262804Valery Pykhtin2016-03-061-1/+1
| | | | | | | | error: moving a local object in a return statement prevents copy elision [-Werror,-Wpessimizing-move] http://lab.llvm.org:8011/builders/sanitizer-ppc64be-linux/builds/930 llvm-svn: 262805
* [AMDGPU] table-driven parser/printer for amd_kernel_code_t structure fieldsValery Pykhtin2016-03-064-0/+369
| | | | | | Differential Revision: http://reviews.llvm.org/D17150 llvm-svn: 262804
* AVX512BW: Support llvm intrinsic masked vector load/store for i8/i16 element ↵Igor Breger2016-03-062-37/+56
| | | | | | | | types on SKX Differential Revision: http://reviews.llvm.org/D17913 llvm-svn: 262803
* [AMDGPU] SOPxx instructions operand naming fixed in td files.Valery Pykhtin2016-03-063-68/+68
| | | | | | | | | dst -> sdst ssrcN -> srcN Differential Revision: http://reviews.llvm.org/D17646 llvm-svn: 262801
* [X86] Use high bits of return value from getEncoding instead of predicate ↵Craig Topper2016-03-061-162/+101
| | | | | | functions to populate the REX and VEX prefix bits that extend register encodings. NFC llvm-svn: 262800
* [X86] Remove unnecessary masking. The assert above it already guaranteed it. NFCCraig Topper2016-03-061-2/+0
| | | | llvm-svn: 262799
* [X86] Use uint8_t instead of unsigned char as it shortens the code and more ↵Craig Topper2016-03-061-27/+26
| | | | | | explicitly reflects the desired size. llvm-svn: 262798
* AVX512: Remove VSHRI kmask patterns from TD file. It is incorrect to use ↵Igor Breger2016-03-062-87/+80
| | | | | | | | kshiftw to implement VSHRI v4i1: bits 15-4 are undef, so the upper bits of v4i1 may not be zeroed. v4i1 should be zero-extended to v16i1 (or any natively supported vector). Differential Revision: http://reviews.llvm.org/D17763 llvm-svn: 262797
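The problem described in this commit can be illustrated with a small bit-level model. This is a sketch, not LLVM code: the function names are hypothetical, and it assumes a v4i1 value lives in bits 3-0 of a 16-bit mask register whose bits 15-4 hold arbitrary data. A right shift of the whole register pulls bit 4 into the v4i1 range, so the result is only correct if the value is zero-extended first.

```python
def kshiftrw(k, imm):
    """Model a logical right shift of a 16-bit mask register."""
    return (k & 0xFFFF) >> imm

def shift_v4i1_naive(k, imm):
    # Shift the whole register, then read the low 4 bits as the v4i1 result.
    # Bits 15-4 of k are undefined, so they can leak into the result.
    return kshiftrw(k, imm) & 0xF

def shift_v4i1_zext(k, imm):
    # Zero-extend v4i1 to v16i1 first (clear bits 15-4), then shift.
    return kshiftrw(k & 0xF, imm) & 0xF

# v4i1 = 0b0101 in bits 3-0, with a stray 1 in bit 4 (undefined content).
reg = 0b1_0101
naive = shift_v4i1_naive(reg, 1)   # bit 4 leaks into element 3
fixed = shift_v4i1_zext(reg, 1)    # element 3 is zero, as a vector shift requires
```

With this input the naive version yields 0b1010 while the zero-extended version yields 0b0010, which is the value a logical shift of the four-element vector should produce.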
* [X86][AVX] Improved VPERMILPS variable shuffle mask decoding.Simon Pilgrim2016-03-053-1/+43
| | | | | | | | | | Added support for decoding VPERMILPS variable shuffle masks that aren't in the constant pool. Added target shuffle mask decoding for SCALAR_TO_VECTOR+VZEXT_MOVL cases - these can happen for v2i64 constant re-materialization Followup to D17681 llvm-svn: 262784
* [X86] AMD Bobcat CPU (btver1) doesn't support XSAVE Simon Pilgrim2016-03-051-1/+0
| | | | | | | | btver1 is a SSSE3/SSE4a only CPU - it doesn't have AVX and doesn't support XSAVE. Differential Revision: http://reviews.llvm.org/D17683 llvm-svn: 262782
* [X86] Fix the lowering of setjmp intrinsic on i386.Quentin Colombet2016-03-051-0/+10
| | | | | | | | | | When the lowering of the setjmp intrinsic requires a global base pointer to be set, make sure such pointer gets defined by the CGBR pass. This fixes PR26742. llvm-svn: 262762
* [X86] Do not use cmpxchgXXb when we need the base pointer (RBX).Quentin Colombet2016-03-041-0/+9
| | | | | | | | | | | | cmpxchgXXb uses RBX as one of its implicit arguments, i.e., when we use that instruction we need to clobber RBX. This is generally fine, except when RBX is a reserved register, because in that case the register allocator will not track its value and will not save and restore it when interferences occur. rdar://problem/24851412 llvm-svn: 262759
* Fix build breakageDavid Majnemer2016-03-041-4/+5
| | | | llvm-svn: 262756
* [X86] Support cleaning more than 2**16 bytes of stackDavid Majnemer2016-03-046-8/+35
| | | | | | | | | | | | | | | | | | | The x86 ret instruction has a 16 bit immediate indicating how many bytes to pop off of the stack beyond the return address. There is a problem when extremely large structs are passed by value: we might not be able to fit the number of bytes to pop into the return instruction. To fix this, expand RET_FLAG a little later and use a special sequence to clean the stack:
    pop %ecx       ; return address is now in %ecx
    add $n, %esp   ; clean the stack
    push %ecx      ; bring the return address back on the stack
    ret            ; pop the return address and jmp to its value
llvm-svn: 262755
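The choice between the plain return and the special sequence can be sketched as follows. This is an illustrative Python model rather than the LLVM implementation: the helper name `emit_return` and the string form of the instructions are assumptions made for the example, and the only fact taken from the commit is that the immediate is 16 bits wide.

```python
RET_IMM_MAX = 0xFFFF  # largest byte count a 16-bit immediate can encode

def emit_return(bytes_to_pop):
    """Return AT&T-syntax instructions for a callee-cleanup return (hypothetical helper)."""
    if bytes_to_pop < 0:
        raise ValueError("byte count must be non-negative")
    if bytes_to_pop == 0:
        return ["ret"]
    if bytes_to_pop <= RET_IMM_MAX:
        # The count fits in the immediate, so ret can clean the stack itself.
        return [f"ret ${bytes_to_pop}"]
    # Too large for the immediate: move the return address out of the way,
    # adjust the stack pointer, then put the return address back.
    return [
        "pop %ecx",
        f"add ${bytes_to_pop}, %esp",
        "push %ecx",
        "ret",
    ]
```

For example, `emit_return(65536)` falls through to the four-instruction sequence, while `emit_return(12)` produces a single `ret $12`.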
* [WebAssembly] Add another possible code-size optimization to README.txtDan Gohman2016-03-041-0/+6
| | | | llvm-svn: 262740
* [ARM] Merging 64-bit divmod lib calls into oneRenato Golin2016-03-041-0/+9
| | | | | | | | | | | | | | | | | | | | | When div+rem calls on the same arguments are found, the ARM back-end merges the two calls into one __aeabi_divmod call for values of up to 32 bits. However, for 64-bit values, which also have a lib call (__aeabi_ldivmod), it wasn't merging the calls, and thus calling ldivmod twice and spilling the temporary results, which generated pretty bad code. This patch legalises 64-bit lib calls for divmod, so that now all the spilling and the second call are gone. It also relaxes the DivRem combiner a bit on the legal type check, since it was already checking for isLegalOrCustom on every value, so the extra check for isTypeLegal was redundant. Second attempt, creating TLI.isOperationCustom like isOperationExpand, to make sure we only emit valid types or the ones that were explicitly marked as custom. Now, passing check-all and test-suite on x86, ARM and AArch64. This patch fixes PR17193 (and a long-standing FIXME in the tests). llvm-svn: 262738
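The merging idea in this commit can be shown on a toy instruction list. This is a simplified sketch under stated assumptions, not the DivRem combiner itself: operations are plain `(opcode, lhs, rhs)` tuples, the function name `merge_divrem` is hypothetical, and type legality checks are left out entirely.

```python
def merge_divrem(ops):
    """Replace a div and a rem on the same operands with a single divmod (toy model)."""
    # First pass: record which of div/rem appear for each operand pair.
    seen = {}
    for opcode, lhs, rhs in ops:
        if opcode in ("div", "rem"):
            seen.setdefault((lhs, rhs), set()).add(opcode)

    # Second pass: emit one divmod where both were present, keep everything else.
    merged = []
    emitted = set()
    for opcode, lhs, rhs in ops:
        key = (lhs, rhs)
        if opcode in ("div", "rem") and seen[key] == {"div", "rem"}:
            if key not in emitted:
                merged.append(("divmod", lhs, rhs))
                emitted.add(key)
        else:
            merged.append((opcode, lhs, rhs))
    return merged
```

A lone `div` or `rem` is left untouched, since a combined call only pays off when both results are needed.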
* AMDGPU/SI: Add support for spilling SGPRs to scratch bufferTom Stellard2016-03-045-30/+81
| | | | | | | | | | | | | | Summary: This is necessary for when we run out of VGPRs and can no longer use v_{read,write}_lane for spilling SGPRs. Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D17592 llvm-svn: 262732
* AMDGPU/SI: Enable frame index scavenging during PrologEpilogueInserterTom Stellard2016-03-042-8/+16
| | | | | | | | | | | | | | | | | | | | | Summary: This allows us to use virtual registers when we need extra registers for inserting spill instructions in SIRegisterInfo::eliminateFrameIndex(). Once all the frame indices have been eliminated, the PrologEpilogueInserter does an extra pass over the program to replace all virtual registers with physical ones. This allows us to make more efficient use of our emergency spill slots, so we only need to create one. Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D17591 llvm-svn: 262728
* [Hexagon] Fix lowering of calls with the return type of i1Krzysztof Parzyszek2016-03-041-10/+30
| | | | | | | This fixes an assertion in test/CodeGen/Hexagon/ifcvt-edge-weight.ll when run with -debug-only=isel llvm-svn: 262726
* [mips][microMIPS] Prevent usage of OR16_MMR6 instruction when code for ↵Zoran Jovanovic2016-03-042-2/+2
| | | | | | | | | | microMIPS is generated. Author: milena.vujosevic.janicic Reviewers: dsanders Differential Revision: http://reviews.llvm.org/D17373 llvm-svn: 262725
* Test commit accessSam Kolton2016-03-041-1/+1
| | | | llvm-svn: 262714
* test commitValery Pykhtin2016-03-041-1/+1
| | | | llvm-svn: 262709
* Make headers self-contained again.Benjamin Kramer2016-03-041-0/+1
| | | | llvm-svn: 262702
* AMDGPU/SI: add llvm.amdgcn.image.atomic.* intrinsicsNikolay Haustov2016-03-044-32/+169
| | | | | | | | | | | These correspond to IMAGE_ATOMIC_* and are going to be used by Mesa for the GL_ARB_shader_image_load_store extension. Initial change by Nicolai Hähnle Differential Revision: http://reviews.llvm.org/D17401 llvm-svn: 262701
* [X86][AVX512BW] Fixed 512-bit PSHUFB shuffle mask decode and added combine test.Simon Pilgrim2016-03-031-3/+3
| | | | | | PSHUFB decoder was assuming that input was 128 or 256-bit vector only. llvm-svn: 262661
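PSHUFB shuffles bytes independently within each 128-bit lane, so a decoder that handles any vector width only needs to work lane by lane. The commit does not show the fixed decoder; the following is a sketch of that per-lane decoding, with the function name `decode_pshufb_mask` and the use of `None` for zeroed bytes chosen for the example.

```python
def decode_pshufb_mask(raw_mask):
    """Decode PSHUFB control bytes into source byte indices (None means zeroed)."""
    if len(raw_mask) % 16 != 0:
        raise ValueError("mask length must be a multiple of 16 bytes")
    decoded = []
    for i, ctrl in enumerate(raw_mask):
        if ctrl & 0x80:
            # High bit set: the destination byte is zeroed.
            decoded.append(None)
        else:
            # Low 4 bits select a byte within the same 128-bit lane.
            lane_base = i & ~0xF
            decoded.append(lane_base + (ctrl & 0xF))
    return decoded
```

Because nothing here depends on the total length beyond it being a multiple of 16, the same loop covers 128-bit, 256-bit and 512-bit vectors.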
* [X86][AVX] Better support for the variable mask form of VPERMILPD/VPERMILPSSimon Pilgrim2016-03-033-19/+35
| | | | | | | | | | The variable mask form of VPERMILPD/VPERMILPS were only partially implemented, with much of it still performed as an intrinsic. This patch properly defines the instructions in terms of X86ISD::VPERMILPV, permitting the opcode to be easily combined as a target shuffle. Differential Revision: http://reviews.llvm.org/D17681 llvm-svn: 262635
* [X86] Tidied up 256-bit -> 2 x 128-bit vector shift extraction.Simon Pilgrim2016-03-031-14/+2
| | | | | | lowerShift was manually splitting BUILD_VECTOR cases when it could just call Extract128BitVector which does this anyway. llvm-svn: 262633
* [X86] Pulled out repeated code testing for constant vector shift amount. NFCI.Simon Pilgrim2016-03-031-8/+6
| | | | llvm-svn: 262631
* MCU target has its own ABI, however X86 interrupt handler calling convention ↵Amjad Aboud2016-03-031-1/+3
| | | | | | | | | | overrides this ABI. Fixed the ordering to check first for X86 interrupt handler then for MCU target. Differential Revision: http://reviews.llvm.org/D17801 llvm-svn: 262628
* [X86] Don't assume that shuffle non-mask operands starts at #0.Ahmed Bougacha2016-03-032-32/+68
| | | | | | | | | | | | | | | | | | | | | | | | | That's not the case for VPERMV/VPERMV3, which cover all possible combinations (the C intrinsics use a different order; the AVX vs AVX512 intrinsics are different still). Since: r246981 AVX-512: Lowering for 512-bit vector shuffles. VPERMV is recognized in getTargetShuffleMask. This breaks assumptions in most callers, as they expect the non-mask operands to start at index 0. VPERMV has the mask as operand #0; VPERMV3 has it in the middle. Instead of the faulty assumption, have getTargetShuffleMask return its operands as well. One alternative we considered was to change the operand order of VPERMV, but we agreed to stick to the instruction order, as there is more AVX512 weirdness to cover (vpermt2/vpermi2 in particular). Differential Revision: http://reviews.llvm.org/D17041 llvm-svn: 262627