path: root/llvm/lib/Target
Commit message (Author, Age, Files, Lines)
* [WebAssembly] Encode block signatures as SLEB instead of ULEB (Derek Schuff, 2017-04-17, 1 file, -0/+2)
  Use SLEB (varint) for block_type immediates in accordance with the spec.
  Patch by Yury Delendik
  llvm-svn: 300490
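To illustrate why the choice between the two encodings matters, here is a minimal standalone sketch of unsigned and signed LEB128 encoders. The function names mirror common usage but this is not LLVM's implementation; it only assumes the standard LEB128 rules (7 payload bits per byte, continuation bit 0x80) and an arithmetic right shift for negative values.

```cpp
#include <cstdint>
#include <vector>

// Unsigned LEB128: 7 payload bits per byte, bit 7 set on every byte but the last.
std::vector<uint8_t> encodeULEB128(uint64_t Value) {
  std::vector<uint8_t> Out;
  do {
    uint8_t Byte = Value & 0x7f;
    Value >>= 7;
    if (Value != 0)
      Byte |= 0x80; // more bytes follow
    Out.push_back(Byte);
  } while (Value != 0);
  return Out;
}

// Signed LEB128: stop once the remaining value is just the sign extension
// of bit 6 of the byte that was emitted last.
std::vector<uint8_t> encodeSLEB128(int64_t Value) {
  std::vector<uint8_t> Out;
  bool More = true;
  while (More) {
    uint8_t Byte = Value & 0x7f;
    Value >>= 7; // assumption: arithmetic shift for negative values
    bool SignBit = (Byte & 0x40) != 0;
    More = !((Value == 0 && !SignBit) || (Value == -1 && SignBit));
    if (More)
      Byte |= 0x80; // more bytes follow
    Out.push_back(Byte);
  }
  return Out;
}
```

With these helpers, the value 64 takes one byte (0x40) as ULEB but two bytes (0xC0 0x00) as SLEB, while -64 takes the single byte 0x40 as SLEB. A decoder that expects a signed varint therefore reads the same bytes differently from one that expects an unsigned varint.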
* AMDGPU: Use MachineRegisterInfo to find max used register (Matt Arsenault, 2017-04-17, 2 files, -128/+77)
  Avoid looping through program to determine register counts. This avoids needing to look at regmask operands.
  Also fixes some counting errors with flat_scr when there are no stack objects.
  llvm-svn: 300482
* AMDGPU: Change stack alignment (Matt Arsenault, 2017-04-17, 1 file, -2/+4)
  While the incoming stack for a kernel is 256-byte aligned, this refers to the base address of the entire wave. This isn't useful information for most of codegen.
  Fixes unnecessarily aligning stack objects in callees.
  llvm-svn: 300481
* Unbreak build of the wasm backend after r300463. (Benjamin Kramer, 2017-04-17, 1 file, -2/+2)
  llvm-svn: 300479
* AArch64: put nonlazybind special handling behind a flag for now. (Tim Northover, 2017-04-17, 1 file, -1/+6)
  It's basically a terrible idea anyway but objc_msgSend gets emitted like that. We can decide on a better way to deal with it in the unlikely event that anyone actually uses it.
  llvm-svn: 300474
* AMDGPU: Set CodePointerSize to 8 for amdgcn (Konstantin Zhuravlyov, 2017-04-17, 1 file, -0/+1)
  llvm-svn: 300470
* Distinguish between code pointer size and DataLayout::getPointerSize() in DWARF info generation (Konstantin Zhuravlyov, 2017-04-17, 10 files, -15/+15)
  llvm-svn: 300463
* AArch64: support nonlazybind (Tim Northover, 2017-04-17, 3 files, -19/+35)
  It's almost certainly not a good idea to actually use it in most cases (there's a pretty large code size overhead on AArch64), but we can't do those experiments until it's supported.
  llvm-svn: 300462
* [X86] Remove special handling for 16 bit for A asm constraints. (Benjamin Kramer, 2017-04-16, 2 files, -7/+3)
  Our 16 bit support is assembler-only + the terrible hack that is .code16gcc. Simply using 32 bit registers does the right thing for the latter.
  Fixes PR32681.
  llvm-svn: 300429
* Use correct registers for "A" inline asm constraint (Dimitry Andric, 2017-04-15, 2 files, -4/+16)
  Summary: In PR32594, inline assembly using the 'A' constraint on x86_64 causes llvm to crash with a "Cannot select" stack trace. This is because `X86TargetLowering::getRegForInlineAsmConstraint` hardcodes that 'A' means the EAX and EDX registers.
  However, on x86_64 it means the RAX and RDX registers, and on 16-bit x86 (ia16?) it means the old AX and DX registers.
  Add new register classes in `X86RegisterInfo.td` to support these cases, and amend the logic in `getRegForInlineAsmConstraint` to cope with different subtargets.
  Also add a test case, derived from PR32594.
  Reviewers: craig.topper, qcolombet, RKSimon, ab
  Reviewed By: ab
  Subscribers: ab, emaste, royger, llvm-commits
  Differential Revision: https://reviews.llvm.org/D31902
  llvm-svn: 300404
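The register mapping that this commit message describes can be sketched as a simple mode switch. The enum and function below are hypothetical and only illustrate the mapping stated above; the real logic lives in `X86TargetLowering::getRegForInlineAsmConstraint` and works with register classes, not strings. The r300429 entry in this log later dropped the 16-bit case.

```cpp
#include <string>
#include <utility>

// Hypothetical stand-in for the subtarget mode; not an LLVM type.
enum class X86Mode { Bits16, Bits32, Bits64 };

// Returns the (low, high) register pair that the 'A' constraint names
// in each mode, as described in the commit message.
std::pair<std::string, std::string> regsForAConstraint(X86Mode Mode) {
  switch (Mode) {
  case X86Mode::Bits64:
    return {"RAX", "RDX"};
  case X86Mode::Bits32:
    return {"EAX", "EDX"};
  case X86Mode::Bits16:
    return {"AX", "DX"};
  }
  return {"EAX", "EDX"}; // not reached for valid enum values
}
```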
* [RDF] No longer ignore implicit defs or uses on any instructions (Krzysztof Parzyszek, 2017-04-14, 1 file, -23/+0)
  This used to be a Hexagon-specific treatment, but is no longer needed since it's switched to subregister liveness tracking.
  llvm-svn: 300369
* [RDF] Correctly enumerate reg units for reg masks (Krzysztof Parzyszek, 2017-04-14, 1 file, -3/+5)
  llvm-svn: 300368
* [IR] Make paramHasAttr to use arg indices instead of attr indices (Reid Kleckner, 2017-04-14, 3 files, -12/+12)
  This avoids the confusing 'CS.paramHasAttr(ArgNo + 1, Foo)' pattern.
  Previously we were testing return value attributes with index 0, so I introduced hasReturnAttr() for that use case.
  llvm-svn: 300367
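The indexing convention behind this change can be shown with a toy attribute table. This is a simplified sketch, not the real `llvm::AttributeList`: the class name, the string-based attribute kinds, and the storage are all assumptions made for illustration. Only the slot layout (return value at index 0, argument N at index N + 1) comes from the commit message.

```cpp
#include <map>
#include <set>
#include <string>

// Toy attribute table; NOT the real llvm::AttributeList.
class ToyAttrList {
  std::map<unsigned, std::set<std::string>> Slots;

  // Internal slot layout: 0 is the return value, ArgNo + 1 is an argument.
  static unsigned returnSlot() { return 0; }
  static unsigned slotForArg(unsigned ArgNo) { return ArgNo + 1; }

  bool hasAttrAtSlot(unsigned Slot, const std::string &Kind) const {
    auto It = Slots.find(Slot);
    return It != Slots.end() && It->second.count(Kind) != 0;
  }

public:
  void addParamAttr(unsigned ArgNo, const std::string &Kind) {
    Slots[slotForArg(ArgNo)].insert(Kind);
  }
  void addReturnAttr(const std::string &Kind) {
    Slots[returnSlot()].insert(Kind);
  }

  // Callers pass a plain argument number; the +1 stays hidden in here.
  bool paramHasAttr(unsigned ArgNo, const std::string &Kind) const {
    return hasAttrAtSlot(slotForArg(ArgNo), Kind);
  }
  // Separate query for the return value, so nobody passes a magic index 0.
  bool hasReturnAttr(const std::string &Kind) const {
    return hasAttrAtSlot(returnSlot(), Kind);
  }
};
```

Keeping the offset inside the class means a caller can no longer mix up an argument number with an attribute index, which is the confusion the commit message points at.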
* [AMDGPU] set read_only access qualifier for pointers (Stanislav Mekhanoshin, 2017-04-14, 1 file, -3/+8)
  If a kernel's pointer argument is known to be readonly set access qualifier accordingly. This allows RT not to flush caches before dispatches.
  Differential Revision: https://reviews.llvm.org/D32091
  llvm-svn: 300362
* [RDF] Switch RegisterAggr to a bit vector of register units (Krzysztof Parzyszek, 2017-04-14, 5 files, -185/+168)
  This avoids many complications related to the complex register aliasing schemes.
  llvm-svn: 300345
* [RDF] Refine propagation of reached uses in liveness computation (Krzysztof Parzyszek, 2017-04-14, 3 files, -5/+63)
  llvm-svn: 300337
* [Hexagon] Fix a latent problem with interpreting live-in lane masks (Krzysztof Parzyszek, 2017-04-14, 1 file, -5/+7)
  A non-zero lane mask on a register with no subregister means that the whole register is live-in. It is equivalent to a full mask.
  llvm-svn: 300335
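The rule stated in this commit message can be written as a small normalization step. The type alias, constant, and function name below are hypothetical; LLVM has its own `LaneBitmask` type, which this sketch does not use.

```cpp
#include <cstdint>

// Hypothetical lane mask representation: one bit per lane.
using LaneMask = uint32_t;
const LaneMask FullMask = ~LaneMask(0);

// If the register has no subregisters, any non-zero mask means the whole
// register is live-in, so it is treated as a full mask.
LaneMask effectiveLiveInMask(bool HasSubRegs, LaneMask Mask) {
  if (!HasSubRegs && Mask != 0)
    return FullMask;
  return Mask;
}
```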
* [Hexagon] Make a couple of passes compliant with -opt-bisect-limit (Krzysztof Parzyszek, 2017-04-14, 2 files, -0/+5)
  llvm-svn: 300329
* [X86][SSE] Update MOVNTDQA non-temporal loads to generic implementation (LLVM) (Simon Pilgrim, 2017-04-14, 2 files, -12/+6)
  MOVNTDQA non-temporal aligned vector loads can be correctly represented using generic builtin loads, allowing us to remove the existing x86 intrinsics.
  Clang companion patch: D31766.
  Differential Revision: https://reviews.llvm.org/D31767
  llvm-svn: 300325
* [AMDGPU][MC] Corrected ds_write_src2_* to require one offset instead of two. (Dmitry Preobrazhensky, 2017-04-14, 1 file, -14/+2)
  Fixed bug 32551: https://bugs.llvm.org//show_bug.cgi?id=32551
  Reviewers: vpykhtin
  Differential Revision: https://reviews.llvm.org/D31809
  llvm-svn: 300319
* [AMDGPU][MC] Enabled constants for src operands of s_cbranch_g_fork (Dmitry Preobrazhensky, 2017-04-14, 1 file, -1/+1)
  Fixed bug 32619: https://bugs.llvm.org//show_bug.cgi?id=32619
  Reviewers: artem.tamazov, vpykhtin
  Differential Revision: https://reviews.llvm.org/D31973
  llvm-svn: 300318
* This patch closes PR#32216: Better testing of schedule model instruction latencies/throughputs. (Andrew V. Tischenko, 2017-04-14, 10 files, -19/+26)
  The details are here: https://reviews.llvm.org/D30941
  llvm-svn: 300311
* [AMDGPU] added SIInstrInfo::getAddNoCarry() helper (Stanislav Mekhanoshin, 2017-04-14, 4 files, -23/+44)
  Addressed rest of post submit comments from D31993.
  Differential Revision: https://reviews.llvm.org/D32057
  llvm-svn: 300288
* [AArch64] Avoid partial register writes on lane 0 of BUILD_VECTOR for i8/i16/f16 (Adam Nemet, 2017-04-13, 1 file, -3/+8)
  This further improves Ahmed's change in rL299482. See the new comment for the rationale.
  The patch recovers most of the regression for bzip2 after D31965. We're down to +2.68% from +6.97%.
  Differential Revision: https://reviews.llvm.org/D32028
  llvm-svn: 300276
* AMDGPU/GFX9: Do not use v_pack_b32_f16 when packing (Konstantin Zhuravlyov, 2017-04-13, 1 file, -29/+15)
  Differential Revision: https://reviews.llvm.org/D31819
  llvm-svn: 300275
* [IR] Make getParamAttributes take argument numbers, not ArgNo+1 (Reid Kleckner, 2017-04-13, 6 files, -21/+20)
  Add hasParamAttribute() and use it instead of hasAttribute(ArgNo+1, Kind) everywhere.
  The fact that the AttributeList index for an argument is ArgNo+1 should be a hidden implementation detail.
  NFC
  llvm-svn: 300272
* [bpf] Fix memory offset check for loads and stores (Alexei Starovoitov, 2017-04-13, 1 file, -2/+2)
  If the offset cannot fit into the instruction, an addition to the pointer is emitted before the actual access. However, BPF offsets are 16-bit but LLVM considers them to be, for the matter of this check, to be 32-bit long.

  This causes the following program:

      int bpf_prog1(void *ign)
      {
          volatile unsigned long t = 0x8983984739ull;
          return *(unsigned long *)((0xffffffff8fff0002ull) + t);
      }

  To generate the following (wrong) code:

      0: 18 01 00 00 39 47 98 83 00 00 00 00 89 00 00 00 r1 = 590618314553ll
      2: 7b 1a f8 ff 00 00 00 00 *(u64 *)(r10 - 8) = r1
      3: 79 a1 f8 ff 00 00 00 00 r1 = *(u64 *)(r10 - 8)
      4: 79 10 02 00 00 00 00 00 r0 = *(u64 *)(r1 + 2)
      5: 95 00 00 00 00 00 00 00 exit

  Fix it by changing the offset check to 16-bit.

  Patch by Nadav Amit <nadav.amit@gmail.com>
  Signed-off-by: Alexei Starovoitov <ast@kernel.org>
  Differential Revision: https://reviews.llvm.org/D32055
  llvm-svn: 300269
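The effect of checking against the wrong width can be reproduced with a small range test. This is a standalone sketch: `fitsSigned` and `low16` are hypothetical helpers written for this example (LLVM has a similar `isInt<N>` helper, but the sketch does not depend on it), and the constant is the offset from the program above written as a signed 64-bit value.

```cpp
#include <cstdint>

// True if Value is representable as a signed integer of the given width.
template <unsigned Bits> bool fitsSigned(int64_t Value) {
  static_assert(Bits > 0 && Bits < 64, "width must be between 1 and 63");
  return Value >= -(int64_t(1) << (Bits - 1)) &&
         Value < (int64_t(1) << (Bits - 1));
}

// Same bit pattern as 0xffffffff8fff0002ull from the example program.
const int64_t ExampleOffset = -0x7000FFFELL;

// What survives if the offset is stored in a 16-bit instruction field.
uint16_t low16(int64_t Value) { return static_cast<uint16_t>(Value); }
```

`fitsSigned<32>(ExampleOffset)` is true, so a 32-bit check lets the constant be folded into the instruction, while `fitsSigned<16>(ExampleOffset)` is false. Truncating to the 16-bit field leaves `low16(ExampleOffset) == 2`, which matches the wrong `r1 + 2` in the disassembly.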
* Fix -Wunused-value warning (Reid Kleckner, 2017-04-13, 1 file, -6/+6)
  llvm-svn: 300254
* [AMDGPU] Combine DS operations with offsets bigger than byte (Stanislav Mekhanoshin, 2017-04-13, 1 file, -150/+166)
  In many cases ds operations can be combined even if offsets do not fit into 8 bit encoding. What it takes is to adjust base address.
  Differential Revision: https://reviews.llvm.org/D31993
  llvm-svn: 300227
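The base-adjustment idea can be sketched as follows. This is a simplified model under stated assumptions, not the logic of the actual pass: it assumes two byte offsets from a common base, an 8-bit offset field counted in units of the element size, and it ignores any other hardware constraints or alternative encodings. All names are hypothetical.

```cpp
#include <algorithm>
#include <cstdint>

// Result of a successful combine: add BaseAdjust (in bytes) to the base
// address, then use the two 8-bit element offsets.
struct CombinedOffsets {
  uint32_t BaseAdjust;
  uint8_t Off0;
  uint8_t Off1;
};

// Tries to express two byte offsets as 8-bit element offsets, moving the
// base address forward when the raw offsets are too large.
bool tryCombineOffsets(uint32_t ByteOff0, uint32_t ByteOff1, uint32_t EltSize,
                       CombinedOffsets &Out) {
  if (EltSize == 0 || ByteOff0 % EltSize != 0 || ByteOff1 % EltSize != 0)
    return false;
  uint32_t Elt0 = ByteOff0 / EltSize;
  uint32_t Elt1 = ByteOff1 / EltSize;
  uint32_t Lo = std::min(Elt0, Elt1);
  uint32_t Hi = std::max(Elt0, Elt1);
  uint32_t Adjust = 0;
  if (Hi > 255) {
    // Offsets do not fit as they are; they still combine if they are
    // close enough to each other once the base is moved to the lower one.
    if (Hi - Lo > 255)
      return false;
    Adjust = Lo;
  }
  Out.BaseAdjust = Adjust * EltSize;
  Out.Off0 = static_cast<uint8_t>(Elt0 - Adjust);
  Out.Off1 = static_cast<uint8_t>(Elt1 - Adjust);
  return true;
}
```

For example, byte offsets 4096 and 4100 with 4-byte elements do not fit an 8-bit field directly, but after adding 4096 to the base they become element offsets 0 and 1.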
* [Hexagon] Implement HexagonTargetLowering::CanLowerReturn (Krzysztof Parzyszek, 2017-04-13, 2 files, -12/+18)
  Patch by Michael Wu.
  Differential Revision: https://reviews.llvm.org/D32000
  llvm-svn: 300199
* [Hexagon] Fix "LowerFormalArguments emitted a value with the wrong type!" assertion (Krzysztof Parzyszek, 2017-04-13, 1 file, -1/+1)
  Patch by Michael Wu.
  Differential Revision: https://reviews.llvm.org/D31999
  llvm-svn: 300198
* Use methods to access data stored with frame instructions (Serge Pavlov, 2017-04-13, 5 files, -20/+28)
  Instructions CALLSEQ_START..CALLSEQ_END and their target dependent counterparts keep data like frame size, stack adjustment etc. These data are accessed by getOperand using hard coded indices. It is error prone way. This change implements the access by special methods, which improve readability and allow changing data representation without massive changes of index values.
  Differential Revision: https://reviews.llvm.org/D31953
  llvm-svn: 300196
* [X86] Added missing mayLoad/mayStore attributes to some X86 instructions. (Ayman Musa, 2017-04-13, 7 files, -19/+55)
  Throughout the effort of automatically generating the X86 memory folding tables these missing information were encountered.
  This is a preparation work for a future patch including the automation of these tables.
  Differential Revision: https://reviews.llvm.org/D31714
  llvm-svn: 300190
* [X86] Change instructions names to keep consistency with the naming convention. NFC (Ayman Musa, 2017-04-13, 1 file, -2/+2)
  Differential Revision: https://reviews.llvm.org/D31743
  llvm-svn: 300184
* [IR] Take func, ret, and arg attrs separately in AttributeList::get (Reid Kleckner, 2017-04-13, 1 file, -8/+6)
  This seems like a much more natural API, based on Derek Schuff's comments on r300015. It further hides the implementation detail of AttributeList that function attributes come last and appear at index ~0U, which is easy for the user to screw up.
  git diff says it saves code as well: 97 insertions(+), 137 deletions(-)
  This also makes it easier to change the implementation, which I want to do next.
  llvm-svn: 300153
* AMDGPU : Fix common dominator of two incoming blocks terminates with uniform branch issue. (Wei Ding, 2017-04-12, 1 file, -2/+24)
  Differential Revision: http://reviews.llvm.org/D31350
  llvm-svn: 300142
* AMDGPU: Fix invalid copies when copying i1 to phys reg (Matt Arsenault, 2017-04-12, 3 files, -4/+30)
  Insert a VReg_1 virtual register so the i1 workaround pass can handle it.
  llvm-svn: 300113
* [AMDGPU] Generate range metadata for workitem id (Stanislav Mekhanoshin, 2017-04-12, 6 files, -24/+118)
  If workgroup size is known inform llvm about range returned by local id and local size queries.
  Differential Revision: https://reviews.llvm.org/D31804
  llvm-svn: 300102
* [AMDGPU][MC] Added support for several VI-specific opcodes (s_wakeup, etc) (Dmitry Preobrazhensky, 2017-04-12, 3 files, -1/+37)
  Added support for VI:
  - s_endpgm_saved
  - s_wakeup
  - s_rfe_restore_b64
  - v_perm_b32
  Enabled for VI:
  - v_mov_fed_b32
  - v_mov_fed_b32_e64
  See bug 32593: https://bugs.llvm.org//show_bug.cgi?id=32593
  Reviewers: artem.tamazov, vpykhtin
  Differential Revision: https://reviews.llvm.org/D31931
  llvm-svn: 300076
* [AMDGPU][MC] Corrected parsing of v_cmp_class* and v_cmpx_class* (Dmitry Preobrazhensky, 2017-04-12, 2 files, -2/+4)
  Fixed bug 32565: https://bugs.llvm.org//show_bug.cgi?id=32565
  Reviewers: vpykhtin
  Differential Revision: https://reviews.llvm.org/D31820
  llvm-svn: 300073
* [WebAssembly] Update use of Attributes after r299875 (Derek Schuff, 2017-04-12, 1 file, -17/+14)
  This fixes the failing WebAssemblyLowerEmscriptenEHSjLj tests
  llvm-svn: 300072
* [AMDGPU][MC] Corrected encoding of V_MQSAD_U32_U8 for CI (Dmitry Preobrazhensky, 2017-04-12, 1 file, -1/+1)
  Corrected encoding of V_MQSAD_U32_U8 for CI
  See bug 32552: https://bugs.llvm.org//show_bug.cgi?id=32552
  Reviewers: vpykhtin
  Differential Revision: https://reviews.llvm.org/D31810
  llvm-svn: 300070
* Fix the bootstrap failure caused by r299986. (Easwaran Raman, 2017-04-12, 1 file, -0/+4)
  llvm-svn: 300069
* [AMDGPU][MC] Corrected ds_wrxchg2* to support two offsets (Dmitry Preobrazhensky, 2017-04-12, 1 file, -7/+21)
  Fixed bug 28227: https://bugs.llvm.org//show_bug.cgi?id=28227
  Reviewers: vpykhtin
  Differential Revision: https://reviews.llvm.org/D31808
  llvm-svn: 300066
* [GlobalIsel][X86] support G_CONSTANT selection. (Igor Breger, 2017-04-12, 1 file, -1/+52)
  Summary: [GlobalISel][X86] support G_CONSTANT selection. Add regbank select tests.
  Reviewers: zvi, guyblank
  Reviewed By: guyblank
  Subscribers: llvm-commits, dberris, rovka, kristof.beyls
  Differential Revision: https://reviews.llvm.org/D31974
  llvm-svn: 300057
* [LoopVectorizer, TTI] New method supportsEfficientVectorElementLoadStore() (Jonas Paulsson, 2017-04-12, 1 file, -0/+1)
  Since SystemZ supports vector element load/store instructions, there is no need for extracts/inserts if a vector load/store gets scalarized.
  This patch lets Target specify that it supports such instructions by means of a new TTI hook that defaults to false.
  The use for this is in the LoopVectorizer getScalarizationOverhead() method, which will with this patch produce a smaller sum for a vector load/store on SystemZ.
  New test: test/Transforms/LoopVectorize/SystemZ/load-store-scalarization-cost.ll
  Review: Adam Nemet
  https://reviews.llvm.org/D30680
  llvm-svn: 300056
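The effect of such a hook on a scalarization cost estimate can be sketched with a toy cost function. Everything here is hypothetical: the struct, the field, the function name, and the cost formula are illustrative assumptions and do not reproduce LLVM's `getScalarizationOverhead()`. Only the idea that the hook defaults to false and removes the insert/extract part of the sum comes from the commit message.

```cpp
// Toy target description; the field mirrors the new hook and, like the
// hook, defaults to false.
struct ToyTarget {
  bool supportsEfficientVectorElementLoadStore = false;
};

// Toy estimate for a vector load/store that gets scalarized: one scalar
// memory operation per element, plus one insert or extract per element
// unless the target can load/store vector elements directly.
unsigned scalarizedMemOpCost(const ToyTarget &Target, unsigned NumElts,
                             unsigned ScalarMemCost,
                             unsigned InsertExtractCost) {
  unsigned Cost = NumElts * ScalarMemCost;
  if (!Target.supportsEfficientVectorElementLoadStore)
    Cost += NumElts * InsertExtractCost;
  return Cost;
}
```

With unit costs and four elements, a default target is charged 8 while a target that sets the flag is charged 4, which is the "smaller sum" the commit message refers to.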
* [AMDGPU][MC] Corrected src0 size for s_cbranch_join (Dmitry Preobrazhensky, 2017-04-12, 1 file, -1/+7)
  Fix for bug 28159: https://bugs.llvm.org//show_bug.cgi?id=28159
  Reviewers: vpykhtin, arsenm
  Differential Revision: https://reviews.llvm.org/D31595
  llvm-svn: 300055
* [SystemZ] TargetTransformInfo cost functions implemented. (Jonas Paulsson, 2017-04-12, 11 files, -32/+620)
  getArithmeticInstrCost(), getShuffleCost(), getCastInstrCost(), getCmpSelInstrCost(), getVectorInstrCost(), getMemoryOpCost(), getInterleavedMemoryOpCost() implemented.
  Interleaved access vectorization enabled.
  BasicTTIImpl::getCastInstrCost() improved to check for legal extending loads, in which case the cost of the z/sext instruction becomes 0.
  Review: Ulrich Weigand, Renato Golin.
  https://reviews.llvm.org/D29631
  llvm-svn: 300052
* [AMDGPU] SDWA: make pass global (Sam Kolton, 2017-04-12, 1 file, -183/+175)
  Summary: Remove checks for basic blocks.
  Reviewers: vpykhtin, rampitec, arsenm
  Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye
  Differential Revision: https://reviews.llvm.org/D31935
  llvm-svn: 300040
* [AMDGPU] Add a new pass to insert waitcnts. Leave under an option for testing. (Kannan Narayanan, 2017-04-12, 5 files, -1/+1881)
  Based on comments in https://reviews.llvm.org/D31161.
  llvm-svn: 300023