summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Target
Commit message (Collapse)AuthorAgeFilesLines
* [AMDGPU] added SIInstrInfo::getAddNoCarry() helperStanislav Mekhanoshin2017-04-144-23/+44
| | | | | | | | Addressed rest of post submit comments from D31993. Differential Revision: https://reviews.llvm.org/D32057 llvm-svn: 300288
* [AArch64] Avoid partial register writes on lane 0 of BUILD_VECTOR for i8/i16/f16Adam Nemet2017-04-131-3/+8
| | | | | | | | | | | | This further improves Ahmed's change in rL299482. See the new comment for the rationale. The patch recovers most of the regression for bzip2 after D31965. We're down to +2.68% from +6.97%. Differential Revision: https://reviews.llvm.org/D32028 llvm-svn: 300276
* AMDGPU/GFX9: Do not use v_pack_b32_f16 when packingKonstantin Zhuravlyov2017-04-131-29/+15
| | | | | | Differential Revision: https://reviews.llvm.org/D31819 llvm-svn: 300275
* [IR] Make getParamAttributes take argument numbers, not ArgNo+1Reid Kleckner2017-04-136-21/+20
| | | | | | | | | | | | Add hasParamAttribute() and use it instead of hasAttribute(ArgNo+1, Kind) everywhere. The fact that the AttributeList index for an argument is ArgNo+1 should be a hidden implementation detail. NFC llvm-svn: 300272
* [bpf] Fix memory offset check for loads and storesAlexei Starovoitov2017-04-131-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If the offset cannot fit into the instruction, an addition to the pointer is emitted before the actual access. However, BPF offsets are 16-bit but LLVM considers them to be, for the matter of this check, to be 32-bit long. This causes the following program: int bpf_prog1(void *ign) { volatile unsigned long t = 0x8983984739ull; return *(unsigned long *)((0xffffffff8fff0002ull) + t); } To generate the following (wrong) code: 0: 18 01 00 00 39 47 98 83 00 00 00 00 89 00 00 00 r1 = 590618314553ll 2: 7b 1a f8 ff 00 00 00 00 *(u64 *)(r10 - 8) = r1 3: 79 a1 f8 ff 00 00 00 00 r1 = *(u64 *)(r10 - 8) 4: 79 10 02 00 00 00 00 00 r0 = *(u64 *)(r1 + 2) 5: 95 00 00 00 00 00 00 00 exit Fix it by changing the offset check to 16-bit. Patch by Nadav Amit <nadav.amit@gmail.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Differential Revision: https://reviews.llvm.org/D32055 llvm-svn: 300269
* Fix -Wunused-value warningReid Kleckner2017-04-131-6/+6
| | | | llvm-svn: 300254
* [AMDGPU] Combine DS operations with offsets bigger than byteStanislav Mekhanoshin2017-04-131-150/+166
| | | | | | | | | In many cases ds operations can be combined even if offsets do not fit into 8 bit encoding. What it takes is to adjust base address. Differential Revision: https://reviews.llvm.org/D31993 llvm-svn: 300227
* [Hexagon] Implement HexagonTargetLowering::CanLowerReturnKrzysztof Parzyszek2017-04-132-12/+18
| | | | | | | | Patch by Michael Wu. Differential Revision: https://reviews.llvm.org/D32000 llvm-svn: 300199
* [Hexagon] Fix "LowerFormalArguments emitted a value with the wrong type!" ↵Krzysztof Parzyszek2017-04-131-1/+1
| | | | | | | | | | assertion Patch by Michael Wu. Differential Revision: https://reviews.llvm.org/D31999 llvm-svn: 300198
* Use methods to access data stored with frame instructionsSerge Pavlov2017-04-135-20/+28
| | | | | | | | | | | | | Instructions CALLSEQ_START..CALLSEQ_END and their target dependent counterparts keep data like frame size, stack adjustment etc. These data are accessed by getOperand using hard coded indices. It is error prone way. This change implements the access by special methods, which improve readability and allow changing data representation without massive changes of index values. Differential Revision: https://reviews.llvm.org/D31953 llvm-svn: 300196
* [X86] Added missing mayLoad/mayStore attributes to some X86 instructions.Ayman Musa2017-04-137-19/+55
| | | | | | | | | Throughout the effort of automatically generating the X86 memory folding tables these missing information were encountered. This is a preparation work for a future patch including the automation of these tables. Differential Revision: https://reviews.llvm.org/D31714 llvm-svn: 300190
* [X86] Change instructions names to keep consistency with the naming ↵Ayman Musa2017-04-131-2/+2
| | | | | | | | convention. NFC Differential Revision: https://reviews.llvm.org/D31743 llvm-svn: 300184
* [IR] Take func, ret, and arg attrs separately in AttributeList::getReid Kleckner2017-04-131-8/+6
| | | | | | | | | | | | | This seems like a much more natural API, based on Derek Schuff's comments on r300015. It further hides the implementation detail of AttributeList that function attributes come last and appear at index ~0U, which is easy for the user to screw up. git diff says it saves code as well: 97 insertions(+), 137 deletions(-) This also makes it easier to change the implementation, which I want to do next. llvm-svn: 300153
* AMDGPU : Fix common dominator of two incoming blocks terminates with uniform ↵Wei Ding2017-04-121-2/+24
| | | | | | | | branch issue. Differential Revision: http://reviews.llvm.org/D31350 llvm-svn: 300142
* AMDGPU: Fix invalid copies when copying i1 to phys regMatt Arsenault2017-04-123-4/+30
| | | | | | | Insert a VReg_1 virtual register so the i1 workaround pass can handle it. llvm-svn: 300113
* [AMDGPU] Generate range metadata for workitem idStanislav Mekhanoshin2017-04-126-24/+118
| | | | | | | | | If workgroup size is known inform llvm about range returned by local id and local size queries. Differential Revision: https://reviews.llvm.org/D31804 llvm-svn: 300102
* [AMDGPU][MC] Added support for several VI-specific opcodes (s_wakeup, etc)Dmitry Preobrazhensky2017-04-123-1/+37
| | | | | | | | | | | | | | | | | | | | | | Added support for VI: - s_endpgm_saved - s_wakeup - s_rfe_restore_b64 - v_perm_b32 Enabled for VI: - v_mov_fed_b32 - v_mov_fed_b32_e64 See bug 32593: https://bugs.llvm.org//show_bug.cgi?id=32593 Reviewers: artem.tamazov, vpykhtin Differential Revision: https://reviews.llvm.org/D31931 llvm-svn: 300076
* [AMDGPU][MC] Corrected parsing of v_cmp_class* and v_cmpx_class*Dmitry Preobrazhensky2017-04-122-2/+4
| | | | | | | | | | Fixed bug 32565: https://bugs.llvm.org//show_bug.cgi?id=32565 Reviewers: vpykhtin Differential Revision: https://reviews.llvm.org/D31820 llvm-svn: 300073
* [WebAssembly] Update use of Attributes after r299875Derek Schuff2017-04-121-17/+14
| | | | | | This fixes the failing WebAssemblyLowerEmscriptenEHSjLj tests llvm-svn: 300072
* [AMDGPU][MC] Corrected encoding of V_MQSAD_U32_U8 for CIDmitry Preobrazhensky2017-04-121-1/+1
| | | | | | | | | | | | Corrected encoding of V_MQSAD_U32_U8 for CI See bug 32552: https://bugs.llvm.org//show_bug.cgi?id=32552 Reviewers: vpykhtin Differential Revision: https://reviews.llvm.org/D31810 llvm-svn: 300070
* Fix the bootstrap failure caused by r299986.Easwaran Raman2017-04-121-0/+4
| | | | llvm-svn: 300069
* [AMDGPU][MC] Corrected ds_wrxchg2* to support two offsetsDmitry Preobrazhensky2017-04-121-7/+21
| | | | | | | | | | Fixed bug 28227: https://bugs.llvm.org//show_bug.cgi?id=28227 Reviewers: vpykhtin Differential Revision: https://reviews.llvm.org/D31808 llvm-svn: 300066
* [GlobalIsel][X86] support G_CONSTANT selection.Igor Breger2017-04-121-1/+52
| | | | | | | | | | | | | | Summary: [GlobalISel][X86] support G_CONSTANT selection. Add regbank select tests. Reviewers: zvi, guyblank Reviewed By: guyblank Subscribers: llvm-commits, dberris, rovka, kristof.beyls Differential Revision: https://reviews.llvm.org/D31974 llvm-svn: 300057
* [LoopVectorizer, TTI] New method supportsEfficientVectorElementLoadStore()Jonas Paulsson2017-04-121-0/+1
| | | | | | | | | | | | | | | | | | | Since SystemZ supports vector element load/store instructions, there is no need for extracts/inserts if a vector load/store gets scalarized. This patch lets Target specify that it supports such instructions by means of a new TTI hook that defaults to false. The use for this is in the LoopVectorizer getScalarizationOverhead() method, which will with this patch produce a smaller sum for a vector load/store on SystemZ. New test: test/Transforms/LoopVectorize/SystemZ/load-store-scalarization-cost.ll Review: Adam Nemet https://reviews.llvm.org/D30680 llvm-svn: 300056
* [AMDGPU][MC] Corrected src0 size for s_cbranch_joinDmitry Preobrazhensky2017-04-121-1/+7
| | | | | | | | | | Fix for bug 28159: https://bugs.llvm.org//show_bug.cgi?id=28159 Reviewers: vpykhtin, arsenm Differential Revision: https://reviews.llvm.org/D31595 llvm-svn: 300055
* [SystemZ] TargetTransformInfo cost functions implemented.Jonas Paulsson2017-04-1211-32/+620
| | | | | | | | | | | | | | | | getArithmeticInstrCost(), getShuffleCost(), getCastInstrCost(), getCmpSelInstrCost(), getVectorInstrCost(), getMemoryOpCost(), getInterleavedMemoryOpCost() implemented. Interleaved access vectorization enabled. BasicTTIImpl::getCastInstrCost() improved to check for legal extending loads, in which case the cost of the z/sext instruction becomes 0. Review: Ulrich Weigand, Renato Golin. https://reviews.llvm.org/D29631 llvm-svn: 300052
* [AMDGPU] SDWA: make pass globalSam Kolton2017-04-121-183/+175
| | | | | | | | | | | | Summary: Remove checks for basic blocks. Reviewers: vpykhtin, rampitec, arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye Differential Revision: https://reviews.llvm.org/D31935 llvm-svn: 300040
* [AMDGPU] Add a new pass to insert waitcnts. Leave under an option for testing.Kannan Narayanan2017-04-125-1/+1881
| | | | | | Based on comments in https://reviews.llvm.org/D31161. llvm-svn: 300023
* Revert "[WebAssembly] Update use of Attributes after r299875"Derek Schuff2017-04-121-14/+17
| | | | | | | | This reverts commit 2a0eb61dcccb15058d5b2a572bb3da0cf47fd550, r300015 I raced with rnk on the commit. llvm-svn: 300016
* [WebAssembly] Update use of Attributes after r299875Derek Schuff2017-04-121-17/+14
| | | | | | This fixes the failing WebAssemblyLowerEmscriptenEHSjLj tests llvm-svn: 300015
* AMDGPU: Insert wait at start of callee functionsMatt Arsenault2017-04-111-0/+14
| | | | llvm-svn: 300000
* AMDGPU: Refactor SIMachineFunctionInfo slightlyMatt Arsenault2017-04-113-16/+38
| | | | | | Prepare for handling non-entry functions. llvm-svn: 299999
* AMDGPU: Refactor argument loweringMatt Arsenault2017-04-1110-276/+375
| | | | | | | Split into smaller functions and prepare for handling non-entry functions. llvm-svn: 299998
* AMDGPU: Fix folding reg_sequence into copy to phys regMatt Arsenault2017-04-111-0/+4
| | | | | | | This was producing an illegal reg_sequence defining a physical register with virtual register inputs. llvm-svn: 299997
* AMDGPU: Prune unecessary includeMatt Arsenault2017-04-111-2/+0
| | | | llvm-svn: 299996
* [AArch64] Fix scheduling info for INS(vector, general) instruction.Balaram Makam2017-04-112-1/+6
| | | | llvm-svn: 299994
* [x86] Relax the check in areLoadsFromSameBasePtrEaswaran Raman2017-04-111-19/+16
| | | | | | | | | Check if the scale operand is identical (doesn't have to be 1) and do not check the chaain operand. Differential revision: https://reviews.llvm.org/D31833 llvm-svn: 299986
* [AArch64] Simplify MacroFusionEvandro Menezes2017-04-111-79/+89
| | | | | | | | | | | | This patch assumes that the dependents to be scanned for the ExitSU are its predecessors; otherwise, the successors of the instr are scanned. Furthermore, sometimes the ExitSU was being fused twice, since it may be fused once when scanning the successors from the beginning of the BB and then again when scanning the predecessors of ExitSU. Thus, when scanning the successors of an instr, skip the ExitSU. llvm-svn: 299974
* [X86] Create the correct ADC/SBB SDNode when lowering add.Davide Italiano2017-04-111-2/+4
| | | | | | Differential Revision: https://reviews.llvm.org/D31911 llvm-svn: 299973
* Fix spelling compliment->complement. Mostly refering to 2s complement. NFCCraig Topper2017-04-113-4/+4
| | | | llvm-svn: 299970
* [AMDGPU] Add A5 to data layout for amdgiz environmentYaxun Liu2017-04-111-1/+1
| | | | | | Differential Revision: https://reviews.llvm.org/D31589 llvm-svn: 299964
* Module::getOrInsertFunction is using C-style vararg instead of variadic ↵Serge Guelton2017-04-113-4/+3
| | | | | | | | | | | templates. From a user prospective, it forces the use of an annoying nullptr to mark the end of the vararg, and there's not type checking on the arguments. The variadic template is an obvious solution to both issues. Differential Revision: https://reviews.llvm.org/D31070 llvm-svn: 299949
* Remove unused functions. Remove static qualifier from functions in header ↵Vassil Vassilev2017-04-111-10/+0
| | | | | | files. NFC. llvm-svn: 299947
* [AVR] Migrate to new MCAsmBackend applyFixupJonathan Roelofs2017-04-112-2/+2
| | | | | | | | https://reviews.llvm.org/D31875 Patch by Leslie Zhai! llvm-svn: 299946
* [ARM] Refactor Thumb2 sat instructionsSam Parker2017-04-111-48/+30
| | | | | | | | | Refactor the USAT, SSAT, USAT16 and SSAT16 instruction descriptions for Thumb2. Differential Revision: https://reviews.llvm.org/D31933 llvm-svn: 299945
* GlobalISel: Allow legalizing G_FADD to a libcallDiana Picus2017-04-111-0/+3
| | | | | | | | | Use the same handling in the generic legalizer code as for the other libcalls (G_FREM, G_FPOW). Enable it on ARM for float and double so we can test it. llvm-svn: 299931
* Revert "Turn some C-style vararg into variadic templates"Diana Picus2017-04-113-3/+4
| | | | | | | This reverts commit r299925 because it broke the buildbots. See e.g. http://lab.llvm.org:8011/builders/clang-cmake-armv7-a15/builds/6008 llvm-svn: 299928
* Turn some C-style vararg into variadic templatesSerge Guelton2017-04-113-4/+3
| | | | | | | | | | | | Module::getOrInsertFunction is using C-style vararg instead of variadic templates. From a user prospective, it forces the use of an annoying nullptr to mark the end of the vararg, and there's not type checking on the arguments. The variadic template is an obvious solution to both issues. llvm-svn: 299925
* [PowerPC] multiply-with-overflow might use the CTR registerHal Finkel2017-04-111-9/+11
| | | | | | | | | | | | Check the legality of ISD::[US]MULO to see whether Intrinsic::[us]mul_with_overflow will legalize into a function call (and, thus, will use the CTR register). Fixes PR32485. Patch by Tim Neumann! Differential Revision: https://reviews.llvm.org/D31790 llvm-svn: 299910
* Allow DataLayout to specify addrspace for allocas.Matt Arsenault2017-04-101-1/+2
| | | | | | | | | | | | | | | | | | | | | | | LLVM makes several assumptions about address space 0. However, alloca is presently constrained to always return this address space. There's no real way to avoid using alloca, so without this there is no way to opt out of these assumptions. The problematic assumptions include: - That the pointer size used for the stack is the same size as the code size pointer, which is also the maximum sized pointer. - That 0 is an invalid, non-dereferencable pointer value. These are problems for AMDGPU because alloca is used to implement the private address space, which uses a 32-bit index as the pointer value. Other pointers are 64-bit and behave more like LLVM's notion of generic address space. By changing the address space used for allocas, we can change our generic pointer type to be LLVM's generic pointer type which does have similar properties. llvm-svn: 299888
OpenPOWER on IntegriCloud