path: root/llvm/lib/Target
Commit message | Author | Age | Files | Lines
* [AArch64] Add missing schedinfo, check completeness for Falkor. | Balaram Makam | 2017-04-04 | 1 | -10/+17
  llvm-svn: 299468
* [AArch64][Fuchsia] Allow -mcmodel=kernel for --target=aarch64-fuchsia | Petr Hosek | 2017-04-04 | 6 | -12/+34
  This mode is just like -mcmodel=small except that it moves the thread pointer from TPIDR_EL0 to TPIDR_EL1.
  Patch by Roland McGrath.
  Differential Revision: https://reviews.llvm.org/D31624
  llvm-svn: 299462
* [AArch64] Refine Falkor Machine Model - Part 2 | Balaram Makam | 2017-04-04 | 3 | -92/+454
  llvm-svn: 299456
* [x86] remove dead select-of-constants transform; NFCI | Sanjay Patel | 2017-04-04 | 1 | -12/+0
  https://reviews.llvm.org/D30537 / https://reviews.llvm.org/rL296977 added these transforms and other related transforms to the generic DAGCombiner (with a hook that x86 sets to true), so these patterns should not exist by the time we reach the target-specific combiner hook.
  llvm-svn: 299448
* AMDGPU: Remove legacy export intrinsic | Matt Arsenault | 2017-04-04 | 2 | -36/+0
  llvm-svn: 299444
* AMDGPU: Remove legacy image intrinsics | Matt Arsenault | 2017-04-04 | 2 | -217/+0
  llvm-svn: 299443
* [X86][MS-compatability]Allow named synonymous for MS-assembly operators | Coby Tayree | 2017-04-04 | 1 | -0/+27
  This patch enhances X86AsmParser's immediate expression parsing to accept named synonyms for selected binary/unary bitwise operators: {and,shl,shr,or,xor,not}, ultimately achieving better MS compatibility.
  MASM reference: https://msdn.microsoft.com/en-us/library/94b6khh4.aspx
  Differential Revision: D31277
  llvm-svn: 299439
* Strip trailing whitespace | Simon Pilgrim | 2017-04-04 | 1 | -4/+4
  llvm-svn: 299438
* [X86][LLVM] Converting __mm{|256|512}_movm_epi{8|16|32|64} LLVMIR call into generic intrinsics. | Michael Zuckerman | 2017-04-04 | 1 | -12/+0
  This patch is part one of two reviews, one for clang and the other for LLVM. The patch deletes the back-end intrinsics and adds support for them in the auto upgrade.
  Differential Revision: https://reviews.llvm.org/D31393
  llvm-svn: 299432
* [tablegen][globalisel] Add support for nested instruction matching. | Daniel Sanders | 2017-04-04 | 1 | -36/+0
  Summary: Lift the restrictions that prevented the tree walking introduced in the previous change and add support for patterns like:
    (G_ADD (G_MUL (G_SEXT $src1), (G_SEXT $src2)), $src3) -> SMADDWrrr $dst, $src1, $src2, $src3
  Also adds support for G_SEXT and G_ZEXT to support these cases.
  One particular aspect of this that I should draw attention to is that I've tried to be overly conservative in determining the safety of matches that involve non-adjacent instructions and multiple basic blocks. This is intended to be used as a cheap initial check and we may add a more expensive check in the future. The current rules are:
  * Reject if any instruction may load/store (we'd need to check for intervening memory operations).
  * Reject if any instruction has implicit operands.
  * Reject if any instruction has unmodelled side-effects.
  See isObviouslySafeToFold().
  Reviewers: t.p.northover, javed.absar, qcolombet, aditya_nandakumar, ab, rovka
  Reviewed By: ab
  Subscribers: igorb, dberris, llvm-commits, kristof.beyls
  Differential Revision: https://reviews.llvm.org/D30539
  llvm-svn: 299430
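The three rejection rules above can be sketched as a simple predicate. This is a minimal illustration, not the tablegen-generated matcher: the `Instr` record and its boolean fields are hypothetical stand-ins for the properties a real machine instruction would be queried for.

```python
from dataclasses import dataclass


@dataclass
class Instr:
    """Hypothetical stand-in for a machine instruction (not LLVM's MachineInstr)."""
    opcode: str
    may_load_or_store: bool = False
    has_implicit_operands: bool = False
    has_unmodelled_side_effects: bool = False


def is_obviously_safe_to_fold(instr: Instr) -> bool:
    """Cheap, conservative check modelled on the three rules in the commit message."""
    if instr.may_load_or_store:
        return False  # would need a scan for intervening memory operations
    if instr.has_implicit_operands:
        return False
    if instr.has_unmodelled_side_effects:
        return False
    return True


def can_fold_pattern(instrs) -> bool:
    """A nested match is accepted only if every matched instruction passes the check."""
    return all(is_obviously_safe_to_fold(i) for i in instrs)
```

With this model, a G_MUL fed by two G_SEXT instructions passes, while any pattern containing a load is rejected outright rather than analysed further.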
* [mips] Deal with empty blocks in the mips hazard scheduler | Simon Dardis | 2017-04-04 | 1 | -11/+14
  This patch teaches the hazard scheduler how to handle empty blocks when searching for the next real instruction while dealing with forbidden slots.
  Reviewers: slthakur
  Differential Revision: https://reviews.llvm.org/D31293
  llvm-svn: 299427
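The search described above can be sketched as follows. This is a simplified model rather than the pass itself: blocks are plain lists in layout order, only fall-through is followed, and the `is_real` predicate is a hypothetical hook for skipping non-emitting instructions.

```python
def next_real_instruction(blocks, block_idx, instr_idx, is_real=lambda instr: True):
    """Return the first real instruction after blocks[block_idx][instr_idx].

    Empty blocks contribute nothing to the scan, so they are skipped
    instead of ending the search early. Returns None if nothing follows.
    """
    rest_of_block = blocks[block_idx][instr_idx + 1:]
    for block in [rest_of_block] + blocks[block_idx + 1:]:
        for instr in block:
            if is_real(instr):
                return instr
    return None
```

For example, with `blocks = [["beqzc"], [], [], ["jr"]]`, the instruction following the `beqzc` is found in the fourth block even though two empty blocks sit in between.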
* [X86] Add 64 bit pattern matching for PSADBW | Oren Ben Simhon | 2017-04-04 | 1 | -13/+41
  The PSADBW pattern currently supports only the 32-bit IR pattern and only the GLT (greater than) comparison. The patch extends the pattern to also catch the 64-bit IR pattern and to include all other comparison types (not only GLT).
  Differential Revision: https://reviews.llvm.org/D31577
  llvm-svn: 299425
* Reland r298901 with modifications (reverted in r298932) | Weiming Zhao | 2017-04-03 | 1 | -15/+71
  Don't emit mapping symbols for sections that contain only data.
  Summary: Don't emit mapping symbols for sections that contain only data.
  Reviewers: rengolin, weimingz, kparzysz, t.p.northover, peter.smith
  Reviewed By: t.p.northover
  Patched by Shankar Easwaran <shankare@codeaurora.org>
  Subscribers: alekseyshl, t.p.northover, llvm-commits
  Differential Revision: https://reviews.llvm.org/D30724
  llvm-svn: 299392
* AMDGPU: Remove llvm.SI.vs.load.input | Matt Arsenault | 2017-04-03 | 6 | -19/+0
  llvm-svn: 299391
* [X86][SSE]] Lower BUILD_VECTOR with repeated elts as BUILD_VECTOR + VECTOR_SHUFFLE | Simon Pilgrim | 2017-04-03 | 1 | -1/+55
  It can be costly to transfer from the GPRs to the XMM registers, and doing so can prevent loads from merging.
  This patch splits vXi16/vXi32/vXi64 BUILD_VECTORS that use the same operand in multiple elements into a BUILD_VECTOR with only a single insertion of each of those elements and then performs a unary shuffle to duplicate the values.
  There are a couple of minor regressions this patch unearths due to some missing MOVDDUP/BROADCAST folds that I will address in a future patch.
  Note: Now that vector shuffle lowering and combining is pretty good we should be reusing that instead of duplicating so much in LowerBUILD_VECTOR - this is the first of several patches to address this.
  Differential Revision: https://reviews.llvm.org/D31373
  llvm-svn: 299387
* x86 interrupt calling convention: re-align stack pointer on 64-bit if an error code was pushed | Amjad Aboud | 2017-04-03 | 2 | -2/+18
  The x86_64 ABI requires that the stack is 16 byte aligned on function calls. Thus, the 8-byte error code, which is pushed by the CPU for certain exceptions, leads to a misaligned stack. This results in bugs such as Bug 26413, where misaligned movaps instructions are generated.
  This commit fixes the misalignment by adjusting the stack pointer in these cases. The adjustment is done at the beginning of the prologue generation by subtracting another 8 bytes from the stack pointer. These additional bytes are popped again in the function epilogue.
  Fixes Bug 26413
  Patch by Philipp Oppermann.
  Differential Revision: https://reviews.llvm.org/D30049
  llvm-svn: 299383
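The alignment arithmetic behind this fix can be worked through with a small model. It is a sketch under stated assumptions: the CPU is taken to align the stack to 16 bytes before pushing a 5-slot interrupt frame in 64-bit mode, and the function names are made up for illustration.

```python
INTERRUPT_FRAME_BYTES = 5 * 8  # SS, RSP, RFLAGS, CS, RIP (assumed 64-bit frame layout)
ERROR_CODE_BYTES = 8


def rsp_at_handler_entry(rsp_before: int, has_error_code: bool) -> int:
    """Model the stack pointer when the handler's first instruction runs."""
    rsp = rsp_before & ~0xF          # assumed: CPU aligns to 16 bytes first
    rsp -= INTERRUPT_FRAME_BYTES
    if has_error_code:
        rsp -= ERROR_CODE_BYTES      # pushed by the CPU for certain exceptions
    return rsp


def prologue_adjustment(has_error_code: bool) -> int:
    """Extra bytes subtracted at the start of the prologue, per the commit message."""
    return 8 if has_error_code else 0


def rsp_after_adjustment(rsp_before: int, has_error_code: bool) -> int:
    return rsp_at_handler_entry(rsp_before, has_error_code) - prologue_adjustment(has_error_code)
```

In this model a handler without an error code starts with `rsp % 16 == 8`, the same state an ordinary function sees after a `call` pushes its return address. With an error code the entry value is `rsp % 16 == 0`, so frame offsets computed for the ordinary case end up 8 bytes off; subtracting another 8 bytes restores `rsp % 16 == 8` in both cases.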
* [CodeGenPrep] move aarch64-type-promotion to CGP | Jun Bum Lim | 2017-04-03 | 3 | -1/+37
  Summary: Move the aarch64-type-promotion pass within the existing type promotion framework in CGP. This change also supports forking sexts when a new sext is required for promotion. Note that this change is based on D27853 and I am submitting it early to provide a better idea of D27853.
  Reviewers: jmolloy, mcrosier, javed.absar, qcolombet
  Reviewed By: qcolombet
  Subscribers: llvm-commits, aemerson, rengolin, mcrosier
  Differential Revision: https://reviews.llvm.org/D28680
  llvm-svn: 299379
* AMDGPU: Remove legacy bfe intrinsics | Matt Arsenault | 2017-04-03 | 5 | -37/+14
  llvm-svn: 299372
* [Hexagon] Factor out some common code in HexagonEarlyIfConv.cpp, NFC | Krzysztof Parzyszek | 2017-04-03 | 1 | -12/+10
  llvm-svn: 299367
* [APInt] Move isMask and isShiftedMask out of APIntOps and into the APInt class. Implement them without memory allocation for multiword | Craig Topper | 2017-04-03 | 2 | -4/+4
  This moves the isMask and isShiftedMask functions to be class methods. They now use the MathExtras.h function for single word size and leading/trailing zeros/ones or countPopulation for the multiword size. The previous implementation made multiple temporary memory allocations to do the bitwise arithmetic operations to match the MathExtras.h implementation.
  Differential Revision: https://reviews.llvm.org/D31565
  llvm-svn: 299362
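The counting approach mentioned for the multiword case can be sketched as below. This is an illustrative model on Python integers, not the APInt code; the helper names are invented here, and `value` is assumed to satisfy `0 <= value < 2**width`.

```python
def count_leading_zeros(value: int, width: int) -> int:
    return width - value.bit_length()


def count_trailing_zeros(value: int, width: int) -> int:
    if value == 0:
        return width
    return (value & -value).bit_length() - 1  # index of the lowest set bit


def count_trailing_ones(value: int, width: int) -> int:
    n = 0
    while n < width and (value >> n) & 1:
        n += 1
    return n


def is_mask(value: int, width: int) -> bool:
    """True if value is a non-empty run of ones starting at bit 0."""
    ones = count_trailing_ones(value, width)
    return ones > 0 and ones + count_leading_zeros(value, width) == width


def is_shifted_mask(value: int, width: int) -> bool:
    """True if value is a single non-empty contiguous run of ones."""
    ones = bin(value).count("1")
    zeros = count_leading_zeros(value, width) + count_trailing_zeros(value, width)
    return ones > 0 and ones + zeros == width
```

Both checks only count bits and never build an intermediate value such as `value + 1` or `value - 1`, which is the property that lets a wide-integer implementation avoid temporary allocations.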
* ARMAsmParser: clean up of isImmediate functions | Sjoerd Meijer | 2017-04-03 | 5 | -238/+139
  - We are now using immediate AsmOperands so that the range check functions are tablegen'ed.
  - A big bonus is that error messages become much more accurate: instead of a useless "invalid operand" error message, it will now say that the immediate operand must be in the range [x,y], which is why regression tests needed updating.
  More tablegen operand descriptions could probably benefit from using immediateAsmOperand, but this is a good first step to get rid of most of the nearly identical range check functions. I will address the remaining immediate operands in the next clean-ups.
  Differential Revision: https://reviews.llvm.org/D31333
  llvm-svn: 299358
* [X86][MMX] Improve support for folding fptosi from XMM to MMX | Simon Pilgrim | 2017-04-02 | 1 | -0/+10
  llvm-svn: 299338
* [X86][MMX] Simplify tablegen patterns by always combining MOVDQ2Q from v2i64 | Simon Pilgrim | 2017-04-02 | 1 | -1/+2
  llvm-svn: 299336
* [X86][MMX] Added support for subvector extraction to MMX register | Simon Pilgrim | 2017-04-02 | 1 | -2/+4
  llvm-svn: 299335
* [AMDGPU] Garbage collect now unused dead code. NFCI. | Davide Italiano | 2017-04-01 | 1 | -10/+0
  llvm-svn: 299310
* Revert "Feature generic option to setup start/stop-after/before" | Quentin Colombet | 2017-04-01 | 1 | -61/+0
  This reverts commit r299282. Didn't intend to commit this :(
  llvm-svn: 299288
* Revert "Instrument SDISel C++ patterns" | Quentin Colombet | 2017-04-01 | 2 | -369/+355
  This reverts commit r299284. Didn't intend to commit this :(
  llvm-svn: 299286
* Instrument SDISel C++ patterns | Quentin Colombet | 2017-04-01 | 2 | -355/+369
  llvm-svn: 299284
* Feature generic option to setup start/stop-after/before | Quentin Colombet | 2017-04-01 | 1 | -0/+61
  This patch refactors the code used in llc such that all the users of the addPassesToEmitFile API have access to a homogeneous way of handling start/stop-after/before options right out of the box. Previously each user would have needed to duplicate this logic and set up its own options.
  NFC
  llvm-svn: 299282
* Reduce the number of times we query the subtarget for the same information. | Eric Christopher | 2017-03-31 | 1 | -5/+4
  llvm-svn: 299278
* Small cleanup to remove extraneous cast. | Eric Christopher | 2017-03-31 | 1 | -2/+1
  llvm-svn: 299277
* [Hexagon] Remove unused variables | Krzysztof Parzyszek | 2017-03-31 | 3 | -18/+4
  Found by PVS-Studio. Fixes llvm.org/PR31676.
  llvm-svn: 299262
* [Hexagon] Fix typo in HexagonEarlyIfCConv.cpp | Krzysztof Parzyszek | 2017-03-31 | 1 | -1/+1
  Found by PVS-Studio. Fixes llvm.org/PR32480.
  llvm-svn: 299258
* [AMDGPU] Remove assumption that vector and scalar types do not alias | Stanislav Mekhanoshin | 2017-03-31 | 1 | -8/+0
  Differential Revision: https://reviews.llvm.org/D31547
  llvm-svn: 299250
* AMDGPU: Remove unnecessary ands when f16 is legal | Matt Arsenault | 2017-03-31 | 6 | -2/+57
  Add a new node to act as a fancy bitcast from f16 operations to i32 that implicitly zero the high 16-bits of the result.
  Alternatively could try making v2f16 legal and canonicalizing on build_vectors.
  llvm-svn: 299246
* AMDGPU/R600: Fix amdgpu alias analysis pass. | Jan Vesely | 2017-03-31 | 2 | -5/+11
  R600 uses higher AS number to access kernel parameters
  Fixes: r298846
  Differential Revision: https://reviews.llvm.org/D31520
  llvm-svn: 299245
* [AArch64] Add new subtarget feature to fold LSL into address mode. | Balaram Makam | 2017-03-31 | 3 | -5/+53
  Summary: This feature enables folding of logical shift operations of up to 3 places into addressing mode on Kryo and Falkor that have a fastpath LSL.
  Reviewers: mcrosier, rengolin, t.p.northover
  Subscribers: junbuml, gberry, llvm-commits, aemerson
  Differential Revision: https://reviews.llvm.org/D31113
  llvm-svn: 299240
* [AVX-512] Update lowering for gather/scatter prefetch intrinsics to match the immediate encodings the frontend uses based on the _MM_HINT_T0/T1 constant values in clang's headers. | Craig Topper | 2017-03-31 | 1 | -3/+3
  Our _MM_HINT_T0/T1 constant values are 3/2 which matches gcc, but not icc or Intel documentation. Interestingly gcc had this same bug on their implementation of the gather/scatter builtins at one point too.
  Fixes PR32411.
  llvm-svn: 299234
* [mips][msa] Prevent output operand from commuting for dpadd_[su].df ins | Petar Jovanovic | 2017-03-31 | 2 | -0/+31
  Implementation of TargetInstrInfo::findCommutedOpIndices for MIPS target, restricting commutativity to second and third operand only for dpadd_[su].df instructions therein.
  Prior to this change, there were cases where the vector that is to be added to the dot product of the other two could take a position other than the first one in the instruction, generating false output in the destination vector.
  Such behavior has been noticed in the two functions generating v2i64 output values so far. Other ones may exhibit such behavior as well, just not for the vector operands which are present in the test at the moment.
  Tests altered so that the function's first operand is a constant splat so that it can be loaded with a ldi instruction, since that is the case in which the erroneous instruction operand placement has occurred. We check that the register which is present in the ldi instruction is placed as the first operand in the corresponding dpadd instruction.
  Patch by Stefan Maksimovic.
  Differential Revision: https://reviews.llvm.org/D30827
  llvm-svn: 299223
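A toy model shows why only the two multiplied operands may be swapped. It is a deliberately reduced sketch: a single 32-bit lane of an unsigned dot-product-and-add, with the two 16-bit halves of each source multiplied pairwise. The function name and the single-lane simplification are assumptions made for illustration.

```python
MASK32 = 0xFFFFFFFF


def dpadd_u_lane(wd: int, ws: int, wt: int) -> int:
    """One 32-bit lane: wd + ws.lo * wt.lo + ws.hi * wt.hi, wrapped to 32 bits."""
    lo = (ws & 0xFFFF) * (wt & 0xFFFF)
    hi = (ws >> 16) * (wt >> 16)
    return (wd + lo + hi) & MASK32


acc, a, b = 5, 0x00020003, 0x00040001

# Swapping the two multiplied operands leaves the result unchanged.
same = dpadd_u_lane(acc, a, b) == dpadd_u_lane(acc, b, a)

# Moving the accumulator into a multiplied position changes the result.
different = dpadd_u_lane(acc, a, b) != dpadd_u_lane(a, acc, b)
```

Here `same` and `different` both evaluate to `True`: the accumulator is added once while the other two operands are multiplied, so it cannot trade places with either of them.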
* [SystemZ] Make sure of correct regclasses in insertSelect() | Jonas Paulsson | 2017-03-31 | 1 | -0/+6
  Since LOCR only accepts GR32 virtual registers, its operands must be copied into this regclass in insertSelect(), when an LOCR is built. Otherwise, the case where the source operand was GRX32 will produce invalid IR.
  Review: Ulrich Weigand
  llvm-svn: 299220
* [DAGCombiner] Add vector demanded elements support to ComputeNumSignBits | Simon Pilgrim | 2017-03-31 | 4 | -3/+7
  Currently ComputeNumSignBits returns the minimum number of sign bits for all elements of vector data, when we may only be interested in one/some of the elements.
  This patch adds a DemandedElts argument that allows us to specify the elements we actually care about. The original ComputeNumSignBits implementation calls with a DemandedElts demanding all elements to match current behaviour. Scalar types set this to 1.
  I've only added support for BUILD_VECTOR and EXTRACT_VECTOR_ELT so far, all others will default to demanding all elements but can be updated in due course.
  Followup to D25691.
  Differential Revision: https://reviews.llvm.org/D31311
  llvm-svn: 299219
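The effect of a demanded-elements mask on a constant BUILD_VECTOR can be sketched as follows. This is an illustration of the idea only: the function names are invented, the elements are plain Python integers, and returning 1 for an empty mask is this sketch's conservative choice.

```python
def num_sign_bits(value: int, width: int) -> int:
    """Count the leading bits equal to the sign bit (including the sign bit itself)."""
    mask = (1 << width) - 1
    value &= mask
    if value >> (width - 1):
        value ^= mask  # negative: invert so the copies of the sign bit become zeros
    return width - value.bit_length()


def build_vector_num_sign_bits(elts, width: int, demanded_elts: int) -> int:
    """Minimum sign-bit count over only the elements selected by the bitmask."""
    demanded = [e for i, e in enumerate(elts) if (demanded_elts >> i) & 1]
    if not demanded:
        return 1  # conservative answer when no element is demanded
    return min(num_sign_bits(e, width) for e in demanded)
```

For the 16-bit vector `[1, -1, 1000, 3]`, demanding every element (`0b1111`) gives 6 because of the 1000 lane, while demanding only the first two elements (`0b0011`) gives 15. That stronger answer is what the extra argument makes available.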
* [SystemZ] Skip DAGCombining of vector node for older subtargets. | Jonas Paulsson | 2017-03-31 | 1 | -0/+6
  Even on older subtargets that lack vector support, there may be vector values with just one element in the input program. These are converted during DAG legalization to scalar values.
  The pre-legalize SystemZ DAGCombiner methods should in this circumstance not touch these nodes. This patch adds a check for this in SystemZTargetLowering::combineEXTRACT_VECTOR_ELT().
  Review: Ulrich Weigand
  llvm-svn: 299213
* [AMDGPU] SDWA Peephole: improve search for immediates in SDWA patterns | Sam Kolton | 2017-03-31 | 4 | -43/+75
  Previously the compiler often extracted common immediates into a specific register, e.g.:
  ```
  %vreg0 = S_MOV_B32 0xff;
  %vreg2 = V_AND_B32_e32 %vreg0, %vreg1
  %vreg4 = V_AND_B32_e32 %vreg0, %vreg3
  ```
  Because of this the SDWA peephole failed to find an SDWA-convertible pattern. E.g. in the previous example this could be converted into 2 SDWA src operands:
  ```
  SDWA src: %vreg2 src_sel:BYTE_0
  SDWA src: %vreg4 src_sel:BYTE_0
  ```
  With this change the peephole checks whether the operand is either an immediate or a register that is a copy of an immediate.
  llvm-svn: 299202
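The look-through described above can be sketched with a small lookup. This is a simplified model, not the peephole pass: registers are strings, `defs` is a hypothetical map from a register to the `(opcode, source)` pair that defines it, and only a single S_MOV_B32 copy and the `0xff` mask are handled.

```python
def find_immediate(operand, defs):
    """Return the immediate value behind an operand, or None if there is none.

    The operand is either an int (already an immediate) or a register name.
    """
    if isinstance(operand, int):
        return operand
    definition = defs.get(operand)
    if definition is None:
        return None
    opcode, source = definition
    if opcode == "S_MOV_B32" and isinstance(source, int):
        return source  # the register is a plain copy of an immediate
    return None


def and_selects_byte0(lhs, rhs, defs) -> bool:
    """True if one operand of an AND resolves to 0xff, i.e. it keeps only BYTE_0."""
    return 0xFF in (find_immediate(lhs, defs), find_immediate(rhs, defs))
```

With `defs = {"%vreg0": ("S_MOV_B32", 0xff)}`, `and_selects_byte0("%vreg0", "%vreg1", defs)` returns `True`, matching the first V_AND_B32_e32 in the commit message's example.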
* [DAGCombiner] Add vector demanded elements support to computeKnownBitsForTargetNode | Simon Pilgrim | 2017-03-31 | 14 | -3/+15
  Follow up to D25691, this sets up the plumbing necessary to support vector demanded elements support in known bits calculations in target nodes.
  Differential Revision: https://reviews.llvm.org/D31249
  llvm-svn: 299201
* Temporarily revert "[PPC] In PPCBoolRetToInt change the bool value to i64 if the target is ppc64" as it's causing test failures, I've given Carrot a testcase offline. | Eric Christopher | 2017-03-31 | 3 | -37/+19
  This reverts commit r298955.
  llvm-svn: 299153
* [WebAssembly] Initial linking metadata support | Dan Gohman | 2017-03-30 | 6 | -13/+90
  Add support for the new relocations and linking metadata section support in https://github.com/WebAssembly/tool-conventions/blob/master/Linking.md. In particular, this allows LLVM to indicate which variable is the stack pointer, so that it can be linked with other objects.
  This also adds support for emitting type relocations for call_indirect instructions.
  Right now, this is mainly tested by using wabt and hexdump to examine the output on selected testcases. We'll add more tests as the design stabilizes and more of the pieces are in place.
  llvm-svn: 299141
* AMDGPU: Rename isKernel | Matt Arsenault | 2017-03-30 | 3 | -6/+22
  What we really want to do is distinguish functions that may be called by other functions, and graphics shaders are not called kernels.
  llvm-svn: 299140
* AMDGPU: Add all atomicrmw fields to atomic.inc/dec | Matt Arsenault | 2017-03-30 | 1 | -2/+5
  Add scope, order, isVolatile
  llvm-svn: 299122
* [AVX-512] Fix bad comment from r299112. NFC | Craig Topper | 2017-03-30 | 1 | -1/+2
  llvm-svn: 299114
* [AVX-512] Fix another case where fastisel was generating a GR8 to VK1 copy. This time after calls returning i1. | Craig Topper | 2017-03-30 | 1 | -2/+12
  Fixes PR32472.
  llvm-svn: 299112