path: root/llvm/lib/Target
Commit message | Author | Age | Files | Lines
* [Hexagon] Change the vector scaling for vector offsets | Krzysztof Parzyszek | 2017-04-06 | 10 | -397/+456
  Keep full offset value on MI-level instructions, but have it scaled down in the MC-level instructions.
  llvm-svn: 299664
* [AMDGPU] Eliminate barrier if workgroup size is not greater than wavefront size | Stanislav Mekhanoshin | 2017-04-06 | 1 | -0/+11
  If a workgroup size is known to be not greater than the wavefront size, the s_barrier instruction is not needed, since all threads are guaranteed to come to the same point at the same time.
  Differential Revision: https://reviews.llvm.org/D31731
  llvm-svn: 299659
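The condition in the commit above can be sketched as a small predicate. This is only an illustration under stated assumptions, not the code from the patch: the name `barrierIsRedundant` and its parameters are hypothetical, and the workgroup size is assumed to be known at compile time.

```cpp
// Hypothetical helper, not the actual AMDGPU backend code.
// If the whole workgroup fits in a single wavefront, its threads
// run in lockstep, so s_barrier has nothing left to synchronize.
constexpr bool barrierIsRedundant(unsigned MaxWorkGroupSize,
                                  unsigned WavefrontSize) {
  return MaxWorkGroupSize <= WavefrontSize;
}
```

With an assumed wavefront size of 64, a workgroup of 64 threads would not need the barrier, while a workgroup of 256 threads still would.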
* [AMDGPU] Resubmit SDWA peephole: enable by default | Sam Kolton | 2017-04-06 | 2 | -6/+5
  Reviewers: vpykhtin, rampitec, arsenm
  Subscribers: qcolombet, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye
  Differential Revision: https://reviews.llvm.org/D31671
  llvm-svn: 299654
* [globalisel][tablegen] Move <Target>InstructionSelector declarations to anonymous namespaces | Daniel Sanders | 2017-04-06 | 10 | -153/+120
  Summary: This resolves the issue of tablegen-erated includes in the headers for non-GlobalISel builds in a simpler way than before.
  Reviewers: qcolombet, ab
  Reviewed By: ab
  Subscribers: igorb, ab, mgorny, dberris, rovka, llvm-commits, kristof.beyls
  Differential Revision: https://reviews.llvm.org/D30998
  llvm-svn: 299637
* [ARM] Remove a dead ADD during the creation of TBBs | David Green | 2017-04-06 | 1 | -1/+42
  During the optimisation of jump tables in the constant island pass, an extra ADD could be left over, now dead but not removed.
  Differential Revision: https://reviews.llvm.org/D31389
  llvm-svn: 299634
* [X86 TTI] Implement LSV hook | Keno Fischer | 2017-04-05 | 2 | -2/+7
  Summary: LSV wants to know the maximum size that can be loaded to a vector register. On X86, this always matches the maximum register width. Implement this accordingly and add a test to make sure that LSV can vectorize up to the maximum permissible width on X86.
  Reviewers: delena, arsenm
  Reviewed By: arsenm
  Subscribers: wdng, llvm-commits
  Differential Revision: https://reviews.llvm.org/D31504
  llvm-svn: 299589
* Revert r299536. [AMDGPU] SDWA peephole: enable by default. | Ivan Krasin | 2017-04-05 | 1 | -1/+1
  Reason: breaks multiple bots:
  http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fast/builds/3988
  http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-bootstrap/builds/1173
  Original Review URL: https://reviews.llvm.org/D31671
  llvm-svn: 299583
* [AMDGPU][MC] Fix for Bug 28158 + LIT tests | Dmitry Preobrazhensky | 2017-04-05 | 1 | -0/+20
  Added support of the following instructions:
  - s_cbranch_cdbgsys
  - s_cbranch_cdbgsys_and_user
  - s_cbranch_cdbgsys_or_user
  - s_cbranch_cdbguser
  - s_setkill
  Reviewers: vpykhtin
  Differential Revision: https://reviews.llvm.org/D31469
  llvm-svn: 299567
* ARMFrameLowering: Slight cleanups; NFC | Matthias Braun | 2017-04-05 | 1 | -4/+5
  llvm-svn: 299562
* [AMDGPU][MC] Fix for Bug 28167 + LIT tests | Dmitry Preobrazhensky | 2017-04-05 | 1 | -1/+4
  Corrected src0 for v_writelane_b32:
  - Enabled inline constants and literals for SI/CI (VOP2)
  - Enabled inline constants for VI (VOP3)
  Reviewers: vpykhtin, arsenm
  https://reviews.llvm.org/D31463
  llvm-svn: 299555
* [SystemZ] Prevent Merging Bitcast with non-normal loads | Nirav Dave | 2017-04-05 | 1 | -2/+3
  Fixes PR32505.
  Reviewers: uweigand, jonpa
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D31609
  llvm-svn: 299552
* [DAGCombiner] add and use TLI hook to convert and-of-seteq / or-of-setne to bitwise logic+setcc (PR32401) | Sanjay Patel | 2017-04-05 | 3 | -0/+12
  This is a generic combine enabled via target hook to reduce icmp logic as discussed in: https://bugs.llvm.org/show_bug.cgi?id=32401
  It's likely that other targets will want to enable this hook for scalar transforms, and there are probably other patterns that can use bitwise logic to reduce comparisons.
  Note that we are missing an IR canonicalization for these patterns, and we will probably prefer the pair-of-compares form in IR (shorter, more likely to fold).
  Differential Revision: https://reviews.llvm.org/D31483
  llvm-svn: 299542
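The equivalence that lets two equality compares collapse into bitwise logic plus a single compare can be checked on plain integers. The sketch below is an assumption about the shape of the transform, inferred from the commit title; the function names are hypothetical and this is not the DAGCombiner code.

```cpp
#include <cstdint>

// Pair-of-compares form: two setcc results combined with a logic op.
constexpr bool andOfSetEq(uint32_t A, uint32_t B, uint32_t C, uint32_t D) {
  return (A == B) && (C == D);
}

// Bitwise-logic form: A ^ B is zero exactly when A == B, so the OR of
// both XORs is zero exactly when both pairs are equal. One compare remains.
constexpr bool andOfSetEqAsBitwise(uint32_t A, uint32_t B, uint32_t C,
                                   uint32_t D) {
  return ((A ^ B) | (C ^ D)) == 0;
}

// The or-of-setne case is the logical negation of the above.
constexpr bool orOfSetNe(uint32_t A, uint32_t B, uint32_t C, uint32_t D) {
  return (A != B) || (C != D);
}

constexpr bool orOfSetNeAsBitwise(uint32_t A, uint32_t B, uint32_t C,
                                  uint32_t D) {
  return ((A ^ B) | (C ^ D)) != 0;
}
```

Whether the bitwise form is actually cheaper depends on the target, which is presumably why the combine is gated behind a hook.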
* [AMDGPU] SDWA peephole: enable by default | Sam Kolton | 2017-04-05 | 1 | -1/+1
  Reviewers: vpykhtin, rampitec, arsenm
  Subscribers: qcolombet, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye
  Differential Revision: https://reviews.llvm.org/D31671
  llvm-svn: 299536
* Fix WebAssembly after r299529. | Alexander Kornienko | 2017-04-05 | 1 | -4/+4
  llvm-svn: 299535
* [X86][SSE] Renamed combine to make it clear that it only handles the vector shift by immediate opcodes. NFCI | Simon Pilgrim | 2017-04-05 | 1 | -4/+5
  llvm-svn: 299532
* [AArch64] Crypto requires FP. | James Molloy | 2017-04-05 | 1 | -1/+1
  So if FP is disabled, crypto should also be disabled.
  llvm-svn: 299531
* Add MCContext argument to MCAsmBackend::applyFixup for error reporting | Alex Bradbury | 2017-04-05 | 15 | -168/+123
  A number of backends (AArch64, MIPS, ARM) have been using MCContext::reportError to report issues such as out-of-range fixup values in their TgtAsmBackend. This is great, but because MCContext couldn't easily be threaded through to the adjustFixupValue helper function from its usual callsite (applyFixup), these backends ended up adding an MCContext* argument and adding another call to applyFixup to processFixupValue.
  Adding an MCContext parameter to applyFixup makes this unnecessary, and even better - applyFixup can take a reference to MCContext rather than a potentially null pointer.
  Differential Revision: https://reviews.llvm.org/D30264
  llvm-svn: 299529
* [X86] Relax assert in broadcast-of-subvector lowering. | Ahmed Bougacha | 2017-04-05 | 1 | -2/+2
  Before r294774, there was a problem when lowering broadcasts to use 128-bit subvectors. When we looked through a bitcast to find the broadcast input, we'd keep using the original type, so you'd end up with things like:
    (v8f32 (broadcast (v4f32 (extract_subvector (v8i32 V), ...))))
  r294774 fixed it to always emit subvectors with the scalar type of the original source. It also introduced some asserts, to check that we use scalars with the same size, and vectors with the same number of elements.
  The scalar size equality is checked earlier when looking through bitcasts, and is a useful assert. However, the number of elements doesn't have to be identical: we're always going to extract a 128-bit subvector, and we can have different size inputs if we looked through a concat_vector to find a 256-bit source.
  Relax the overzealous assert. Replace it with a check of the original source vector being 256 or 512 bits. If it's 128 bits, we can't extract_subvector from it.
  Fixes PR32371.
  llvm-svn: 299490
* [AArch64] Avoid partial register deps on insertelt of load into lane 0. | Ahmed Bougacha | 2017-04-04 | 1 | -11/+5
  This improves upon r246462: that prevented FMOVs from being emitted for the cross-class INSERT_SUBREGs by disabling the formation of INSERT_SUBREGs of LOAD. But the ld1.s that we started selecting caused us to introduce partial dependencies on the vector register.
  Avoid that by using SCALAR_TO_VECTOR: it's a first-class citizen that is folded away by many patterns, including the scalar LDRS that we want in this case.
  Credit goes to Adam for finding the issue!
  llvm-svn: 299482
* [AArch64] Add missing schedinfo, check completeness for Falkor. | Balaram Makam | 2017-04-04 | 1 | -10/+17
  llvm-svn: 299468
* [AArch64][Fuchsia] Allow -mcmodel=kernel for --target=aarch64-fuchsia | Petr Hosek | 2017-04-04 | 6 | -12/+34
  This mode is just like -mcmodel=small except that it moves the thread pointer from TPIDR_EL0 to TPIDR_EL1.
  Patch by Roland McGrath.
  Differential Revision: https://reviews.llvm.org/D31624
  llvm-svn: 299462
* [AArch64] Refine Falkor Machine Model - Part 2 | Balaram Makam | 2017-04-04 | 3 | -92/+454
  llvm-svn: 299456
* [x86] remove dead select-of-constants transform; NFCI | Sanjay Patel | 2017-04-04 | 1 | -12/+0
  https://reviews.llvm.org/D30537 / https://reviews.llvm.org/rL296977 added these transforms and other related transforms to the generic DAGCombiner (with a hook that x86 sets to true), so these patterns should not exist by the time we reach the target-specific combiner hook.
  llvm-svn: 299448
* AMDGPU: Remove legacy export intrinsic | Matt Arsenault | 2017-04-04 | 2 | -36/+0
  llvm-svn: 299444
* AMDGPU: Remove legacy image intrinsics | Matt Arsenault | 2017-04-04 | 2 | -217/+0
  llvm-svn: 299443
* [X86][MS-compatability]Allow named synonymous for MS-assembly operators | Coby Tayree | 2017-04-04 | 1 | -0/+27
  This patch enhances X86AsmParser's immediate expression parsing abilities to include named synonyms for selected binary/unary bitwise operators: {and,shl,shr,or,xor,not}, ultimately achieving better MS compatibility.
  MASM reference: https://msdn.microsoft.com/en-us/library/94b6khh4.aspx
  Differential Revision: D31277
  llvm-svn: 299439
* Strip trailing whitespace | Simon Pilgrim | 2017-04-04 | 1 | -4/+4
  llvm-svn: 299438
* [X86][LLVM] Converting __mm{|256|512}_movm_epi{8|16|32|64} LLVMIR call into generic intrinsics. | Michael Zuckerman | 2017-04-04 | 1 | -12/+0
  This patch is part one of two reviews, one for clang and the other for LLVM. The patch deletes the back-end intrinsics and adds support for them in the auto upgrade.
  Differential Revision: https://reviews.llvm.org/D31393
  llvm-svn: 299432
* [tablegen][globalisel] Add support for nested instruction matching. | Daniel Sanders | 2017-04-04 | 1 | -36/+0
  Summary: Lift the restrictions that prevented the tree walking introduced in the previous change and add support for patterns like:
    (G_ADD (G_MUL (G_SEXT $src1), (G_SEXT $src2)), $src3) -> SMADDWrrr $dst, $src1, $src2, $src3
  Also adds support for G_SEXT and G_ZEXT to support these cases.
  One particular aspect of this that I should draw attention to is that I've tried to be overly conservative in determining the safety of matches that involve non-adjacent instructions and multiple basic blocks. This is intended to be used as a cheap initial check and we may add a more expensive check in the future. The current rules are:
  * Reject if any instruction may load/store (we'd need to check for intervening memory operations).
  * Reject if any instruction has implicit operands.
  * Reject if any instruction has unmodelled side-effects.
  See isObviouslySafeToFold().
  Reviewers: t.p.northover, javed.absar, qcolombet, aditya_nandakumar, ab, rovka
  Reviewed By: ab
  Subscribers: igorb, dberris, llvm-commits, kristof.beyls
  Differential Revision: https://reviews.llvm.org/D30539
  llvm-svn: 299430
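The three rejection rules listed in the commit above amount to a conjunction of cheap per-instruction checks. The sketch below models them on a stand-in struct. It is an assumption-laden illustration: `InstrProps`, its fields, and `isObviouslySafeToFoldSketch` are hypothetical names, and the real isObviouslySafeToFold() operates on LLVM machine instructions rather than on a struct like this.

```cpp
// Hypothetical stand-in for the instruction properties the check consults.
struct InstrProps {
  bool MayLoadOrStore;           // would require checking intervening memory ops
  bool HasImplicitOperands;      // implicit defs/uses could be clobbered
  bool HasUnmodeledSideEffects;  // effects the matcher cannot reason about
};

// Conservative check: fold only when none of the three hazards is present.
constexpr bool isObviouslySafeToFoldSketch(InstrProps I) {
  return !I.MayLoadOrStore && !I.HasImplicitOperands &&
         !I.HasUnmodeledSideEffects;
}
```

The check is deliberately one-sided: it can reject folds that would have been safe, but under these rules it never accepts an instruction that has one of the listed hazards.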
* [mips] Deal with empty blocks in the mips hazard scheduler | Simon Dardis | 2017-04-04 | 1 | -11/+14
  This patch teaches the hazard scheduler how to handle empty blocks when searching for the next real instruction when dealing with forbidden slots.
  Reviewers: slthakur
  Differential Revision: https://reviews.llvm.org/D31293
  llvm-svn: 299427
* [X86] Add 64 bit pattern matching for PSADBW | Oren Ben Simhon | 2017-04-04 | 1 | -13/+41
  The PSADBW pattern currently supports only the 32 bit IR pattern and only the GLT (greater than) comparison. The patch extends the pattern to also catch the 64 bit IR pattern and to include all other comparison types (not only GLT).
  Differential Revision: https://reviews.llvm.org/D31577
  llvm-svn: 299425
* Reland r298901 with modifications (reverted in r298932) | Weiming Zhao | 2017-04-03 | 1 | -15/+71
  Don't emit Mapping symbols for sections that contain only data.
  Summary: Don't emit mapping symbols for sections that contain only data.
  Reviewers: rengolin, weimingz, kparzysz, t.p.northover, peter.smith
  Reviewed By: t.p.northover
  Patched by Shankar Easwaran <shankare@codeaurora.org>
  Subscribers: alekseyshl, t.p.northover, llvm-commits
  Differential Revision: https://reviews.llvm.org/D30724
  llvm-svn: 299392
* AMDGPU: Remove llvm.SI.vs.load.input | Matt Arsenault | 2017-04-03 | 6 | -19/+0
  llvm-svn: 299391
* [X86][SSE]] Lower BUILD_VECTOR with repeated elts as BUILD_VECTOR + VECTOR_SHUFFLE | Simon Pilgrim | 2017-04-03 | 1 | -1/+55
  It can be costly to transfer from the gprs to the xmm registers and can prevent loads merging.
  This patch splits vXi16/vXi32/vXi64 BUILD_VECTORS that use the same operand in multiple elements into a BUILD_VECTOR with only a single insertion of each of those elements and then performs a unary shuffle to duplicate the values.
  There are a couple of minor regressions this patch unearths due to some missing MOVDDUP/BROADCAST folds that I will address in a future patch.
  Note: Now that vector shuffle lowering and combining is pretty good we should be reusing that instead of duplicating so much in LowerBUILD_VECTOR - this is the first of several patches to address this.
  Differential Revision: https://reviews.llvm.org/D31373
  llvm-svn: 299387
* x86 interrupt calling convention: re-align stack pointer on 64-bit if an error code was pushed | Amjad Aboud | 2017-04-03 | 2 | -2/+18
  The x86_64 ABI requires that the stack is 16 byte aligned on function calls. Thus, the 8-byte error code, which is pushed by the CPU for certain exceptions, leads to a misaligned stack. This results in bugs such as Bug 26413, where misaligned movaps instructions are generated.
  This commit fixes the misalignment by adjusting the stack pointer in these cases. The adjustment is done at the beginning of the prologue generation by subtracting another 8 bytes from the stack pointer. These additional bytes are popped again in the function epilogue.
  Fixes Bug 26413
  Patch by Philipp Oppermann.
  Differential Revision: https://reviews.llvm.org/D30049
  llvm-svn: 299383
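The alignment arithmetic behind this fix can be modelled on plain integers. This is a sketch under stated assumptions rather than the prologue code from the patch: the function names are hypothetical, and the model only assumes the usual x86-64 convention that the stack pointer is 16-byte aligned at a call, so that it sits 8 bytes below a 16-byte boundary on entry.

```cpp
#include <cstdint>

// On a normal x86-64 function entry the return address has just been
// pushed onto a 16-byte aligned stack, so (rsp + 8) is a multiple of 16.
constexpr bool entryMatchesCallAlignment(uint64_t Rsp) {
  return (Rsp + 8) % 16 == 0;
}

// Hypothetical model of the fix: when the CPU pushed an extra 8-byte
// error code, subtract another 8 bytes in the prologue to compensate.
constexpr uint64_t adjustForErrorCode(uint64_t RspAtEntry, bool HasErrorCode) {
  return HasErrorCode ? RspAtEntry - 8 : RspAtEntry;
}
```

The extra 8 bytes have to be added back in the epilogue, which matches the commit's note that they are popped again before returning.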
* [CodeGenPrep] move aarch64-type-promotion to CGP | Jun Bum Lim | 2017-04-03 | 3 | -1/+37
  Summary: Move the aarch64-type-promotion pass within the existing type promotion framework in CGP. This change also supports forking sexts when a new sext is required for promotion. Note that this change is based on D27853 and I am submitting this out early to provide a better idea on D27853.
  Reviewers: jmolloy, mcrosier, javed.absar, qcolombet
  Reviewed By: qcolombet
  Subscribers: llvm-commits, aemerson, rengolin, mcrosier
  Differential Revision: https://reviews.llvm.org/D28680
  llvm-svn: 299379
* AMDGPU: Remove legacy bfe intrinsics | Matt Arsenault | 2017-04-03 | 5 | -37/+14
  llvm-svn: 299372
* [Hexagon] Factor out some common code in HexagonEarlyIfConv.cpp, NFC | Krzysztof Parzyszek | 2017-04-03 | 1 | -12/+10
  llvm-svn: 299367
* [APInt] Move isMask and isShiftedMask out of APIntOps and into the APInt class. Implement them without memory allocation for multiword | Craig Topper | 2017-04-03 | 2 | -4/+4
  This moves the isMask and isShiftedMask functions to be class methods. They now use the MathExtras.h function for single word size and leading/trailing zeros/ones or countPopulation for the multiword size.
  The previous implementation made multiple temporary memory allocations to do the bitwise arithmetic operations to match the MathExtras.h implementation.
  Differential Revision: https://reviews.llvm.org/D31565
  llvm-svn: 299362
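For the single-word case, both predicates reduce to well-known bit tricks on a 64-bit integer. The sketch below illustrates that idea only: the names `isMask64` and `isShiftedMask64` are chosen for this example, and the multiword path described in the commit (counting leading/trailing zeros and ones) is not shown.

```cpp
#include <cstdint>

// True if Value is a non-empty run of ones starting at bit 0, e.g. 0x00FF.
// Adding 1 to such a value clears every set bit, so the AND is zero.
constexpr bool isMask64(uint64_t Value) {
  return Value != 0 && ((Value + 1) & Value) == 0;
}

// True if Value is one contiguous, non-empty run of ones at any position,
// e.g. 0x0FF0. OR-ing in (Value - 1) fills the zeros below the run, which
// turns a shifted mask into a plain mask.
constexpr bool isShiftedMask64(uint64_t Value) {
  return Value != 0 && isMask64((Value - 1) | Value);
}
```

Neither function allocates, which is the property the commit is after for the single-word path.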
* ARMAsmParser: clean up of isImmediate functions | Sjoerd Meijer | 2017-04-03 | 5 | -238/+139
  - We are now using immediate AsmOperands so that the range check functions are tablegen'ed.
  - A big bonus is that error messages become much more accurate, i.e. instead of a useless "invalid operand" error message it will now say that the immediate operand must be in range [x,y], which is why regression tests needed updating.
  More tablegen operand descriptions could probably benefit from using immediateAsmOperand, but this is a good first step to get rid of most of the nearly identical range check functions. I will address the remaining immediate operands in the next clean-ups.
  Differential Revision: https://reviews.llvm.org/D31333
  llvm-svn: 299358
* [X86][MMX] Improve support for folding fptosi from XMM to MMX | Simon Pilgrim | 2017-04-02 | 1 | -0/+10
  llvm-svn: 299338
* [X86][MMX] Simplify tablegen patterns by always combining MOVDQ2Q from v2i64 | Simon Pilgrim | 2017-04-02 | 1 | -1/+2
  llvm-svn: 299336
* [X86][MMX] Added support for subvector extraction to MMX register | Simon Pilgrim | 2017-04-02 | 1 | -2/+4
  llvm-svn: 299335
* [AMDGPU] Garbage collect now unused dead code. NFCI. | Davide Italiano | 2017-04-01 | 1 | -10/+0
  llvm-svn: 299310
* Revert "Feature generic option to setup start/stop-after/before" | Quentin Colombet | 2017-04-01 | 1 | -61/+0
  This reverts commit r299282.
  Didn't intend to commit this :(
  llvm-svn: 299288
* Revert "Instrument SDISel C++ patterns" | Quentin Colombet | 2017-04-01 | 2 | -369/+355
  This reverts commit r299284.
  Didn't intend to commit this :(
  llvm-svn: 299286
* Instrument SDISel C++ patterns | Quentin Colombet | 2017-04-01 | 2 | -355/+369
  llvm-svn: 299284
* Feature generic option to setup start/stop-after/before | Quentin Colombet | 2017-04-01 | 1 | -0/+61
  This patch refactors the code used in llc such that all the users of the addPassesToEmitFile API have access to a homogeneous way of handling start/stop-after/before options right out of the box.
  Previously each user would have needed to duplicate this logic and set up its own options.
  NFC
  llvm-svn: 299282
* Reduce the number of times we query the subtarget for the same information. | Eric Christopher | 2017-03-31 | 1 | -5/+4
  llvm-svn: 299278
* Small cleanup to remove extraneous cast. | Eric Christopher | 2017-03-31 | 1 | -2/+1
  llvm-svn: 299277