summaryrefslogtreecommitdiffstats
path: root/llvm/test/CodeGen
Commit message (Collapse)AuthorAgeFilesLines
* [SDAG] Fix visitAND optimization to deal with vector extract case again.Nirav Dave2017-04-061-0/+22
| | | | | | | | | | | | | | | Summary: Fix case elided by rL298920. Fixes PR32545. Reviewers: eli.friedman, RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D31759 llvm-svn: 299688
* [ARM] Add Kryo to available targetsYi Kong2017-04-061-0/+1
| | | | | | | | | | | | | | | | Summary: Host CPU detection now supports Kryo, so we need to recognize it in ARM target. Reviewers: mcrosier, t.p.northover, rengolin, echristo, srhines Reviewed By: t.p.northover, echristo Subscribers: aemerson Differential Revision: https://reviews.llvm.org/D31775 llvm-svn: 299674
* [AMDGPU] Eliminate barrier if workgroup size is not greater than wavefront sizeStanislav Mekhanoshin2017-04-062-1/+31
| | | | | | | | | | If a workgroup size is known to be not greater than wavefront size the s_barrier instruction is not needed since all threads are guarantied to come to the same point at the same time. Differential Revision: https://reviews.llvm.org/D31731 llvm-svn: 299659
* [AMDGPU] Resubmit SDWA peephole: enable by defaultSam Kolton2017-04-0642-434/+608
| | | | | | | | | | Reviewers: vpykhtin, rampitec, arsenm Subscribers: qcolombet, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye Differential Revision: https://reviews.llvm.org/D31671 llvm-svn: 299654
* [X86][MMX] Test showing failure to create MMX non-temporal storeSimon Pilgrim2017-04-061-7/+26
| | | | llvm-svn: 299640
* [ARM] Remove a dead ADD during the creation of TBBsDavid Green2017-04-061-0/+124
| | | | | | | | | During the optimisation of jump tables in the constant island pass, an extra ADD could be left over, now dead but not removed. Differential Revision: https://reviews.llvm.org/D31389 llvm-svn: 299634
* Revert r299536. [AMDGPU] SDWA peephole: enable by default.Ivan Krasin2017-04-0542-608/+434
| | | | | | | | | | | Reason: breaks multiple bots: http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fast/builds/3988 http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-bootstrap/builds/1173 Original Review URL: https://reviews.llvm.org/D31671 llvm-svn: 299583
* [Hexagon] Use -mattr to select HVX mode in a testcase, NFCKrzysztof Parzyszek2017-04-051-3/+2
| | | | llvm-svn: 299582
* [DAGCombine] Support FMF contract in fused multiple-and-sub tooAdam Nemet2017-04-051-0/+26
| | | | | | | | | This is a follow-on to r299096 which added support for fmadd. Subtract does not have the case where with two multiply operands we commute in order to fuse with the multiply with the fewer uses. llvm-svn: 299572
* [ARM] Try to re-enable MachineBranchProb.ll for ARM/AArch64Renato Golin2017-04-051-3/+0
| | | | | | | | | Commit r298799 changed code that made the XFAIL on MachineBranchProb.ll irrelevant, but some configurations still failed. I can't reproduce it locally, so I'm hoping that enabling this will tell me if some configurations will really fail or if they were just too slow. llvm-svn: 299558
* [SystemZ] Prevent Merging Bitcast with non-normal loadsNirav Dave2017-04-051-0/+20
| | | | | | | | | | | | Fixes PR32505. Reviewers: uweigand, jonpa Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D31609 llvm-svn: 299552
* [DAGCombiner] add and use TLI hook to convert and-of-seteq / or-of-setne to ↵Sanjay Patel2017-04-054-35/+29
| | | | | | | | | | | | | | | | | bitwise logic+setcc (PR32401) This is a generic combine enabled via target hook to reduce icmp logic as discussed in: https://bugs.llvm.org/show_bug.cgi?id=32401 It's likely that other targets will want to enable this hook for scalar transforms, and there are probably other patterns that can use bitwise logic to reduce comparisons. Note that we are missing an IR canonicalization for these patterns, and we will probably prefer the pair-of-compares form in IR (shorter, more likely to fold). Differential Revision: https://reviews.llvm.org/D31483 llvm-svn: 299542
* [DAGCombiner] Don't make a BUILD_VECTOR with operands of illegal type.Jonas Paulsson2017-04-051-0/+26
| | | | | | | | | | | | | | | | | | When DAGCombiner visits a SIGN_EXTEND_INREG of a BUILD_VECTOR with constant operands, a new BUILD_VECTOR node will be created transformed constants. Llvm-stress found a case where the new BUILD_VECTOR had constant operands of an illegal type, because the (legal) element type is in fact not a legal scalar type. This patch changes this so that the new BUILD_VECTOR has the same operand type as the old one. Review: Eli Friedman, Nirav Dave https://bugs.llvm.org//show_bug.cgi?id=32422 llvm-svn: 299540
* [AMDGPU] SDWA peephole: enable by defaultSam Kolton2017-04-0542-434/+608
| | | | | | | | | | Reviewers: vpykhtin, rampitec, arsenm Subscribers: qcolombet, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye Differential Revision: https://reviews.llvm.org/D31671 llvm-svn: 299536
* [X86] Relax assert in broadcast-of-subvector lowering.Ahmed Bougacha2017-04-052-0/+26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Before r294774, there was a problem when lowering broadcasts to use 128-bit subvectors. When we looked through a bitcast to find the broadcast input, we'd keep using the original type, so you'd end up with things like: (v8f32 (broadcast (v4f32 (extract_subvector (v8i32 V), ...)) )) r294774 fixed it to always emit subvectors with the scalar type of the original source. It also introduced some asserts, to check that we use scalars with the same size, and vectors with the same number of elements. The scalar size equality is checked earlier when looking through bitcasts, and is a useful assert. However, the number of elements don't have to be identical: we're always going to extract a 128-bit subvector, and we can have different size inputs if we looked through a concat_vector to find a 256-bit source. Relax the overzealous assert. Replace it with a check of the original source vector being 256 or 512 bits. If it's 128 bits, we can't extract_subvector from it. Fixes PR32371. llvm-svn: 299490
* [AArch64] Avoid partial register deps on insertelt of load into lane 0.Ahmed Bougacha2017-04-041-5/+5
| | | | | | | | | | | | | | | This improves upon r246462: that prevented FMOVs from being emitted for the cross-class INSERT_SUBREGs by disabling the formation of INSERT_SUBREGs of LOAD. But the ld1.s that we started selecting caused us to introduce partial dependencies on the vector register. Avoid that by using SCALAR_TO_VECTOR: it's a first-class citizen that is folded away by many patterns, including the scalar LDRS that we want in this case. Credit goes to Adam for finding the issue! llvm-svn: 299482
* Change section flag character for SHF_LINK_ORDER to "o".Evgeniy Stepanov2017-04-041-8/+8
| | | | | | | | GAS uses "m" as a compatibility alias for "M" (SHF_MERGE). "o" is free, except on ia64, where it already means SHF_LINK_ORDER. llvm-svn: 299479
* [AArch64][Fuchsia] Allow -mcmodel=kernel for --target=aarch64-fuchsiaPetr Hosek2017-04-043-7/+17
| | | | | | | | | | | This mode is just like -mcmodel=small except that it moves the thread pointer from TPIDR_EL0 to TPIDR_EL1. Patch by Roland McGrath. Differential Revision: https://reviews.llvm.org/D31624 llvm-svn: 299462
* Verifier: Check some amdgpu calling convention restrictionsMatt Arsenault2017-04-041-2/+3
| | | | llvm-svn: 299457
* AMDGPU: Remove legacy export intrinsicMatt Arsenault2017-04-042-240/+4
| | | | llvm-svn: 299444
* AMDGPU: Remove legacy image intrinsicsMatt Arsenault2017-04-047-1333/+6
| | | | llvm-svn: 299443
* [X86][LLVM] Converting __mm{|256|512}_movm_epi{8|16|32|64} LLVMIR call into ↵Michael Zuckerman2017-04-048-158/+157
| | | | | | | | | | | generic intrinsics. This patch is a part one of two reviews, one for the clang and the other for LLVM. The patch deletes the back-end intrinsics and adds support for them in the auto upgrade. Differential Revision: https://reviews.llvm.org/D31393 llvm-svn: 299432
* [tablegen][globalisel] Add support for nested instruction matching.Daniel Sanders2017-04-042-26/+128
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Lift the restrictions that prevented the tree walking introduced in the previous change and add support for patterns like: (G_ADD (G_MUL (G_SEXT $src1), (G_SEXT $src2)), $src3) -> SMADDWrrr $dst, $src1, $src2, $src3 Also adds support for G_SEXT and G_ZEXT to support these cases. One particular aspect of this that I should draw attention to is that I've tried to be overly conservative in determining the safety of matches that involve non-adjacent instructions and multiple basic blocks. This is intended to be used as a cheap initial check and we may add a more expensive check in the future. The current rules are: * Reject if any instruction may load/store (we'd need to check for intervening memory operations. * Reject if any instruction has implicit operands. * Reject if any instruction has unmodelled side-effects. See isObviouslySafeToFold(). Reviewers: t.p.northover, javed.absar, qcolombet, aditya_nandakumar, ab, rovka Reviewed By: ab Subscribers: igorb, dberris, llvm-commits, kristof.beyls Differential Revision: https://reviews.llvm.org/D30539 llvm-svn: 299430
* [mips] Deal with empty blocks in the mips hazard schedulerSimon Dardis2017-04-041-0/+92
| | | | | | | | | | | | This patch teaches the hazard scheduler how to handle empty blocks when search for the next real instruction when dealing with forbidden slots. Reviewers: slthakur Differential Revision: https://reviews.llvm.org/D31293 llvm-svn: 299427
* [X86] Add 64 bit pattern matching for PSADBWOren Ben Simhon2017-04-041-0/+347
| | | | | | | | | PSADBW pattern currently supports the 32 bit IR pattern and only GLT (greather than) comparison. The patch extends the pattern to catch also 64 bit IR pattern and includes all other comparison types (not only GLT). Differential Revision: https://reviews.llvm.org/D31577 llvm-svn: 299425
* add/move codegen tests for and/or of setcc; NFCSanjay Patel2017-04-035-73/+173
| | | | llvm-svn: 299396
* AMDGPU: Remove llvm.SI.vs.load.inputMatt Arsenault2017-04-032-11/+18
| | | | llvm-svn: 299391
* DAG: Fix missing legalization for any_extend_vector_inreg operandsMatt Arsenault2017-04-031-0/+58
| | | | llvm-svn: 299389
* [X86][SSE]] Lower BUILD_VECTOR with repeated elts as BUILD_VECTOR + ↵Simon Pilgrim2017-04-0311-102/+87
| | | | | | | | | | | | | | | | VECTOR_SHUFFLE It can be costly to transfer from the gprs to the xmm registers and can prevent loads merging. This patch splits vXi16/vXi32/vXi64 BUILD_VECTORS that use the same operand in multiple elements into a BUILD_VECTOR with only a single insertion of each of those elements and then performs an unary shuffle to duplicate the values. There are a couple of minor regressions this patch unearths due to some missing MOVDDUP/BROADCAST folds that I will address in a future patch. Note: Now that vector shuffle lowering and combining is pretty good we should be reusing that instead of duplicating so much in LowerBUILD_VECTOR - this is the first of several patches to address this. Differential Revision: https://reviews.llvm.org/D31373 llvm-svn: 299387
* x86 interrupt calling convention: re-align stack pointer on 64-bit if an ↵Amjad Aboud2017-04-032-8/+12
| | | | | | | | | | | | | | | | error code was pushed The x86_64 ABI requires that the stack is 16 byte aligned on function calls. Thus, the 8-byte error code, which is pushed by the CPU for certain exceptions, leads to a misaligned stack. This results in bugs such as Bug 26413, where misaligned movaps instructions are generated. This commit fixes the misalignment by adjusting the stack pointer in these cases. The adjustment is done at the beginning of the prologue generation by subtracting another 8 bytes from the stack pointer. These additional bytes are popped again in the function epilogue. Fixes Bug 26413 Patch by Philipp Oppermann. Differential Revision: https://reviews.llvm.org/D30049 llvm-svn: 299383
* [CodeGenPrep] move aarch64-type-promotion to CGPJun Bum Lim2017-04-032-4/+75
| | | | | | | | | | | | | | | | | Summary: Move the aarch64-type-promotion pass within the existing type promotion framework in CGP. This change also support forking sexts when a new sext is required for promotion. Note that change is based on D27853 and I am submitting this out early to provide a better idea on D27853. Reviewers: jmolloy, mcrosier, javed.absar, qcolombet Reviewed By: qcolombet Subscribers: llvm-commits, aemerson, rengolin, mcrosier Differential Revision: https://reviews.llvm.org/D28680 llvm-svn: 299379
* AMDGPU: Remove legacy bfe intrinsicsMatt Arsenault2017-04-035-1215/+145
| | | | llvm-svn: 299372
* Revert "[DAGCombine] A shuffle of a splat is always the splat itself"Zvi Rackover2017-04-031-6/+10
| | | | | | | | | | This reverts commit r299047 which is incorrect because the simplification may result in incorrect propogation of undefs to users of the folded shuffle. Thanks to Andrea Di Biagio for pointing this out. llvm-svn: 299368
* [X86][MMX] Improve support for folding fptosi from XMM to MMXSimon Pilgrim2017-04-021-7/+3
| | | | llvm-svn: 299338
* [X86][MMX] Simplify tablegen patterns by always combining MOVDQ2Q from v2i64Simon Pilgrim2017-04-021-4/+2
| | | | llvm-svn: 299336
* [X86][MMX] Added support for subvector extraction to MMX registerSimon Pilgrim2017-04-021-5/+3
| | | | llvm-svn: 299335
* Regenerate test with codegen. NFCI.Simon Pilgrim2017-04-021-4/+10
| | | | llvm-svn: 299333
* Regenerate test with codegen. NFCI.Simon Pilgrim2017-04-021-4/+89
| | | | llvm-svn: 299332
* Regenerate test. NFCI.Simon Pilgrim2017-04-021-56/+56
| | | | llvm-svn: 299331
* [X86][MMX] Add generic fptosi 4f32-4i32 testSimon Pilgrim2017-04-021-0/+39
| | | | llvm-svn: 299328
* [DAGCombiner] enable vector transforms for any/all {sign} bits set/clearSanjay Patel2017-04-012-59/+37
| | | | | | | | The code already allowed vector types in via "isInteger" (which might want a more specific name), so use splat-friendly constant predicates to match those types. llvm-svn: 299304
* [PowerPC, x86] add vector tests for any/all {sign} bits set/clear; NFCSanjay Patel2017-04-012-0/+237
| | | | llvm-svn: 299303
* [DAGCombiner] Fix fold (or (shuf A, V_0, MA), (shuf B, V_0, MB)) -> (shuf A, ↵Craig Topper2017-04-011-0/+32
| | | | | | | | | | B, Mask) to explicitly ensure that only one of the inputs of each shuffle is a zero vector. This can only happen when we have a mix of zero and undef elements and the two vectors have a different arrangement of zeros/undefs. The shuffle should eventually be constant folded to all zeros. Fixes PR32484. llvm-svn: 299291
* Revert "Feature generic option to setup start/stop-after/before"Quentin Colombet2017-04-011-4/+4
| | | | | | | | This reverts commit r299282. Didn't intend to commit this :( llvm-svn: 299288
* Revert "Localizer fun"Quentin Colombet2017-04-011-260/+0
| | | | | | | | This reverts commit r299283. Didn't intend to commit this :( llvm-svn: 299287
* [RegBankSelect] Support REG_SEQUENCE for generic mappingQuentin Colombet2017-04-011-0/+25
| | | | | | | | | | | | | | | | REG_SEQUENCE falls into the same category as COPY for operands mapping: - They don't have MCInstrDesc with register constraints - The input variable could use whatever register classes - It is possible to have register class already assigned to the operands In particular, given REG_SEQUENCE are always target specific because of the subreg indices. Those indices must apply to the register class of the definition of the REG_SEQUENCE and therefore, the target must set a register class to that definition. As a result, the generic code can always use that register class to derive a valid mapping for a REG_SEQUENCE. llvm-svn: 299285
* Localizer funQuentin Colombet2017-04-011-0/+260
| | | | | | WIP llvm-svn: 299283
* Feature generic option to setup start/stop-after/beforeQuentin Colombet2017-04-011-4/+4
| | | | | | | | | | | | | This patch refactors the code used in llc such that all the users of the addPassesToEmitFile API have access to a homogeneous way of handling start/stop-after/before options right out of the box. Previously each user would have needed to duplicate this logic and set up its own options. NFC llvm-svn: 299282
* [Hexagon] Fix typo in HexagonEarlyIfCConv.cppKrzysztof Parzyszek2017-03-311-1/+1
| | | | | | Found by PVS-Studio. Fixes llvm.org/PR32480. llvm-svn: 299258
* [DAGCombiner] add fold for 'All sign bits set?'Sanjay Patel2017-03-312-14/+8
| | | | | | | | | | (and (setlt X, 0), (setlt Y, 0)) --> (setlt (and X, Y), 0) We have 7 similar folds, but this one got away. The fact that the x86 test with a branch didn't change is probably a separate bug. We may also be missing this and the related folds in instcombine. llvm-svn: 299252
OpenPOWER on IntegriCloud