path: root/llvm/test/CodeGen
Commit message | Author | Age | Files | Lines
* AMDGPU: Fix folding reg_sequence into copy to phys reg | Matt Arsenault | 2017-04-11 | 1 | -0/+13
  This was producing an illegal reg_sequence defining a physical register with virtual register inputs.
  llvm-svn: 299997
* [DAGCombine] Add more test cases for shuffle of splat. NFC. | Zvi Rackover | 2017-04-11 | 1 | -0/+56
  Tests added contain splat-masks with undef elements.
  llvm-svn: 299988
* [x86] Relax the check in areLoadsFromSameBasePtr | Easwaran Raman | 2017-04-11 | 1 | -4/+4
  Check if the scale operand is identical (doesn't have to be 1) and do not check the chain operand.
  Differential revision: https://reviews.llvm.org/D31833
  llvm-svn: 299986
* MIR: Allow parsing of empty machine functions | Justin Bogner | 2017-04-11 | 8 | -41/+17
  If you run llc -stop-after=codegenprepare and feed the resulting MIR to llc -start-after=codegenprepare, you'll have an empty machine function since we haven't run any isel yet. Of course, this only works if the MIRParser believes you that this is okay.
  This is essentially a revert of r241862 with a fix for the problem it was papering over.
  llvm-svn: 299975
* [X86] Create the correct ADC/SBB SDNode when lowering add. | Davide Italiano | 2017-04-11 | 1 | -0/+27
  Differential Revision: https://reviews.llvm.org/D31911
  llvm-svn: 299973
* [AMDGPU] Add A5 to data layout for amdgiz environment | Yaxun Liu | 2017-04-11 | 2 | -2/+2
  Differential Revision: https://reviews.llvm.org/D31589
  llvm-svn: 299964
* GlobalISel: Allow legalizing G_FADD to a libcall | Diana Picus | 2017-04-11 | 2 | -6/+112
  Use the same handling in the generic legalizer code as for the other libcalls (G_FREM, G_FPOW). Enable it on ARM for float and double so we can test it.
  llvm-svn: 299931
* [GlobalISel] LegalizerInfo: Enable legalization of non-power-of-2 types | Volkan Keles | 2017-04-11 | 2 | -0/+37
  Summary: Legalize only if the type is marked as Legal or Custom. If not, return Unsupported as LegalizerHelper is not able to handle non-power-of-2 types right now.
  Reviewers: qcolombet, aditya_nandakumar, dsanders, t.p.northover, kristof.beyls, javed.absar, ab
  Reviewed By: kristof.beyls, ab
  Subscribers: dberris, rovka, igorb, llvm-commits
  Differential Revision: https://reviews.llvm.org/D31711
  llvm-svn: 299929
* [SelectionDAG] Check CALLSEQ_BEGIN nodes in DelayForLiveRegs | Sam Parker | 2017-04-11 | 1 | -0/+136
  A fix for the bug reported in PR30911. The issue arises when multiple CALLSEQ_BEGIN nodes are unscheduled as the last node to be unscheduled will gain access to the CallResource register. But when a node is being picked, only CALLSEQ_END nodes are checked against the CallResource and have their chains evaluated. This then means that other CALLSEQ_BEGIN nodes can be scheduled before the existing call sequence has been finalised.
  This patch adds a check against the FrameSetup nodes in DelayForLiveRegs to prevent this from happening.
  Differential Revision: https://reviews.llvm.org/D31536
  llvm-svn: 299926
* [PowerPC] multiply-with-overflow might use the CTR register | Hal Finkel | 2017-04-11 | 1 | -0/+34
  Check the legality of ISD::[US]MULO to see whether Intrinsic::[us]mul_with_overflow will legalize into a function call (and, thus, will use the CTR register).
  Fixes PR32485.
  Patch by Tim Neumann!
  Differential Revision: https://reviews.llvm.org/D31790
  llvm-svn: 299910
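  For illustration only, here is a minimal sketch of source code that can give rise to a multiply-with-overflow intrinsic. It assumes a GCC- or Clang-compatible compiler; that the frontend lowers `__builtin_mul_overflow` to `llvm.smul.with.overflow` for matching signed operands is an assumption, not something the commit states. The helper name `checked_mul` is hypothetical.

```cpp
#include <cstdint>

// Hypothetical helper: multiply two signed 64-bit values and report overflow.
// __builtin_mul_overflow is a GCC/Clang builtin that stores the (possibly
// wrapped) product in *out and returns true if the exact result did not fit.
bool checked_mul(int64_t a, int64_t b, int64_t *out) {
  return __builtin_mul_overflow(a, b, out);
}
```

  Calling `checked_mul(6, 7, &r)` leaves 42 in `r` and returns false; `checked_mul(INT64_MAX, 2, &r)` returns true. Whether the backend expands the operation inline or calls a runtime function depends on the target and the operand width, which is the distinction the legality check above is concerned with.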
* [ARM, x86] add tests to show possible improvement for bool math; NFC | Sanjay Patel | 2017-04-10 | 2 | -0/+64
  llvm-svn: 299897
* CodeGen: BlockPlacement: Don't always tail-duplicate with no other successor. | Kyle Butt | 2017-04-10 | 2 | -2/+55
  The math works out where it can actually be counter-productive. The probability calculations correctly handle the case where the alternative is 0 probability; rely on those calculations.
  Includes a test case that demonstrates the problem.
  llvm-svn: 299892
* CodeGen: BlockPlacement: Minor probability changes. | Kyle Butt | 2017-04-10 | 2 | -1/+42
  Qin may be large, and Succ may be more frequent than BB. Take these both into account when deciding if tail-duplication is profitable.
  llvm-svn: 299891
* CodeGen: BranchFolding: Merge identical blocks, even if they are short. | Kyle Butt | 2017-04-10 | 1 | -0/+41
  Merge identical blocks when doing so doesn't reduce fallthrough. It is common for the blocks created from critical edge splitting to be identical. We would like to merge these blocks whenever doing so would not reduce fallthrough.
  llvm-svn: 299890
* Add address space mangling to lifetime intrinsics | Matt Arsenault | 2017-04-10 | 53 | -355/+355
  In preparation for allowing allocas to have non-0 addrspace.
  llvm-svn: 299876
* [X86][MMX] Add fast-isel support for MMX non-temporal writes | Simon Pilgrim | 2017-04-10 | 1 | -1/+1
  Differential Revision: https://reviews.llvm.org/D31754
  llvm-svn: 299852
* [ARM] GlobalISel: Support G_FPOW for float and double | Diana Picus | 2017-04-10 | 2 | -3/+114
  Legalize to a libcall.
  llvm-svn: 299841
* AMDGPU: Actually write nops for writeNopData | Matt Arsenault | 2017-04-08 | 1 | -0/+87
  Before, this was just writing 0s, which ends up looking like a v_cndmask_b32 v0, s0, v0, vcc. Write out an encoded s_nop instead.
  llvm-svn: 299816
* [ARM] Prefer BIC over BFC in ARM mode. | Eli Friedman | 2017-04-07 | 7 | -19/+25
  BIC is generally faster, and it can put the output in a different register from the input.
  We already do this in Thumb2 mode; not sure why the equivalent fix never got applied to ARM mode.
  Differential Revision: https://reviews.llvm.org/D31797
  llvm-svn: 299803
* [GlobalISel]: Fix bug where we can report GISelFailure on erased instructions | Aditya Nandakumar | 2017-04-07 | 1 | -0/+8
  The original instruction might get legalized and erased and expanded into intermediate instructions, and the intermediate instructions might fail legalization. This ends up reporting GISelFailure on the erased instruction. Instead, report GISelFailure on the intermediate instruction which failed legalization.
  Reviewed by: ab
  llvm-svn: 299802
* [AArch64] Allow global register asm("x18") or asm("w18") under -ffixed-x18 | Petr Hosek | 2017-04-07 | 2 | -0/+28
  When using -ffixed-x18, the x18 (or w18) register can safely be used with the "global register variable" GCC extension, but the backend fails to recognize it.
  Patch by Roland McGrath.
  Differential Revision: https://reviews.llvm.org/D31793
  llvm-svn: 299799
* Revert "[SelectionDAG] Enable target specific vector scalarization of calls and returns" | Simon Dardis | 2017-04-07 | 4 | -1697/+24
  This reverts commit r299766. This change appears to have broken the MIPS buildbots. Reverting while I investigate.
  Revert "[mips] Remove usage of debug only variable (NFC)"
  This reverts commit r299769. Follow up commit.
  llvm-svn: 299788
* [AMDGPU] Unroll more to eliminate phis and conditions | Stanislav Mekhanoshin | 2017-04-07 | 1 | -0/+34
  Increase the threshold to unroll a loop which contains an "if" statement whose condition is defined by a PHI belonging to the loop. This may help to eliminate the if region and potentially even the PHI itself, saving on both divergence and registers used for the PHI.
  Add a small bonus for each such "if" statement.
  Differential Revision: https://reviews.llvm.org/D31693
  llvm-svn: 299779
* Use PMADDWD to expand reduction in a loop | Dehao Chen | 2017-04-07 | 1 | -0/+103
  Summary: PMADDWD can help improve 8/16 bit integer multiply-add operation performance for cases like:
    for (int i = 0; i < count; i++) a += x[i] * y[i];
  Reviewers: wmi, davidxl, hfinkel, RKSimon, zvi, mkuper
  Reviewed By: mkuper
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D31679
  llvm-svn: 299776
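  A minimal sketch of the kind of loop this targets, written out with explicit 16-bit inputs and a 32-bit accumulator, next to a scalar model of what one 32-bit PMADDWD lane computes (two adjacent signed 16-bit products added together). The function names are hypothetical, and the element types are an assumption; the commit message only shows the loop shape.

```cpp
#include <cstdint>

// Scalar form of the reduction: 16-bit inputs, 32-bit accumulator.
int32_t dot_i16(const int16_t *x, const int16_t *y, int count) {
  int32_t a = 0;
  for (int i = 0; i < count; i++)
    a += static_cast<int32_t>(x[i]) * y[i];
  return a;
}

// Scalar model of a single 32-bit PMADDWD lane: multiply two adjacent
// signed 16-bit pairs and add the two 32-bit products.
int32_t pmaddwd_lane(const int16_t *x, const int16_t *y) {
  return static_cast<int32_t>(x[0]) * y[0] +
         static_cast<int32_t>(x[1]) * y[1];
}

// The same reduction expressed lane by lane. This sketch assumes that
// count is even; a trailing odd element would be ignored.
int32_t dot_i16_by_pairs(const int16_t *x, const int16_t *y, int count) {
  int32_t a = 0;
  for (int i = 0; i + 1 < count; i += 2)
    a += pmaddwd_lane(x + i, y + i);
  return a;
}
```

  For x = {1, 2, 3, 4} and y = {5, 6, 7, 8}, both functions return 70. The pairwise form does half as many accumulations, which is where the instruction can help.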
* [GlobalISel] implement narrowing for G_CONSTANT. | Igor Breger | 2017-04-07 | 1 | -0/+43
  Summary: [GlobalISel] implement narrowing for G_CONSTANT.
  Reviewers: bogner, zvi, t.p.northover
  Reviewed By: t.p.northover
  Subscribers: llvm-commits, dberris, rovka, kristof.beyls
  Differential Revision: https://reviews.llvm.org/D31744
  llvm-svn: 299772
* [mips][msa] Fix generation of bm(n)zi and bins[lr]i instructions | Petar Jovanovic | 2017-04-07 | 4 | -13/+68
  We have two cases here, the first one being the following instruction selection from the builtin function:
    bm(n)zi builtin -> vselect node -> bins[lr]i machine instruction
  In case of bm(n)zi having an immediate which has either its high or low bits set, a bins[lr] instruction can be selected through the selectVSplatMask[LR] function. The function counts the number of bits set, and that value is being passed to the bins[lr]i instruction as its immediate, which in turn copies immediate modulo the size of the element in bits plus 1 as per specs, where we get the off-by-one-error.
  The other case is:
    bins[lr]i -> vselect node -> bsel.v
  In this case, a bsel.v instruction gets selected with a mask having one bit less set than required.
  Patch by Stefan Maksimovic.
  Differential Revision: https://reviews.llvm.org/D30579
  llvm-svn: 299768
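  The off-by-one in the first case can be sketched with a small model of a single 8-bit element, following the wording above: the instruction copies (immediate modulo element size) plus 1 of the most significant bits. The function name and the choice of an 8-bit element are assumptions made for illustration; this is not code from the patch.

```cpp
#include <cstdint>

// Hypothetical model for one 8-bit element: returns a mask of the bits
// that a binsli-style instruction would copy for immediate m, i.e. the
// (m % 8) + 1 most significant bits.
uint8_t binsli_copied_bits_u8(unsigned m) {
  unsigned n = (m % 8) + 1;  // number of bits copied, always 1..8
  return static_cast<uint8_t>(0xFFu << (8 - n));
}
```

  Under this model, a mask of 0xF0 has four bits set, but passing 4 as the immediate copies five bits (0xF8). Passing 3 gives the intended 0xF0.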
* [SelectionDAG] Enable target specific vector scalarization of calls and returns | Simon Dardis | 2017-04-07 | 4 | -24/+1697
  By target hookifying getRegisterType, getNumRegisters, getVectorBreakdown, backends can request that LLVM scalarize vector types for calls and returns.
  The MIPS vector ABI requires that vector arguments and returns are passed in integer registers. With SelectionDAG's new hooks, the MIPS backend can now handle LLVM-IR with vector types in calls and returns. E.g. 'call @foo(<4 x i32> %4)'.
  Previously these cases would be scalarized for the MIPS O32/N32/N64 ABI for calls and returns if vector types were not legal. If vector types were legal, a single 128bit vector argument would be assigned to a single 32 bit / 64 bit integer register.
  By teaching the MIPS backend to inspect the original types, it can now implement the MIPS vector ABI which requires a particular method of scalarizing vectors.
  Previously, the MIPS backend relied on clang to scalarize types such as "call @foo(<4 x float> %a)" into "call @foo(i32 inreg %1, i32 inreg %2, i32 inreg %3, i32 inreg %4)".
  This patch enables the MIPS backend to take either form for vector types.
  Reviewers: zoran.jovanovic, jaydeep, vkalintiris, slthakur
  Differential Revision: https://reviews.llvm.org/D27845
  llvm-svn: 299766
* [SystemZ] Check for presence of vector support in SystemZISelLowering | Jonas Paulsson | 2017-04-07 | 1 | -0/+18
  A test case was found with llvm-stress that caused DAGCombiner to crash when compiling for an older subtarget without vector support.
  SystemZTargetLowering::combineTruncateExtract() should do nothing for older subtargets.
  This check was placed in canTreatAsByteVector(), which also helps in a few other places.
  Review: Ulrich Weigand
  llvm-svn: 299763
* [ARM] GlobalISel: Test hard float properly | Diana Picus | 2017-04-07 | 1 | -16/+26
  It turns out -float-abi=hard doesn't set the hard float calling convention for libcalls. We need to use a hard float triple instead (e.g. gnueabihf).
  llvm-svn: 299761
* [AMDGPU] Move SiShrinkInstruction and SDWAPeephole to SSAOptimization passes | Sam Kolton | 2017-04-07 | 2 | -2/+2
  Summary: The difference between PreRegAlloc() and MachineSSAOptimization() is that the former runs even at the -O0 optimization level. In my understanding, SiShrinkInstructions and SDWAPeephole shouldn't run when optimizations are disabled. With this change, the order of passes will not change.
  Reviewers: arsenm, vpykhtin, rampitec
  Subscribers: qcolombet, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye
  Differential Revision: https://reviews.llvm.org/D31705
  llvm-svn: 299757
* [ARM] GlobalISel: Support frem for 64-bit values | Diana Picus | 2017-04-07 | 2 | -0/+58
  Legalize to a libcall.
  llvm-svn: 299756
* [ARM] GlobalISel: Support frem for 32-bit values | Diana Picus | 2017-04-07 | 2 | -0/+48
  Legalize to a libcall. On this occasion, also start allowing soft float subtargets. For the moment G_FREM is the only legal floating point operation for them.
  llvm-svn: 299753
* AMDGPU/GFX9: Fix shared and private aperture queries | Konstantin Zhuravlyov | 2017-04-06 | 1 | -3/+15
  Differential Revision: https://reviews.llvm.org/D31786
  llvm-svn: 299727
* Turn on -addr-sink-using-gep by default. | Eli Friedman | 2017-04-06 | 12 | -68/+47
  The new codepath has been in the tree for years, and there isn't any reason to use two codepaths here.
  Differential Revision: https://reviews.llvm.org/D30596
  llvm-svn: 299723
* [X86] Revert r299387 due to AVX legalization infinite loop. | Michael Kuperstein | 2017-04-06 | 11 | -87/+102
  llvm-svn: 299720
* AMDGPU: Diagnose illegal SGPR to VGPR copies | Matt Arsenault | 2017-04-06 | 1 | -0/+45
  This is possible in ways that are not compiler bugs, so stop asserting on them.
  This emits an extra error when emitting objects when it can't encode the new pseudo, but I'm not sure that matters.
  llvm-svn: 299712
* AMDGPU: Replace fp16SrcZerosHighBits with a whitelist | Matt Arsenault | 2017-04-06 | 1 | -20/+21
  FCOPYSIGN is lowered to bit operations which don't clear the high bits.
  llvm-svn: 299708
* [SelectionDAG] [ARM CodeGen] Fix chain information of LowerMUL | Huihui Zhang | 2017-04-06 | 1 | -0/+115
  In LowerMUL, the chain information is not preserved for the newly created Load SDNode.
  For example, if a Store aliases with one of the operands of Mul, the Load for that operand needs to be scheduled before the Store. The dependence is recorded in the chain of the Store, in a TokenFactor. However, when lowering MUL, the SDNodes for the new Loads for VMULL are not updated in the TokenFactor for the Store. Thus the chain is not preserved for the lowered VMULL.
  llvm-svn: 299701
* Revert "[ARM] Add Kryo to available targets" | Yi Kong | 2017-04-06 | 1 | -1/+0
  This reverts commit 942d6e6f58bf7e63810dd7cbcbce1fdfa5ebc6d4.
  Build breakage.
  llvm-svn: 299689
* [SDAG] Fix visitAND optimization to deal with vector extract case again. | Nirav Dave | 2017-04-06 | 1 | -0/+22
  Summary: Fix case elided by rL298920.
  Fixes PR32545.
  Reviewers: eli.friedman, RKSimon
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D31759
  llvm-svn: 299688
* [ARM] Add Kryo to available targets | Yi Kong | 2017-04-06 | 1 | -0/+1
  Summary: Host CPU detection now supports Kryo, so we need to recognize it in ARM target.
  Reviewers: mcrosier, t.p.northover, rengolin, echristo, srhines
  Reviewed By: t.p.northover, echristo
  Subscribers: aemerson
  Differential Revision: https://reviews.llvm.org/D31775
  llvm-svn: 299674
* [AMDGPU] Eliminate barrier if workgroup size is not greater than wavefront size | Stanislav Mekhanoshin | 2017-04-06 | 2 | -1/+31
  If a workgroup size is known to be not greater than the wavefront size, the s_barrier instruction is not needed since all threads are guaranteed to come to the same point at the same time.
  Differential Revision: https://reviews.llvm.org/D31731
  llvm-svn: 299659
* [AMDGPU] Resubmit SDWA peephole: enable by default | Sam Kolton | 2017-04-06 | 42 | -434/+608
  Reviewers: vpykhtin, rampitec, arsenm
  Subscribers: qcolombet, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye
  Differential Revision: https://reviews.llvm.org/D31671
  llvm-svn: 299654
* [X86][MMX] Test showing failure to create MMX non-temporal store | Simon Pilgrim | 2017-04-06 | 1 | -7/+26
  llvm-svn: 299640
* [ARM] Remove a dead ADD during the creation of TBBs | David Green | 2017-04-06 | 1 | -0/+124
  During the optimisation of jump tables in the constant island pass, an extra ADD could be left over, now dead but not removed.
  Differential Revision: https://reviews.llvm.org/D31389
  llvm-svn: 299634
* Revert r299536. [AMDGPU] SDWA peephole: enable by default. | Ivan Krasin | 2017-04-05 | 42 | -608/+434
  Reason: breaks multiple bots:
    http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fast/builds/3988
    http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-bootstrap/builds/1173
  Original Review URL: https://reviews.llvm.org/D31671
  llvm-svn: 299583
* [Hexagon] Use -mattr to select HVX mode in a testcase, NFC | Krzysztof Parzyszek | 2017-04-05 | 1 | -3/+2
  llvm-svn: 299582
* [DAGCombine] Support FMF contract in fused multiple-and-sub too | Adam Nemet | 2017-04-05 | 1 | -0/+26
  This is a follow-on to r299096 which added support for fmadd.
  Subtract does not have the case where with two multiply operands we commute in order to fuse with the multiply with the fewer uses.
  llvm-svn: 299572
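  As a rough source-level picture of the operation involved, the sketch below contrasts an unfused multiply-then-subtract with its fused counterpart written via `std::fma`. The function names are hypothetical, and this is only an analogy for the fusion the DAG combine performs when contraction is permitted; it is not the combine itself.

```cpp
#include <cmath>

// Unfused form: the product may be rounded before the subtraction
// (unless the compiler is itself allowed to contract the expression).
double mul_sub_unfused(double a, double b, double c) {
  return a * b - c;
}

// Fused form: a * b - c computed with a single rounding, expressed as
// fma(a, b, -c).
double mul_sub_fused(double a, double b, double c) {
  return std::fma(a, b, -c);
}
```

  For small exactly representable inputs such as (2.0, 3.0, 4.0), both forms return 2.0. They can differ in the last bit when the product is not exactly representable, which is why fusing is gated on a flag that allows contraction.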
* [ARM] Try to re-enable MachineBranchProb.ll for ARM/AArch64 | Renato Golin | 2017-04-05 | 1 | -3/+0
  Commit r298799 changed code that made the XFAIL on MachineBranchProb.ll irrelevant, but some configurations still failed. I can't reproduce it locally, so I'm hoping that enabling this will tell me if some configurations will really fail or if they were just too slow.
  llvm-svn: 299558
* [SystemZ] Prevent Merging Bitcast with non-normal loads | Nirav Dave | 2017-04-05 | 1 | -0/+20
  Fixes PR32505.
  Reviewers: uweigand, jonpa
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D31609
  llvm-svn: 299552