path: root/llvm/lib/Target
Commit message | Author | Age | Files | Lines
* [MIPS GlobalISel] Remove switch statement (fix r349346 for MSVC) | Petar Avramovic | 2018-12-17 | 1 | -6/+1
  Temporarily remove the switch statement without any case labels in function legalizeCustom in order to fix r349346 for MSVC.
  llvm-svn: 349356
* ARM: use acquire/release instruction variants when available. | Tim Northover | 2018-12-17 | 2 | -8/+9
  These features (fairly) recently got split out into their own feature, so we should make CodeGen use them when available. The main change here is that the check used to be based on the triple, but now it's based on CPU features.
  llvm-svn: 349355
* [MIPS GlobalISel] Lower G_UADDE and narrowScalar G_ADD | Petar Avramovic | 2018-12-17 | 1 | -30/+5
  Lower G_UADDE and legalize G_ADD using narrowScalar on MIPS32.
  Differential Revision: https://reviews.llvm.org/D54580
  llvm-svn: 349346
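To illustrate the idea behind the entry above, here is a minimal sketch of how a 64-bit add can be built from 32-bit adds chained through a carry, which is roughly what narrowing a wide add on a 32-bit target amounts to. The names `AddCarry`, `uadde32` and `add64_narrowed` are hypothetical and do not come from the LLVM sources; the sketch models the arithmetic only, not the legalizer API.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical result type: a 32-bit sum plus its unsigned carry-out.
struct AddCarry {
  uint32_t sum;
  bool carry_out;
};

// Models an "add with carry-in, produce carry-out" operation on 32 bits.
AddCarry uadde32(uint32_t a, uint32_t b, bool carry_in) {
  uint64_t wide = uint64_t{a} + uint64_t{b} + (carry_in ? 1u : 0u);
  return {static_cast<uint32_t>(wide), (wide >> 32) != 0};
}

// 64-bit add assembled from two 32-bit halves: the low half produces a
// carry that is consumed by the high half.
uint64_t add64_narrowed(uint64_t x, uint64_t y) {
  AddCarry lo = uadde32(static_cast<uint32_t>(x),
                        static_cast<uint32_t>(y), false);
  AddCarry hi = uadde32(static_cast<uint32_t>(x >> 32),
                        static_cast<uint32_t>(y >> 32), lo.carry_out);
  return (uint64_t{hi.sum} << 32) | lo.sum;
}
```

The result should agree with a native 64-bit addition, including the wrap-around case where the high half also overflows.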
* [AArch64] Re-run load/store optimizer after aggressive tail duplication | Alexandros Lamprineas | 2018-12-17 | 1 | -0/+6
  The Load/Store Optimizer runs before Machine Block Placement. At O3 the Tail Duplication Threshold is set to 4 instructions and this can create new opportunities for the Load/Store Optimizer. It seems worthwhile to run it once again.
  llvm-svn: 349338
* [X86] Fix bad operand lookup for cmov introduced in r349315 | Craig Topper | 2018-12-17 | 1 | -1/+1
  The CC is operand 2 not operand 3.
  llvm-svn: 349330
* [X86] Pull out constant splat rotation detection. | Simon Pilgrim | 2018-12-16 | 1 | -21/+28
  We had 3 different approaches - consistently use getTargetConstantBitsFromNode and allow undef elts.
  llvm-svn: 349319
* [X86] Remove truncation handling from EmitTest. Replace it with a DAG combine. | Craig Topper | 2018-12-16 | 1 | -50/+105
  I'd like to try to move a lot of the flag matching out of EmitTest and push it to isel or isel preprocessing. This is a step towards that.
  The test-shrink-bug.ll change is an improvement because we are no longer interfering with test shrink handling in isel.
  The pr34137.ll change is a regression, but the IR came from -O0 and was not reduced by InstCombine. So it contains a lot of redundancies like duplicate loads that made it combine poorly.
  llvm-svn: 349315
* [x86] increment/decrement constant vector with min/max in vsetcc lowering (PR39859) | Sanjay Patel | 2018-12-16 | 1 | -3/+16
  This is part of fixing PR39859: https://bugs.llvm.org/show_bug.cgi?id=39859
  We have a crippled vector ISA, so we have to invert a typical fold and create min/max here. As discussed in the bug report, we can probably do better by using saturating subtract when it's available, but we should have this improvement for the min/max patterns regardless.
  Alive proofs:
    https://rise4fun.com/Alive/zsf
    https://rise4fun.com/Alive/Qrl
  Differential Revision: https://reviews.llvm.org/D55515
  llvm-svn: 349304
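The entry above talks about incrementing or decrementing a constant and creating min/max. As a hedged illustration of the kind of identity involved, the sketch below checks exhaustively on 8-bit scalars that an unsigned greater-than against a constant can be rewritten as an equality with an unsigned max of the incremented constant, and likewise for less-than with an unsigned min. The function names are hypothetical, and the scalar model is an assumption standing in for the per-element vector behaviour; it is not the code from the patch.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// x >u c rewritten as x == umax(x, c + 1); only valid when c + 1 does not wrap.
bool ugt_via_umax(uint8_t x, uint8_t c) {
  assert(c != 0xFF && "c + 1 must not overflow");
  uint8_t c_plus_1 = static_cast<uint8_t>(c + 1);
  return std::max(x, c_plus_1) == x;
}

// x <u c rewritten as x == umin(x, c - 1); only valid when c - 1 does not wrap.
bool ult_via_umin(uint8_t x, uint8_t c) {
  assert(c != 0 && "c - 1 must not underflow");
  uint8_t c_minus_1 = static_cast<uint8_t>(c - 1);
  return std::min(x, c_minus_1) == x;
}

// Exhaustively compare both rewrites against the plain comparisons.
bool rewrites_hold_for_all_u8() {
  for (int x = 0; x < 256; ++x) {
    for (int c = 0; c < 256; ++c) {
      uint8_t ux = static_cast<uint8_t>(x);
      uint8_t uc = static_cast<uint8_t>(c);
      if (c != 255 && ugt_via_umax(ux, uc) != (ux > uc)) return false;
      if (c != 0 && ult_via_umin(ux, uc) != (ux < uc)) return false;
    }
  }
  return true;
}
```

The boundary constants (255 for the increment, 0 for the decrement) are excluded because the adjusted constant would wrap; a real lowering would need an equivalent guard.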
* [X86] Begin cleaning up combineOr -> SHLD/SHRD. NFCI. | Simon Pilgrim | 2018-12-15 | 1 | -5/+5
  In preparation for converting to funnel shifts.
  llvm-svn: 349286
* [X86] Lower to SHLD/SHRD on slow machines for optsize | Simon Pilgrim | 2018-12-15 | 1 | -3/+3
  Use consistent rules for when to lower to SHLD/SHRD for slow machines. This fixes a weird issue where the funnel shift gets expanded, but then X86ISelLowering's combineOr sees the optsize and combines to SHLD/SHRD, now with the modulo amount guard.
  llvm-svn: 349285
* Fix -Wunused-variable warning. NFCI. | Simon Pilgrim | 2018-12-15 | 1 | -0/+4
  llvm-svn: 349265
* [SILoadStoreOptimizer] Use std::abs to avoid truncation. | Florian Hahn | 2018-12-15 | 1 | -2/+2
  Using regular abs() causes the following warning error:
    absolute value function 'abs' given an argument of type 'int64_t' (aka 'long') but has parameter of type 'int' which may cause truncation of value [-Werror,-Wabsolute-value]
        (uint32_t)abs(Dist) > MaxDist) {
                  ^
    lib/Target/AMDGPU/SILoadStoreOptimizer.cpp:1369:19: note: use function 'std::abs' instead
  which causes a bot to fail: http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux/builds/18284/steps/bootstrap%20clang/logs/stdio
  llvm-svn: 349224
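The diagnostic quoted in the entry above arises because the C function `abs` takes an `int`, so a 64-bit argument may be truncated before the absolute value is computed. A minimal sketch of the safer form follows; `distance_exceeds` is a hypothetical helper, not the code in SILoadStoreOptimizer.cpp, and it widens to `uint64_t` for the comparison as a choice made for this illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdlib>

// Hypothetical check modelled on the quoted line: is |dist| larger than max_dist?
// std::abs is overloaded for long and long long, so the int64_t argument
// is not narrowed to int before the absolute value is taken.
bool distance_exceeds(int64_t dist, uint32_t max_dist) {
  return static_cast<uint64_t>(std::abs(dist)) > max_dist;
}
```

With a distance of 2^32, an `int`-based `abs` could see the value 0 after truncation and wrongly report that the limit is respected; the overloaded `std::abs` keeps all 64 bits.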
* [X86] Rename hasNoSignedComparisonUses to hasNoSignFlagUses. Add the instructions that only modify the O flag to the waiver list. | Craig Topper | 2018-12-15 | 1 | -8/+14
  The only caller of this turns CMP with 0 into TEST. CMP with 0 and TEST both set OF to 0, so we should have no issues with instructions that only use OF.
  Though I don't think there's any reason we would read just OF after a compare with 0 anyway. So this probably isn't an observable change.
  llvm-svn: 349223
* [X86] Make hasNoCarryFlagUses/hasNoSignedComparisonUses take an SDValue that indicates which result is the flag result. NFCI | Craig Topper | 2018-12-15 | 1 | -20/+19
  hasNoCarryFlagUses hardcoded that the flag result is 1 and used that to filter which uses were of interest. hasNoSignedComparisonUses just assumes the only result is flags and checks whether any user of the node is a CopyToReg instruction.
  After this patch we now do a result number check in both and rely on the caller to provide the result number.
  This shouldn't change behavior; it was just an odd difference between the two functions that I noticed.
  llvm-svn: 349222
* [NVPTX] Lower instructions that expand into libcalls. | Artem Belevich | 2018-12-14 | 1 | -4/+9
  The change is an effort to split and refactor abandoned D34708 into smaller parts.
  Here the behaviour of unsupported instructions is changed to match the behaviour of explicit intrinsics calls.
  Currently LLVM crashes with:
    > Assertion getInstruction() && "Not a call or invoke instruction!" failed.
  With this patch LLVM produces a more sensible error message:
    > Cannot select: ... i32 = ExternalSymbol'__foobar'
  Author: Denys Zariaiev <denys.zariaiev@gmail.com>
  Differential Revision: https://reviews.llvm.org/D55145
  llvm-svn: 349213
* [Hexagon] Add patterns for shifts of v2i16 | Krzysztof Parzyszek | 2018-12-14 | 1 | -0/+12
  This fixes https://llvm.org/PR39983.
  llvm-svn: 349202
* [Hexagon] Use IMPLICIT_DEF to any-extend 32-bit values to 64 bits | Krzysztof Parzyszek | 2018-12-14 | 1 | -23/+25
  llvm-svn: 349199
* [AMDGPU] Promote constant offset to the immediate by finding a new base with 13bit constant offset from the nearby instructions. | Farhana Aleen | 2018-12-14 | 2 | -1/+362
  Summary: Promote constant offset to immediate by recomputing the relative 13bit offset from nearby instructions.
  E.g.
    s_movk_i32 s0, 0x1800
    v_add_co_u32_e32 v0, vcc, s0, v2
    v_addc_co_u32_e32 v1, vcc, 0, v6, vcc
    s_movk_i32 s0, 0x1000
    v_add_co_u32_e32 v5, vcc, s0, v2
    v_addc_co_u32_e32 v6, vcc, 0, v6, vcc
    global_load_dwordx2 v[5:6], v[5:6], off
    global_load_dwordx2 v[0:1], v[0:1], off
  =>
    s_movk_i32 s0, 0x1000
    v_add_co_u32_e32 v5, vcc, s0, v2
    v_addc_co_u32_e32 v6, vcc, 0, v6, vcc
    global_load_dwordx2 v[5:6], v[5:6], off
    global_load_dwordx2 v[0:1], v[5:6], off offset:2048
  Author: FarhanaAleen
  Reviewed By: arsenm, rampitec
  Subscribers: llvm-commits, AMDGPU
  Differential Revision: https://reviews.llvm.org/D55539
  llvm-svn: 349196
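The arithmetic behind the example in the entry above can be sketched as follows. Both addresses are the same register plus a constant (0x1800 and 0x1000), so the second access can reuse the first base with an immediate equal to the difference. The helper names are hypothetical, and the range used for the immediate (a signed 13-bit field, -4096 to 4095) is an assumption made for this illustration rather than something stated in the log.

```cpp
#include <cassert>
#include <cstdint>

// Assumption for illustration: the instruction's immediate is a signed
// 13-bit field, so it can hold values in [-4096, 4095].
constexpr bool fits_simm13(int64_t value) {
  return value >= -4096 && value <= 4095;
}

// Offset of one access relative to another access that shares the same
// variable part of the address.
constexpr int64_t relative_offset(int64_t offset, int64_t base_offset) {
  return offset - base_offset;
}
```

Here `relative_offset(0x1800, 0x1000)` is 2048, matching the `offset:2048` in the rewritten load, whereas 0x1800 on its own would not fit the assumed immediate range.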
* [AArch64] Simplify the scheduling predicates (NFC) | Evandro Menezes | 2018-12-14 | 2 | -17/+21
  The instruction encodings make it unnecessary to distinguish extended W-form from X-form instructions.
  llvm-svn: 349185
* NFC. Adding an empty line to test the updated commit credentials. | Ehsan Amiri | 2018-12-14 | 1 | -0/+1
  llvm-svn: 349158
* [ARM GlobalISel] Thumb2: casts between int and ptr | Diana Picus | 2018-12-14 | 1 | -3/+3
  Mark as legal and add tests. Nothing special to do.
  llvm-svn: 349147
* [ARM GlobalISel] Minor refactoring. NFCI | Diana Picus | 2018-12-14 | 1 | -43/+84
  Refactor the ARMInstructionSelector to cache some opcodes in the constructor instead of checking all the time if we're in ARM or Thumb mode.
  llvm-svn: 349143
* [ARM GlobalISel] Allow simple binary ops in Thumb2 | Diana Picus | 2018-12-14 | 1 | -4/+4
  Mark G_ADD, G_SUB, G_MUL, G_AND, G_OR and G_XOR as legal for both ARM and Thumb2.
  Extract the legalizer tests for these opcodes into another file.
  Add tests for the instruction selector.
  llvm-svn: 349142
* [DAGCombiner][X86] Prevent visitSIGN_EXTEND from returning N when (sext (setcc)) already has the target desired type for the setcc | Craig Topper | 2018-12-14 | 1 | -0/+16
  Summary: If the setcc already has the target desired type we can reach the getSetCC/getSExtOrTrunc after the MatchingVecType check with the exact same types as the nodes we started with. This causes the VsetCC to be CSEd to N0 and the getSExtOrTrunc will CSE to N. When we return N, the caller will think that meant we called CombineTo and did our own worklist management. But that's not what happened. This prevents target hooks from being called for the node.
  To fix this, I've now returned SDValue if the setcc is already the desired type. But to avoid some regressions in X86 I've had to disable one of the target combines that wasn't being reached before in the case of a (sext (setcc)). If we get vector widening legalization enabled that entire function will be deleted anyway, so hopefully this is only for the short term.
  Reviewers: RKSimon, spatel
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D55459
  llvm-svn: 349137
* [X86] Demote EmitTest to a helper function of EmitCmp. Route all callers except EmitCmp through EmitCmp. | Craig Topper | 2018-12-13 | 2 | -14/+9
  This requires the two callers to manifest a 0 to make EmitCmp call EmitTest.
  I'm looking into changing how we combine TEST and flag setting instructions so that it is not part of lowering and is instead part of DAG combine or isel. That will mean EmitTest will probably become gutted and maybe disappear entirely.
  llvm-svn: 349094
* [AArch64] Fix Exynos predicates (NFC) | Evandro Menezes | 2018-12-13 | 1 | -14/+23
  Fix the logic in the definition of the `ExynosShiftExPred` as a more specific version of `ExynosShiftPred`. But, since `ExynosShiftExPred` is not used yet, this change is NFC.
  llvm-svn: 349091
* Revert r348971: [AMDGPU] Support for "uniform-work-group-size" attribute | Aakanksha Patil | 2018-12-13 | 2 | -65/+7
  This patch breaks RADV (and probably RadeonSI as well)
  llvm-svn: 349084
* AMDGPU/GlobalISel: Legalize/regbankselect block_addr | Matt Arsenault | 2018-12-13 | 2 | -1/+6
  llvm-svn: 349081
* [llvm] Address base discriminator overflow in X86DiscriminateMemOps | Mircea Trofin | 2018-12-13 | 1 | -3/+14
  Summary: Macros are expanded on a single line. In case of large expansions, with sufficiently many instructions with memory operands (and when -fdebug-info-for-profiling is requested), we may be unable to generate new base discriminator values - new values overflow (base discriminators may not be larger than 2^12).
  This CL warns instead of asserting in such a case. A subsequent CL will add APIs to check for overflow before creating new debug info.
  See https://bugs.llvm.org/show_bug.cgi?id=39890
  Reviewers: davidxl, wmi, gbedwell
  Reviewed By: davidxl
  Subscribers: aprantl, llvm-commits
  Differential Revision: https://reviews.llvm.org/D55643
  llvm-svn: 349075
* [X86][SSE] Add SSE vector imm/var shift support to SimplifyDemandedVectorEltsForTargetNode | Simon Pilgrim | 2018-12-13 | 1 | -0/+15
  llvm-svn: 349057
* [X86][SSE] Fix all remaining modulo vector rotation amounts (PR38243) | Simon Pilgrim | 2018-12-13 | 1 | -9/+6
  There are still a couple of minor SimplifyDemandedElts regressions in some of the shift amount splats that will be fixed in future patches.
  llvm-svn: 349052
* [Sparc] Add membar assembler tags | Daniel Cederman | 2018-12-13 | 4 | -1/+91
  Summary: The Sparc V9 membar instruction can enforce different types of memory orderings depending on the value in its immediate field. In the architectural manual the type is selected by combining different assembler tags into a mask. This patch adds support for these tags.
  Reviewers: jyknight, venkatra, brad
  Reviewed By: jyknight
  Subscribers: fedor.sergeev, jrtc27, jfb, llvm-commits
  Differential Revision: https://reviews.llvm.org/D53491
  llvm-svn: 349048
* [X86][SSE] Fix modulo rotation amounts for v8i16/v16i16/v4i32 (PR38243) | Simon Pilgrim | 2018-12-13 | 1 | -2/+5
  llvm-svn: 349047
* [Sparc] Use float register for integer constrained with "f" in inline asm | Daniel Cederman | 2018-12-13 | 1 | -8/+8
  Summary: Constraining an integer value to a floating point register using "f" causes an llvm_unreachable to trigger. This patch allows i32 integers to be placed in a single precision float register and i64 integers to be placed in a double precision float register. This matches the behavior of GCC.
  For other types the llvm_unreachable is removed to instead trigger an error message that points out the offending line.
  Reviewers: jyknight, venkatra
  Reviewed By: jyknight
  Subscribers: eraman, fedor.sergeev, jrtc27, llvm-commits
  Differential Revision: https://reviews.llvm.org/D51614
  llvm-svn: 349045
* [PowerPC][NFC] Sorting out Pseudo related classes to avoid confusion | Jinsong Ji | 2018-12-13 | 7 | -351/+345
  There are several kinds of Pseudo in the PowerPC backend, e.g.:
    * ISel pseudo-instructions, which have let usesCustomInserter=1 in td. ExpandISelPseudos -> EmitInstrWithCustomInserter will deal with them.
    * Post-RA pseudo instructions, which have let isPseudo = 1 in td, or standard pseudos (SUBREG_TO_REG, COPY etc.). ExpandPostRAPseudos -> expandPostRAPseudo will expand them.
    * Multi-instruction pseudo operations. PPCAsmPrinter::EmitInstruction will expand them.
    * Pseudo instructions in CodeEmitter, which have an encoding of 0.
  Currently, in td files, especially PPCInstrVSX.td, we did not distinguish Post-RA pseudo instructions and Pseudo instructions in CodeEmitter very clearly.
  This patch is to:
    * Rename the Pseudo<> class to PPCEmitTimePseudo, which means an encoding of 0 in CodeEmitter
    * Introduce a new class PPCPostRAExpPseudo<> for the previous PostRA Pseudo
    * Introduce a new class PPCCustomInserterPseudo<> for the previous ISel Pseudo
  Differential Revision: https://reviews.llvm.org/D55143
  llvm-svn: 349044
* [X86][SSE] Merge the vXi16/vXi32 vector rotation expansion cases. NFCI. | Simon Pilgrim | 2018-12-13 | 1 | -13/+3
  Merged the repeated code into a single if().
  llvm-svn: 349040
* [SystemZ] Pass copy-hinted regs first from getRegAllocationHints(). | Jonas Paulsson | 2018-12-13 | 1 | -3/+16
  When computing register allocation hints for a GRX32Bit register, make sure that any of the hinted registers that are also copy hints are returned first in the list.
  Review: Ulrich Weigand.
  llvm-svn: 349037
* [X86][BWI] Don't custom lower vXi8 rotations. | Simon Pilgrim | 2018-12-13 | 1 | -18/+14
  We always expand to shifts anyhow - the test changes are only different scheduling.
  llvm-svn: 349034
* [PowerPC] intrinsic llvm.eh.sjlj.setjmp should not have flag isBarrier. | Chen Zheng | 2018-12-13 | 2 | -2/+12
  Differential Revision: https://reviews.llvm.org/D55499
  llvm-svn: 349029
* [DAGCombine] Moved X86 rotate_amount % bitwidth == 0 early out to DAGCombiner | Simon Pilgrim | 2018-12-13 | 1 | -8/+1
  Remove common code from custom lowering (code is still safe if somehow a zero value gets used).
  llvm-svn: 349028
* [ARM GlobalISel] Support exts and truncs for Thumb2 | Diana Picus | 2018-12-13 | 2 | -15/+18
  Mark G_SEXT, G_ZEXT and G_ANYEXT to 32 bits as legal and add support for them in the instruction selector. This uses handwritten code again because the patterns that are generated with TableGen are tuned for what the DAG combiner would produce and not for simple sext/zext nodes. Luckily, we only need to update the opcodes to use the Thumb2 variants, everything else can be reused from ARM.
  llvm-svn: 349026
* [TargetLowering] Add ISD::ROTL/ROTR vector expansion | Simon Pilgrim | 2018-12-13 | 1 | -8/+2
  Move existing rotation expansion code into TargetLowering and set it up for vectors as well.
  Ideally this would share more of the funnel shift expansion, but we handle the shift amount modulo quite differently at the moment.
  Began removing x86 vector rotate custom lowering to use the expansion.
  llvm-svn: 349025
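Several entries in this part of the log deal with rotation amounts taken modulo the bit width. A minimal scalar sketch of a rotate expanded into two shifts, with the amount handled modulo 32, is shown below. The names `rotl32` and `rotr32` are hypothetical and the code models the arithmetic only; it is not the TargetLowering expansion itself.

```cpp
#include <cassert>
#include <cstdint>

// Rotate left with the amount taken modulo 32. Masking both shift amounts
// keeps each shift in [0, 31], so an amount that is a multiple of 32
// leaves the value unchanged instead of shifting by the full width.
constexpr uint32_t rotl32(uint32_t x, uint32_t amt) {
  return (x << (amt & 31u)) | (x >> ((0u - amt) & 31u));
}

// Rotate right, expressed the same way with the shift directions swapped.
constexpr uint32_t rotr32(uint32_t x, uint32_t amt) {
  return (x >> (amt & 31u)) | (x << ((0u - amt) & 31u));
}
```

Rotating by 36 gives the same result as rotating by 4, and rotating by 0 or 32 returns the input, which is the case the "rotate_amount % bitwidth == 0" early out is concerned with.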
* [RISCV] Add support for the various RISC-V FMA instruction variants | Alex Bradbury | 2018-12-13 | 3 | -3/+33
  Adds support for the various RISC-V FMA instructions (fmadd, fmsub, fnmsub, fnmadd).
  The criteria for choosing whether a fused add or subtract is used, as well as whether the product is negated or not, is whether some of the arguments to the llvm.fma.* intrinsic are negated or not. In the tests, extraneous fadd instructions were added to avoid the negation being performed using a xor trick, which prevented the proper FMA forms from being selected and thus tested.
  The FMA instruction patterns might seem incorrect (e.g., fnmadd: -rs1 * rs2 - rs3), but they should be correct. The misleading names were inherited from MIPS, where the negation happens after computing the sum.
  The llvm.fmuladd.* intrinsics still do not generate RISC-V FMA instructions, as that depends on TargetLowering::isFMAFasterThanFMulAndFAdd.
  Some comments in the test files about what type of instructions are tested there were updated, to better reflect the current content of those test files.
  Differential Revision: https://reviews.llvm.org/D54205
  Patch by Luís Marques.
  llvm-svn: 349023
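The four variants named in the entry above differ only in which operands are negated. The sketch below expresses each one through `std::fma`; the wrapper functions are hypothetical illustrations of the arithmetic as described in the commit message (for example fnmadd as -rs1 * rs2 - rs3), not the instruction selection patterns.

```cpp
#include <cassert>
#include <cmath>

// Each variant is a single fused multiply-add with some operands negated.
double fmadd(double rs1, double rs2, double rs3) {
  return std::fma(rs1, rs2, rs3);    //  (rs1 * rs2) + rs3
}
double fmsub(double rs1, double rs2, double rs3) {
  return std::fma(rs1, rs2, -rs3);   //  (rs1 * rs2) - rs3
}
double fnmsub(double rs1, double rs2, double rs3) {
  return std::fma(-rs1, rs2, rs3);   // -(rs1 * rs2) + rs3
}
double fnmadd(double rs1, double rs2, double rs3) {
  return std::fma(-rs1, rs2, -rs3);  // -(rs1 * rs2) - rs3
}
```

With inputs 2, 3 and 4 the four functions give 10, 2, -2 and -10 respectively, which makes the sign conventions easy to compare at a glance.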
* [AArch64] Catch some more CMN opportunities. | Arnaud A. de Grandmaison | 2018-12-13 | 1 | -0/+5
  Fixes https://bugs.llvm.org/show_bug.cgi?id=33486
  llvm-svn: 349022
* AMDGPU/GlobalISel: Legalize f64 fadd/fmul | Matt Arsenault | 2018-12-13 | 1 | -3/+3
  llvm-svn: 349014
* AMDGPU/GlobalISel: RegBankSelect some simple operations | Matt Arsenault | 2018-12-13 | 2 | -2/+29
  llvm-svn: 349012
* [X86] Remove assert leftover from when i1 was a legal type. Add more accurate assert. NFC | Craig Topper | 2018-12-13 | 1 | -3/+1
  llvm-svn: 349007
* [AMDGPU] Fix build failure, second attempt | Stanislav Mekhanoshin | 2018-12-13 | 1 | -1/+1
  Some compilers complain that the variable is captured and some complain when it is not. Switch to [&].
  llvm-svn: 349006
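The capture problem described in the entry above can be sketched as follows. When a lambda only reads a local constant, one compiler may warn that an explicit capture is unnecessary while another may insist on it; a default by-reference capture sidesteps both complaints. The function and variable names here are hypothetical (`cond_reg` merely echoes the `CondReg` mentioned in the neighbouring entry), and the exact diagnostics depend on the compiler.

```cpp
#include <cassert>

// Hypothetical example: a lambda that reads a local constant.
bool matches_cond(int value) {
  const int cond_reg = 7;
  // [&] leaves it to the compiler to capture only what the body needs,
  // so there is no explicit capture to be reported as unnecessary and
  // no missing capture to be reported as an error.
  auto is_cond = [&](int v) { return v == cond_reg; };
  return is_cond(value);
}
```

The trade-off is that `[&]` is less explicit about what the lambda depends on, which is usually acceptable for a short lambda that does not outlive its enclosing scope.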
* [AMDGPU] Fix build failure | Stanislav Mekhanoshin | 2018-12-13 | 1 | -1/+1
  Fixed error 'lambda capture 'CondReg' is not required to be captured for this use'.
  llvm-svn: 349005
* [AMDGPU] Simplify negated condition | Stanislav Mekhanoshin | 2018-12-13 | 3 | -0/+187
  Optimize sequence:
    %sel = V_CNDMASK_B32_e64 0, 1, %cc
    %cmp = V_CMP_NE_U32 1, %1
    $vcc = S_AND_B64 $exec, %cmp
    S_CBRANCH_VCC[N]Z
  =>
    $vcc = S_ANDN2_B64 $exec, %cc
    S_CBRANCH_VCC[N]Z
  It is the negation pattern inserted by DAGCombiner::visitBRCOND() in the rebuildSetCC().
  Differential Revision: https://reviews.llvm.org/D55402
  llvm-svn: 349003
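The equivalence behind the rewrite in the entry above can be checked with a small model in which each 64-bit mask holds one bit per lane. The sketch assumes that the select yields 1 for lanes where the condition bit is set and 0 otherwise, and that the compare operates on that select result; the function names are hypothetical and the model ignores everything except the mask arithmetic.

```cpp
#include <cassert>
#include <cstdint>

// Lane-by-lane model of the original sequence.
uint64_t vcc_original(uint64_t exec, uint64_t cc) {
  uint64_t cmp = 0;
  for (int lane = 0; lane < 64; ++lane) {
    // Select: 1 where the condition bit is set, 0 otherwise.
    uint32_t sel = ((cc >> lane) & 1u) ? 1u : 0u;
    // Compare not-equal against 1: true exactly where the condition was false.
    if (sel != 1u) cmp |= uint64_t{1} << lane;
  }
  return exec & cmp;  // AND with the exec mask
}

// The simplified form: AND exec with the complement of the condition.
uint64_t vcc_simplified(uint64_t exec, uint64_t cc) {
  return exec & ~cc;
}
```

Both functions should return the same mask for any pair of inputs, since selecting 1 on true and then testing for "not 1" is just a per-lane negation of the condition.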