summaryrefslogtreecommitdiffstats
path: root/llvm/lib
Commit message (Collapse)AuthorAgeFilesLines
...
* [ARM] Add support for MVE vmaxv and vminvSam Tebbs2019-09-133-2/+35
| | | | | | | | This patch adds vecreduce_smax, vecredude_umax, vecreduce_smin, vecreduce_umin and selection for vmaxv and minv. Differential Revision: https://reviews.llvm.org/D66413 llvm-svn: 371827
* [BasicBlockUtils] Add optional BBName argument, in line with BB:splitBasicBlockFlorian Hahn2019-09-132-4/+6
| | | | | | | | | | Reviewers: spatel, asbirlea, craig.topper Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D67521 llvm-svn: 371819
* [AArch64] MachineCombiner FMA matching. NFC.Sjoerd Meijer2019-09-131-325/+112
| | | | | | | | | Follow-up of rL371321 that added some more FP16 FMA patterns, and an attempt to reduce the copy-pasting and make this more readable. Differential Revision: https://reviews.llvm.org/D67403 llvm-svn: 371818
* [TargetRegisterInfo] Remove SVT argument from getCommonSubClass.Craig Topper2019-09-132-14/+6
| | | | | | | | This was added to support fp128 on x86-64, but appears to be unneeded now. This may be because the FR128 register class added back then was merged with the VR128 register class later. llvm-svn: 371815
* AMDGPU/GlobalISel: Fix assert on multi-return side effect intrinsicsMatt Arsenault2019-09-131-1/+1
| | | | | | llvm.amdgcn.else hits this. llvm-svn: 371812
* AMDGPU/GlobalISel: Legalize s32->s16 G_SITOFP/G_UITOFPMatt Arsenault2019-09-131-1/+1
| | | | llvm-svn: 371811
* [RISCV] Support stack offset exceed 32-bit for RV64Shiva Chen2019-09-134-27/+45
| | | | | | Differential Revision: https://reviews.llvm.org/D61884 llvm-svn: 371810
* Revert "[RISCV] Support stack offset exceed 32-bit for RV64"Shiva Chen2019-09-134-44/+27
| | | | | | This reverts commit 1c340c62058d4115d21e5fa1ce3a0d094d28c792. llvm-svn: 371809
* AMDGPU/GlobalISel: Fix RegBankSelect for amdgcn.elseMatt Arsenault2019-09-131-0/+7
| | | | llvm-svn: 371808
* AMDGPU/GlobalISel: Select 16-bit VALU bit opsMatt Arsenault2019-09-131-3/+3
| | | | llvm-svn: 371807
* [RISCV] Support stack offset exceed 32-bit for RV64Shiva Chen2019-09-134-27/+44
| | | | | | Differential Revision: https://reviews.llvm.org/D61884 llvm-svn: 371806
* AMDGPU/GlobalISel: Legalize G_FFLOORMatt Arsenault2019-09-132-2/+3
| | | | llvm-svn: 371803
* Temporarily revert r371640 "LiveIntervals: Split live intervals on multiple ↵Tim Shen2019-09-131-11/+1
| | | | | | | | dead defs". It reveals a miscompile on Hexagon. See PR43302 for details. llvm-svn: 371802
* AMDGPU/GlobalISel: Legalize G_FMADMatt Arsenault2019-09-134-0/+51
| | | | | | | | | | | | | | | Unlike SelectionDAG, treat this as a normally legalizable operation. In SelectionDAG this is supposed to only ever formed if it's legal, but I've found that to be restricting. For AMDGPU this is contextually legal depending on whether denormal flushing is allowed in the use function. Technically we currently treat the denormal mode as a subtarget feature, so custom lowering could be avoided. However I consider this to be a defect, and this should be contextually dependent on the controllable rounding mode of the parent function. llvm-svn: 371800
* AMDGPU/GlobalISel: Select G_CTPOPMatt Arsenault2019-09-134-2/+15
| | | | llvm-svn: 371798
* DAG/GlobalISel: Correct type profile of bitcount opsMatt Arsenault2019-09-137-21/+21
| | | | | | | | The result integer does not need to be the same width as the input. AMDGPU, NVPTX, and Hexagon all have patterns working around the types matching. GlobalISel defines these as being different type indexes. llvm-svn: 371797
* LiveIntervals: Remove assertionMatt Arsenault2019-09-121-1/+2
| | | | | | | | | | | | | This testcase is invalid, and caught by the verifier. For the verifier to catch it, the live interval computation needs to complete. Remove the assert so the verifier catches this, which is less confusing. In this testcase there is an undefined use of a subregister, and lanes which aren't used or defined. An equivalent testcase with the super-register shrunk to have no untouched lanes already hit this verifier error. llvm-svn: 371792
* AMDGPU: Inline constant when materalizing FI with add on gfx9Matt Arsenault2019-09-122-3/+6
| | | | | | | | | This was relying on the SGPR usable for the carry out clobber to also be used for the input. There was no carry out on gfx9. With no carry out clobber to worry about, so the literal can just be directly used with a VOP2 add. llvm-svn: 371791
* Rename nonvolatile_load/store to simple_load/store [NFC]Philip Reames2019-09-126-31/+31
| | | | | | Implement the TODO from D66318. llvm-svn: 371789
* [AArch64][GlobalISel] Support tail calling with swiftself parametersJessica Paquette2019-09-121-5/+32
| | | | | | | | | | | | | | | | Swiftself uses a callee-saved register. We can tail call when the register used in the caller and callee is the same. This behaviour is equivalent to that in `TargetLowering::parametersInCSRMatch`. Update call-translator-tail-call.ll to verify that we can do this. When we support inline assembly, we can write a check similar to the one in the general swiftself.ll. For now, we need to verify that we get the correct COPY instruction after call lowering. Differential Revision: https://reviews.llvm.org/D67511 llvm-svn: 371788
* [SDAG] Update generic code to conservatively check for isAtomic in addition ↵Philip Reames2019-09-124-49/+80
| | | | | | | | | | to isVolatile This is the first sweep of generic code to add isAtomic bailouts where appropriate. The intention here is to have the switch from AtomicSDNode to LoadSDNode/StoreSDNode be close to NFC; that is, I'm not looking to allow additional optimizations at this time. That will come later. See D66309 for context. Differential Revision: https://reviews.llvm.org/D66318 llvm-svn: 371786
* [AArch64][GlobalISel] Support sibling calls with outgoing argumentsJessica Paquette2019-09-123-25/+105
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This adds support for lowering sibling calls with outgoing arguments. e.g ``` define void @foo(i32 %a) ``` Support is ported from AArch64ISelLowering's `isEligibleForTailCallOptimization`. The only thing that is missing is a full port of `TargetLowering::parametersInCSRMatch`. So, if we're using swiftself, we'll never tail call. - Rename `analyzeCallResult` to `analyzeArgInfo`, since the function is now used for both outgoing and incoming arguments - Teach `OutgoingArgHandler` about tail calls. Tail calls use frame indices for stack arguments. - Teach `lowerFormalArguments` to set the bytes in the caller's stack argument area. This is used later to check if the tail call's parameters will fit on the caller's stack. - Add `areCalleeOutgoingArgsTailCallable` to perform the eligibility check on the callee's outgoing arguments. For testing: - Update call-translator-tail-call to verify that we can now tail call with outgoing arguments, use G_FRAME_INDEX for stack arguments, and respect the size of the caller's stack - Remove GISel-specific check lines from speculation-hardening.ll, since GISel now tail calls like the other selectors - Add a GISel test line to tailcall-string-rvo.ll since we can tail call in that test now - Add a GISel test line to tailcall_misched_graph.ll since we tail call there now. Add specific check lines for GISel, since the debug output from the machine-scheduler differs with GlobalISel. The dependency still holds, but the output comes out in a different order. Differential Revision: https://reviews.llvm.org/D67471 llvm-svn: 371780
* [PowerPC] Remove the SPE4RC register class and instead add f32 to the GPRC ↵Craig Topper2019-09-128-47/+26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | register class. Summary: Since the SPE4RC register class contains an identical set of registers and an identical spill size to the GPRC class its slightly confusing the tablegen emitter. It's preventing the GPRC_and_GPRC_NOR0 synthesized register class from inheriting VTs and AltOrders from GPRC or GPRC_NOR0. This is because SPE4C is found first in the super register class list when inheriting these properties and it doesn't set the VTs or AltOrders the same way as GPRC or GPRC_NOR0. This patch replaces all uses of GPE4RC with GPRC and allows GPRC and GPRC_NOR0 to contain f32. The test changes here are because the AltOrders are being inherited to GPRC_NOR0 now. Found while trying to determine if getCommonSubClass needs to take a VT argument. It was originally added to support fp128 on x86-64, I've changed some things about that so that it might be needed anymore. But a PowerPC test crashed without it and I think its due to this subclass issue. Reviewers: jhibbits, nemanjai, kbarton, hfinkel Subscribers: wuzish, nemanjai, mehdi_amini, hiraditya, kbarton, MaskRay, dexonsmith, jsji, shchenz, steven.zhang, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67513 llvm-svn: 371779
* [SCEV] Add smin support to getRangeRefPhilip Reames2019-09-121-0/+8
| | | | | | | | We were failing to compute trip counts (both exact and maximum) for any loop which involved a comparison against either an umin or smin. It looks like this simply got missed when we added smin/umin to SCEV. (Note: umin was submitted separately earlier today. Turned out two folks hit this at the same time.) Differential Revision: https://reviews.llvm.org/D67514 llvm-svn: 371776
* [DAGCombiner][X86] Pass the CmpOpVT to reduceSelectOfFPConstantLoads so X86 ↵Craig Topper2019-09-123-3/+4
| | | | | | | | | | | can exclude fp128 compares. The X86 decision assumes the compare will produce a result in an XMM register, but that can't happen for an fp128 compare since those go to a libcall the returns an i32. Pass the VT so X86 can check the type. llvm-svn: 371775
* [ConstantFolding] Expand folding of some library functionsEvandro Menezes2019-09-121-3/+22
| | | | | | | | | Expanding the folding of `nearbyint()`, `rint()` and `trunc()` to library functions, in addition to the current support for intrinsics. Differential revision: https://reviews.llvm.org/D67468 llvm-svn: 371774
* [SelectionDAGBuilder] Simplify loop in visitSelect back to how it was before ↵Craig Topper2019-09-121-2/+1
| | | | | | | | | | | | | r255558. This code was changed to accomodate fp128 being softened to itself during type legalization on x86-64. This was done in order to create libcalls while having fp128 as a legal type. We're now doing the libcall creation during LegalizeDAG and the type legalization changes to enable the old behavior have been removed. So this change to SelectionDAGBuilder is no longer needed. llvm-svn: 371771
* [X86] Move negateFMAOpcode helper earlier to help future patch. NFCI.Simon Pilgrim2019-09-121-32/+32
| | | | llvm-svn: 371770
* [SCEV] Support SCEVUMinExpr in getRangeRef.Florian Hahn2019-09-121-0/+8
| | | | | | | | | | | | | This patch adds support for SCEVUMinExpr to getRangeRef, similar to the support for SCEVUMaxExpr. Reviewers: sanjoy.google, efriedma, reames, nikic Reviewed By: sanjoy.google Differential Revision: https://reviews.llvm.org/D67177 llvm-svn: 371768
* AMDGPU: Fix bug in r371671 on some builds.Austin Kerbow2019-09-121-2/+5
| | | | llvm-svn: 371761
* [LICM/AST] Check if the AliasAny set is removed from the tracker.Alina Sbirlea2019-09-121-2/+10
| | | | | | | | | | | | | | | | Summary: Resolves PR38513. Credit to @bjope for debugging this. Reviewers: hfinkel, uabelho, bjope Subscribers: sanjoy.google, bjope, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67417 llvm-svn: 371752
* [MemorySSA] Pass (for update) MSSAU when hoisting instructions.Alina Sbirlea2019-09-121-19/+28
| | | | | | | | | | | | | | Summary: Pass MSSAU to makeLoopInvariant in order to properly update MSSA. Reviewers: george.burgess.iv Subscribers: Prazek, sanjoy.google, uabelho, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67470 llvm-svn: 371748
* [LV] Support invariant addresses in speculation logicPhilip Reames2019-09-121-10/+18
| | | | | | | | | | Implement a TODO from rL371452, and handle loop invariant addresses in predicated blocks. If we can prove that the load is safe to speculate into the header, then we can avoid using a masked.load in favour of a normal load. This is mostly about vectorization robustness. In the common case, it's generally expected that LICM/LoadStorePromotion would have eliminated such loads entirely. Differential Revision: https://reviews.llvm.org/D67372 llvm-svn: 371745
* [CGP] Ensure sinking multiple instructions does not invalidate dominance checksDavid Green2019-09-121-9/+23
| | | | | | | | | | | | | | | | | | In MVE, as of rL371218, we are attempting to sink chains of instructions such as: %l1 = insertelement <8 x i8> undef, i8 %l0, i32 0 %broadcast.splat26 = shufflevector <8 x i8> %l1, <8 x i8> undef, <8 x i32> zeroinitializer In certain situations though, we can end up breaking the dominance relations of instructions. This happens when we sink the instruction into a loop, but cannot remove the originals. The Use is updated, which might in fact be a Use from the second instruction to the first. This attempts to fix that by reversing the order of instruction that are sunk, and ensuring that we update the uses on new instructions if they have already been sunk, not the old ones. Differential Revision: https://reviews.llvm.org/D67366 llvm-svn: 371743
* [Alignment] Move OffsetToAlignment to Alignment.hGuillaume Chatelet2019-09-1213-41/+54
| | | | | | | | | | | | | | | | | Summary: This is patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet, JDevlieghere, alexshap, rupprecht, jhenderson Subscribers: sdardis, nemanjai, hiraditya, kbarton, jakehehrlich, jrtc27, MaskRay, atanasyan, jsji, seiya, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D67499 llvm-svn: 371742
* [ConstProp] allow folding for fma that produces NaNSanjay Patel2019-09-121-7/+3
| | | | | | | | | | | | | | | | | | | | Folding for fma/fmuladd was added here: rL202914 ...and as seen in existing/unchanged tests, that works to propagate NaN if it's already an input, but we should fold an fma() that creates NaN too. From IEEE-754-2008 7.2 "Invalid Operation", there are 2 clauses that apply to fma, so I added tests for those patterns: c) fusedMultiplyAdd: fusedMultiplyAdd(0, ∞, c) or fusedMultiplyAdd(∞, 0, c) unless c is a quiet NaN; if c is a quiet NaN then it is implementation defined whether the invalid operation exception is signaled d) addition or subtraction or fusedMultiplyAdd: magnitude subtraction of infinities, such as: addition(+∞, −∞) Differential Revision: https://reviews.llvm.org/D67446 llvm-svn: 371735
* [MIPS GlobalISel] Select indirect branchPetar Avramovic2019-09-123-0/+9
| | | | | | | | Select G_BRINDIRECT for MIPS32. Differential Revision: https://reviews.llvm.org/D67441 llvm-svn: 371730
* [MIPS GlobalISel] Lower G_DYN_STACKALLOCPetar Avramovic2019-09-121-0/+3
| | | | | | | | | | IRTranslator creates G_DYN_STACKALLOC instruction during expansion of alloca when argument that tells number of elements to allocate on stack is a virtual register. Use default lowering for MIPS32. Differential Revision: https://reviews.llvm.org/D67440 llvm-svn: 371728
* [MIPS GlobalISel] Select G_IMPLICIT_DEFPetar Avramovic2019-09-123-1/+39
| | | | | | | | | | G_IMPLICIT_DEF is used for both integer and floating point implicit-def. Handle G_IMPLICIT_DEF as ambiguous opcode in MipsRegisterBankInfo. Select G_IMPLICIT_DEF for MIPS32. Differential Revision: https://reviews.llvm.org/D67439 llvm-svn: 371727
* [DAGCombine] visitFDIV - Use isCheaperToUseNegatedFPOps helper for (fdiv ↵Simon Pilgrim2019-09-121-15/+5
| | | | | | | | (fneg X), (fneg Y)) -> (fdiv X, Y). NFCI. Minor cleanup to use equivalent helper code. llvm-svn: 371724
* AArch64: support arm64_32, an ILP32 slice for watchOS.Tim Northover2019-09-1229-82/+379
| | | | | | | | This is the main CodeGen patch to support the arm64_32 watchOS ABI in LLVM. FastISel is mostly disabled for now since it would generate incorrect code for ILP32. llvm-svn: 371722
* CodeGenPrep: add separate hook say when GEPs should be used for sinking. NFCI.Tim Northover2019-09-121-2/+2
| | | | | | | | | Up to now, we've decided whether to sink address calculations using GEPs or normal arithmetic based on the useAA hook, but there are other reasons GEPs might be preferred. So this patch splits the two questions, with a default implementation falling back to useAA. llvm-svn: 371721
* [InstSimplify] simplifyUnsignedRangeCheck(): handle more cases (PR43251)Roman Lebedev2019-09-121-0/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: I don't have a direct motivational case for this, but it would be good to have this for completeness/symmetry. This pattern is basically the motivational pattern from https://bugs.llvm.org/show_bug.cgi?id=43251 but with different predicate that requires that the offset is non-zero. The completeness bit comes from the fact that a similar pattern (offset != zero) will be needed for https://bugs.llvm.org/show_bug.cgi?id=43259, so it'd seem to be good to not overlook very similar patterns.. Proofs: https://rise4fun.com/Alive/21b Also, there is something odd with `isKnownNonZero()`, if the non-zero knowledge was specified as an assumption, it didn't pick it up (PR43267) Reviewers: spatel, nikic, xbolva00 Reviewed By: spatel Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67411 llvm-svn: 371718
* [DAGCombiner] Improve division estimation of floating points.Qiu Chaofan2019-09-121-11/+33
| | | | | | | | | | | | | Current implementation of estimating divisions loses precision since it estimates reciprocal first and does multiplication. This patch is to re-order arithmetic operations in the last iteration in DAGCombiner to improve the accuracy. Reviewed By: Sanjay Patel, Jinsong Ji Differential Revision: https://reviews.llvm.org/D66050 llvm-svn: 371713
* [LegalizeTypes] Remove code for softening a float type to itself.Craig Topper2019-09-125-266/+58
| | | | | | | | | This was previously used to turn fp128 operations into libcalls on X86. This is now done through op legalization after r371672. This restores much of this code to before r254653. llvm-svn: 371709
* Make SwitchInstProfUpdateWrapper strict permanentlyYevgeny Rouban2019-09-121-29/+12
| | | | | | | | | | | | | | | | | | | We have been using -switch-inst-prof-update-wrapper-strict set to true by default for some time. It is time to remove the safety stuff and make SwitchInstProfUpdateWrapper intolerant to inconsistencies in !prof branch_weights metadata of SwitchInst. This patch gets rid of the Invalid state of SwitchInstProfUpdateWrapper and the option -switch-inst-prof-update-wrapper-strict. So there is only two states: changed and unchanged. Reviewers: davidx, nikic, eraman, reames, chandlerc Reviewed By: davidx Differential Revision: https://reviews.llvm.org/D67435 llvm-svn: 371707
* [X86] Enable -mprefer-vector-width=256 by default for Skylake-avx512 and ↵Craig Topper2019-09-111-0/+2
| | | | | | | | | | | | | | | | later Intel CPUs. AVX512 instructions can cause a frequency drop on these CPUs. This can negate the performance gains from using wider vectors. Enabling prefer-vector-width=256 will prevent generation of zmm registers unless explicit 512 bit operations are used in the original source code. I believe gcc and icc both do something similar to this by default. Differential Revision: https://reviews.llvm.org/D67259 llvm-svn: 371694
* [AArch64][GlobalISel] Fall back on attempts to allocate split types on the ↵Amara Emerson2019-09-111-6/+14
| | | | | | | | | | | | | | | stack. First we were asserting that the ValNo of a VA was the wrong value. It doesn't actually make a difference for us in CallLowering but fix that anyway to silence the assert. The bigger issue was that after fixing the assert we were generating invalid MIR because the merging/unmerging of values split across multiple registers wasn't also implemented for memory locs. This happens when we run out of registers and have to pass the split types like i128 -> i64 x 2 on the stack. This is do-able, but for now just fall back. llvm-svn: 371693
* [GlobalISel][AArch64] Check caller for swifterror params in tailcall eligibilityJessica Paquette2019-09-111-3/+7
| | | | | | | | | | | | | | | | Before, we only checked the callee for swifterror. However, we should also be checking the caller to see if it has a swifterror parameter. Since we don't currently handle outgoing arguments, this didn't show up in the swifterror.ll testcase. Also, remove the swifterror checks from call-translator-tail-call.ll, since they are covered by the existing swifterror testing. Better to have it all in one place. Differential Revision: https://reviews.llvm.org/D67465 llvm-svn: 371692
* [TableGen] Skip CRLF conversion when writing outputReid Kleckner2019-09-111-2/+2
| | | | | | | | | Doing the CRLF translation while writing the file defeats our optimization to not update the file if it hasn't changed. Fixes PR43271. llvm-svn: 371683
OpenPOWER on IntegriCloud