path: root/llvm/lib
Columns: commit message, author, age, files, lines
...
* [GlobalISel][AArch64] Add support for @llvm.ceilJessica Paquette2018-12-192-0/+9
| | | | | | | | | | | | This adds a G_FCEIL generic instruction and uses it in AArch64. This adds selection for floating point ceil where it has a supported, dedicated instruction. Other cases aren't handled here. It updates the relevant gisel tests and adds a select-ceil test. It also adds a check to arm64-vcvt.ll which ensures that we don't fall back when we run into one of the relevant cases. llvm-svn: 349664
* [X86] Don't match TESTrr from (cmp (and X, Y), 0) during isel. Defer to post ↵Craig Topper2018-12-192-6/+30
| | | | | | | | | | | | processing The (cmp (and X, Y) 0) pattern is greedy and ends up forming a TESTrr and consuming the and when it might be better to use one of the BMI/TBM instructions like BLSR or BLSI. This patch removes the pattern from isel and adds a post processing check to combine TESTrr+ANDrr into just a TESTrr. With this patch we are able to select the BMI/TBM instructions, but we'll also emit a TESTrr when the result is compared to 0. In many cases the peephole pass will be able to use optimizeCompareInstr to remove the TEST, but it's probably not perfect. Differential Revision: https://reviews.llvm.org/D55870 llvm-svn: 349661
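For context on the instructions named in this message, here is a minimal Python sketch of what BLSR and BLSI compute on 32-bit values. The helper names (`blsr`, `blsi`, `has_at_most_one_bit`) are hypothetical and only illustrate the bit arithmetic; they are not taken from the patch.

```python
MASK32 = 0xFFFFFFFF  # model 32-bit registers

def blsr(x: int) -> int:
    """Reset the lowest set bit: x & (x - 1)."""
    return x & (x - 1) & MASK32

def blsi(x: int) -> int:
    """Isolate the lowest set bit: x & -x."""
    return x & -x & MASK32

def has_at_most_one_bit(x: int) -> bool:
    # Shape of (cmp (and x, x - 1), 0): the 'and' feeding the compare
    # is itself a BLSR, which a greedy TEST match would have consumed.
    return blsr(x) == 0
```

For example, `blsr(12)` gives 8 and `blsi(12)` gives 4.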
* [X86] Fix assert fails in pass X86AvoidSFBPassCraig Topper2018-12-191-13/+14
| | | | | | | | | | | | | | | Fixes https://bugs.llvm.org/show_bug.cgi?id=38743 The function removeRedundantBlockingStores is supposed to remove any blocking stores contained in each other in lockingStoresDispSizeMap. But it currently looks only at the previous one, which misses some cases and results in an assertion failure. This patch refines the function to check all previous layouts until it finds the uncontained one, so all redundant stores will be removed. Patch by Pengfei Wang Differential Revision: https://reviews.llvm.org/D55642 llvm-svn: 349660
* [AArch64] Improve the Exynos M3 pipeline modelEvandro Menezes2018-12-191-4/+4
| | | | llvm-svn: 349652
* Test commitAnton Afanasyev2018-12-191-4/+4
| | | | | | Fix typos. llvm-svn: 349644
* [ValueTracking] remove unused parameters from helper functions; NFCSanjay Patel2018-12-191-23/+16
| | | | llvm-svn: 349641
* [BPF] Generate BTF DebugInfo under BPF targetYonghong Song2018-12-196-0/+1301
| | | | | | | | | | | | | | | | | | | | | | | | | | | This patch implements BTF (BPF Type Format). BTF is the debug info format for BPF, introduced in the Linux patch below: https://github.com/torvalds/linux/commit/69b693f0aefa0ed521e8bd02260523b5ae446ad7#diff-06fb1c8825f653d7e539058b72c83332 and further extended several times, e.g., https://www.spinics.net/lists/netdev/msg534640.html https://www.spinics.net/lists/netdev/msg538464.html https://www.spinics.net/lists/netdev/msg540246.html The main advantages of implementing it in LLVM are: . better integration/deployment as no extra tools are needed. . bpf JIT based compilation (like bcc, bpftrace, etc.) can get BTF without much extra effort. . BTF line_info needs selective source code, which can be easily retrieved when inside the compiler. This patch implements BTF generation by registering a BPF specific DebugHandler in BPFAsmPrinter. Signed-off-by: Yonghong Song <yhs@fb.com> Differential Revision: https://reviews.llvm.org/D55752 llvm-svn: 349640
* [Object] Deduplicate long archive member namesPeter Wu2018-12-191-4/+22
| | | | | | | | | | | | | Summary: Import libraries as created by llvm-dlltool always use the same archive member name for every object file (namely, the DLL library name). Ensure that long names are not repeatedly stored in the string table. Reviewed By: ruiu Differential Revision: https://reviews.llvm.org/D55860 llvm-svn: 349637
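The deduplication idea described here can be sketched as a string table that remembers the offset of each name it has already stored. This is a toy model, not the LLVM implementation: the class name `StringTable` is hypothetical, and the `/\n` terminator is an assumption based on the GNU archive long-name convention.

```python
class StringTable:
    """Toy long-name string table: each distinct name is stored once."""

    def __init__(self) -> None:
        self._data = bytearray()
        self._offsets = {}  # name -> offset of its first occurrence

    def add(self, name: str) -> int:
        """Return the offset of `name`, appending it only if it is new."""
        if name in self._offsets:
            return self._offsets[name]
        offset = len(self._data)
        self._data += name.encode() + b"/\n"  # assumed GNU-style terminator
        self._offsets[name] = offset
        return offset

    def data(self) -> bytes:
        return bytes(self._data)
```

With an import library where every member shares the DLL name, repeated `add` calls return the same offset and the table stays one entry long.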
* [X86][SSE] Auto upgrade PADDUS/PSUBUS intrinsics to UADD_SAT/USUB_SAT ↵Simon Pilgrim2018-12-191-17/+4
| | | | | | | | | | | | | | generic intrinsics (llvm) Now that we use the generic ISD opcodes, we can use the generic intrinsics directly as well. This fixes the poor fast-isel codegen by not expanding to an easily broken IR code sequence. I'm intending to deal with the signed saturation equivalents as well. Clang counterpart: https://reviews.llvm.org/D55879 Differential Revision: https://reviews.llvm.org/D55855 llvm-svn: 349630
* [SelectionDAG] Optional handling of UNDEF elements in matchBinaryPredicate ↵Simon Pilgrim2018-12-191-4/+4
| | | | | | | | | | | | | | (part 2 of 2) Now that SimplifyDemandedBits/SimplifyDemandedVectorElts is simplifying vector elements, we're seeing more constant BUILD_VECTOR containing undefs. This patch provides opt-in support for UNDEF elements in matchBinaryPredicate, passing NULL instead of the result ConstantSDNode* argument. I've updated the (or (and X, c1), c2) -> (and (or X, c2), c1|c2) fold to demonstrate its use, which I believe is safe for undef cases. Differential Revision: https://reviews.llvm.org/D55822 llvm-svn: 349629
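The fold named in this message, (or (and X, c1), c2) -> (and (or X, c2), c1|c2), can be checked exhaustively on small fully defined integers. The sketch below does only that; it says nothing about the undef handling the patch is actually concerned with, and the function names are hypothetical.

```python
def before_fold(x: int, c1: int, c2: int) -> int:
    return (x & c1) | c2

def after_fold(x: int, c1: int, c2: int) -> int:
    return (x | c2) & (c1 | c2)

def fold_holds(bits: int = 4) -> bool:
    """Compare both forms for every x, c1, c2 of the given bit width."""
    n = 1 << bits
    return all(
        before_fold(x, c1, c2) == after_fold(x, c1, c2)
        for x in range(n)
        for c1 in range(n)
        for c2 in range(n)
    )
```

The equality follows from distributing `|` over `&`, so the check passes for any width.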
* [SelectionDAG] Optional handling of UNDEF elements in matchBinaryPredicate ↵Simon Pilgrim2018-12-191-6/+13
| | | | | | | | | | | | (part 1 of 2) Now that SimplifyDemandedBits/SimplifyDemandedVectorElts is simplifying vector elements, we're seeing more constant BUILD_VECTOR containing undefs. This patch provides opt-in support for UNDEF elements in matchBinaryPredicate, passing NULL instead of the result ConstantSDNode* argument. Differential Revision: https://reviews.llvm.org/D55822 llvm-svn: 349628
* [TargetLowering] Fix propagation of undefs in zero extension ops (PR40091)Simon Pilgrim2018-12-192-4/+23
| | | | | | | | | | As described on PR40091, we have several places where zext (and zext_vector_inreg) fold an undef input into an undef output. For zero extensions this is incorrect as the output should at least guarantee that the new upper bits are set to zero. SimplifyDemandedVectorElts is the worst offender (and it's the most likely to cause new undefs to appear) but DAGCombiner's tryToFoldExtendOfConstant has a similar issue. Thanks to @dmgreen for catching this. Differential Revision: https://reviews.llvm.org/D55883 llvm-svn: 349625
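A small sketch of why an undef result is too permissive for a zero extension: whatever narrow value the input takes, the widened result always has its new upper bits clear, whereas an arbitrary wide value need not. The helpers `zext` and `upper_bits` are hypothetical and model plain integers, not SelectionDAG nodes.

```python
def zext(value: int, from_bits: int, to_bits: int) -> int:
    """Zero-extend: keep the low from_bits bits, upper bits become zero."""
    assert from_bits <= to_bits
    return value & ((1 << from_bits) - 1)

def upper_bits(value: int, from_bits: int, to_bits: int) -> int:
    """Extract the bits that the extension added."""
    return (value & ((1 << to_bits) - 1)) >> from_bits

# Every possible i8 input yields an i16 whose upper 8 bits are zero,
# so the result can never be an arbitrary 16-bit pattern.
ALL_UPPER_ZERO = all(upper_bits(zext(x, 8, 16), 8, 16) == 0 for x in range(256))
```

Under this model a constant such as zero is consistent with the guarantee, while an unconstrained value is not.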
* Let TableGen write output only if it changed, instead of doing so in cmake, ↵Nico Weber2018-12-191-8/+24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | attempt 2 This relands r330742: """ Let TableGen write output only if it changed, instead of doing so in cmake. Removes one subprocess and one temp file from the build for each tablegen invocation. No intended behavior change. """ In particular, if you see rebuilds after this change that you didn't see before this change, that's unintended and it's fine to revert this change again (but let me know). r330742 got reverted because some people reported that llvm-tblgen ran on every build after it. This could happen if the depfile output got deleted without deleting the main .inc output. To fix, make TableGen always write the depfile, but keep writing the main .inc output only if it has changed. This matches what we did in cmake before. Differential Revision: https://reviews.llvm.org/D55842 llvm-svn: 349624
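The behaviour described above (always write the depfile, rewrite the main output only when its content changed) can be sketched as follows. This is an illustration in Python rather than TableGen's own code; `write_if_changed`, `emit`, `demo`, and the file names are all hypothetical.

```python
import os
import tempfile

def write_if_changed(path: str, content: str) -> bool:
    """Write content unless the file already holds it. True if written."""
    try:
        with open(path, "r", encoding="utf-8") as f:
            if f.read() == content:
                return False  # leave the file and its timestamp untouched
    except FileNotFoundError:
        pass
    with open(path, "w", encoding="utf-8") as f:
        f.write(content)
    return True

def emit(out_path: str, dep_path: str, output: str, deps: str) -> bool:
    # The depfile is rewritten unconditionally so it can never go missing
    # while the main output survives.
    with open(dep_path, "w", encoding="utf-8") as f:
        f.write(deps)
    return write_if_changed(out_path, output)

def demo() -> list:
    with tempfile.TemporaryDirectory() as d:
        out = os.path.join(d, "Foo.inc")
        dep = os.path.join(d, "Foo.inc.d")
        first = emit(out, dep, "// generated\n", "Foo.inc: Foo.td\n")
        second = emit(out, dep, "// generated\n", "Foo.inc: Foo.td\n")
        third = emit(out, dep, "// regenerated\n", "Foo.inc: Foo.td\n")
        return [first, second, third]
```

Skipping the write keeps the output's modification time stable, which is what lets a build system avoid rebuilding everything that depends on it.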
* AMDGPU: Use an ABS32_LO relocation for SCRATCH_RSRC_DWORD1Nicolai Haehnle2018-12-191-4/+2
| | | | | | | | | | | | | | | | | | | Summary: Using HI here makes no logical sense, since the dword is only 32 bits to begin with. Current Mesa master does not look at the relocation type at all, so this change is fine. Future Mesa will rely on this, however. Change-Id: I91085707834c4ac0370926602b93c94b90e44cb1 Reviewers: arsenm, rampitec, mareko Subscribers: kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D55369 llvm-svn: 349620
* [SelectionDAG] Optional handling of UNDEF elements in matchUnaryPredicateSimon Pilgrim2018-12-191-4/+13
| | | | | | | | | | | | Now that SimplifyDemandedBits/SimplifyDemandedVectorElts are simplifying vector elements, we're seeing more constant BUILD_VECTOR containing UNDEFs. This patch provides opt-in handling of UNDEF elements in matchUnaryPredicate, passing NULL instead of the ConstantSDNode* argument. I've updated SelectionDAG::simplifyShift to demonstrate its use. Differential Revision: https://reviews.llvm.org/D55819 llvm-svn: 349616
* AMDGPU/InsertWaitcnts: Update VGPR/SGPR bounds when brackets are mergedCarl Ritson2018-12-191-0/+3
| | | | | | | | | | | | | | | | | | Summary: Fix an issue where VGPR/SGPR bounds are not properly extended when brackets are merged. This manifests as missing waitcnt insertions when multiple brackets are forwarded to a successor block and the first forward has lower VGPR/SGPR bounds. Irreducible loop test has been extended based on a CTS failure detected for GFX9. Reviewers: nhaehnle Reviewed By: nhaehnle Subscribers: arsenm, kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, jfb, llvm-commits Differential Revision: https://reviews.llvm.org/D55602 llvm-svn: 349611
* [ARM GlobalISel] Support G_CONSTANT for Thumb2Diana Picus2018-12-191-4/+4
| | | | | | | | | | | All we have to do is mark it as legal. This allows us to select a lot of new patterns handled by TableGen. This patch adds tests for them and splits up the existing test file for binary operators into 2 files, one for arithmetic ops and one for logical ones. llvm-svn: 349610
* AMDGPU/GlobalISel: Regbankselect for fsubMatt Arsenault2018-12-191-0/+1
| | | | llvm-svn: 349608
* [llvm-objcopy] Initial COFF supportMartin Storsjo2018-12-191-0/+12
| | | | | | | | | This is an initial implementation of no-op passthrough copying of COFF with objcopy. Differential Revision: https://reviews.llvm.org/D54939 llvm-svn: 349605
* [PowerPC]Exploit P9 vabsdu for unsigned vselect patternsKewen Lin2018-12-192-0/+66
| | | | | | | | | | For type v4i32/v8i16/v16i8, do the following transforms: (vselect (setcc a, b, setugt), (sub a, b), (sub b, a)) -> (vabsd a, b) (vselect (setcc a, b, setuge), (sub a, b), (sub b, a)) -> (vabsd a, b) (vselect (setcc a, b, setult), (sub b, a), (sub a, b)) -> (vabsd a, b) (vselect (setcc a, b, setule), (sub b, a), (sub a, b)) -> (vabsd a, b) Differential Revision: https://reviews.llvm.org/D55812 llvm-svn: 349599
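The four select patterns listed above all compute an unsigned absolute difference. A per-lane Python sketch, assuming 32-bit lanes and wrapping subtraction (function names are hypothetical):

```python
BITS = 32
MASK = (1 << BITS) - 1

def sub(a: int, b: int) -> int:
    """Wrapping subtract, as the vector 'sub' node would do per lane."""
    return (a - b) & MASK

def select_ugt(a: int, b: int) -> int:
    return sub(a, b) if a > b else sub(b, a)

def select_uge(a: int, b: int) -> int:
    return sub(a, b) if a >= b else sub(b, a)

def select_ult(a: int, b: int) -> int:
    return sub(b, a) if a < b else sub(a, b)

def select_ule(a: int, b: int) -> int:
    return sub(b, a) if a <= b else sub(a, b)

def vabsd(a: int, b: int) -> int:
    """Unsigned absolute difference of two lane values."""
    return max(a, b) - min(a, b)
```

The strict and non-strict comparisons differ only when `a == b`, where both subtractions give zero, so all four forms agree.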
* [AArch64] Simplify the Exynos M3 pipeline modelEvandro Menezes2018-12-181-12/+9
| | | | llvm-svn: 349569
* [AArch64] Fix instructions order (NFC)Evandro Menezes2018-12-181-4/+4
| | | | llvm-svn: 349568
* [DebugInfo] Move several private headers to include directoryYonghong Song2018-12-1811-309/+10
| | | | | | | | | | | | | | | | | This patch moved the following files in lib/CodeGen/AsmPrinter/ AsmPrinterHandler.h DbgEntityHistoryCalculator.h DebugHandlerBase.h to include/llvm/CodeGen directory. Such a change will enable Target to extend DebugHandlerBase and emit Target specific debug info sections. Signed-off-by: Yonghong Song <yhs@fb.com> Differential Revision: https://reviews.llvm.org/D55755 llvm-svn: 349564
* Preserve the linkage for objc* intrinsics as clang will set them to ↵Pete Cooper2018-12-181-5/+8
| | | | | | | | | | weak_external in some cases Clang uses weak linkage for objc runtime functions when they are not available on the platform. The intrinsic has this linkage so we just need to pass that on to the runtime call. llvm-svn: 349559
* Add nonlazybind to objc_retain/objc_release when converting from intrinsics.Pete Cooper2018-12-181-3/+10
| | | | | | | | For performance reasons, clang set nonlazybind on these functions. Now that we are using intrinsics instead of runtime calls, we should set this attribute when creating the runtime functions. llvm-svn: 349558
* [LAA] Introduce enum for vectorization safety status (NFC).Florian Hahn2018-12-181-6/+12
| | | | | | | | | | | | | | This patch adds a VectorizationSafetyStatus enum, which will be extended in a follow up patch to distinguish between 'safe with runtime checks' and 'known unsafe' dependences. Reviewers: anemet, anna, Ayal, hsaito Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D54892 llvm-svn: 349556
* [asan] Restore ODR-violation detection on vtablesVitaly Buka2018-12-181-2/+2
| | | | | | | | | | | | | | | | | Summary: unnamed_addr is still useful for detecting ODR violations on vtables. Still, unnamed_addr with lld and --icf=safe or --icf=all can trigger false reports, which can be avoided with --icf=none or by using private aliases with -fsanitize-address-use-odr-indicator Reviewers: eugenis Reviewed By: eugenis Subscribers: kubamracek, hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D55799 llvm-svn: 349555
* Rewrite objc intrinsics to runtime methods in PreISelIntrinsicLowering ↵Pete Cooper2018-12-182-52/+112
| | | | | | | | | | instead of SDAG. SelectionDAG currently changes these intrinsics to function calls, but that won't work for other ISel's. Also we want to eventually support nonlazybind and weak linkage coming from the front-end which we can't do in SelectionDAG. llvm-svn: 349552
* [AArch64] Avoid crashing on .seh directives in assemblyMartin Storsjo2018-12-181-4/+4
| | | | | | Differential Revision: https://reviews.llvm.org/D55670 llvm-svn: 349549
* [asan] In llvm.asan.globals, allow entries to be non-GlobalVariable and skip ↵Kuba Mracek2018-12-181-1/+4
| | | | | | | | | | over them Looks like there are valid reasons why we need to allow bitcasts in llvm.asan.globals, see discussion at https://github.com/apple/swift-llvm/pull/133. Let's look through bitcasts when iterating over entries in the llvm.asan.globals list. Differential Revision: https://reviews.llvm.org/D55794 llvm-svn: 349544
* [llvm-mca] Dump mask in hexEvandro Menezes2018-12-181-2/+4
| | | | | | Dump the resources masks as hexadecimal. llvm-svn: 349536
* Change the objc ARC optimizer to use the new objc.* intrinsicsPete Cooper2018-12-183-147/+81
| | | | | | | | | | | We're moving ARC optimisation and ARC emission in clang away from runtime methods and towards intrinsics. This is the part which actually uses the intrinsics in the ARC optimizer when both analyzing the existing calls and emitting new ones. Differential Revision: https://reviews.llvm.org/D55348 Reviewers: ahatanak llvm-svn: 349534
* [X86] Add BSR to isUseDefConvertible.Craig Topper2018-12-181-6/+6
| | | | | | | | We already had BSF here as part of __builtin_ffs improvements and I was just wondering yesterday whether we should have BSR there. This addresses one issue from PR40090. llvm-svn: 349531
* [InstCombine] Simplify cttz/ctlz + icmp eq/ne into mask checkNikita Popov2018-12-181-3/+22
| | | | | | | | | | | | | Checking whether a number has a certain number of trailing / leading zeros means checking whether it is of the form XXXX1000 / 0001XXXX, which can be done with an and+icmp. Related to https://bugs.llvm.org/show_bug.cgi?id=28668. As a next step, this can be extended to non-equality predicates. Differential Revision: https://reviews.llvm.org/D55745 llvm-svn: 349530
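The XXXX1000 / 0001XXXX observation can be sketched on 8-bit values as below. The helper names are hypothetical, and treating a count equal to the bit width as "the value is zero" is an assumption of this sketch rather than something the message states.

```python
BITS = 8

def cttz(x: int) -> int:
    """Count trailing zeros; defined as BITS for x == 0."""
    return BITS if x == 0 else (x & -x).bit_length() - 1

def ctlz(x: int) -> int:
    """Count leading zeros; defined as BITS for x == 0."""
    return BITS - x.bit_length()

def cttz_eq_via_mask(x: int, n: int) -> bool:
    if n == BITS:
        return x == 0
    low_mask = (1 << (n + 1)) - 1          # the low n + 1 bits
    return (x & low_mask) == (1 << n)      # they must read 1000...0

def ctlz_eq_via_mask(x: int, n: int) -> bool:
    if n == BITS:
        return x == 0
    shift = BITS - n - 1
    high_mask = ((1 << (n + 1)) - 1) << shift  # the high n + 1 bits
    return (x & high_mask) == (1 << shift)     # they must read 0...01
```

Each comparison against a count becomes one AND and one equality test, which is the and+icmp form the message refers to.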
* [AMDGPU] Removed the unnecessary operand size-check-assert from ↵Farhana Aleen2018-12-181-2/+0
| | | | | | | | processBaseWithConstOffset(). Summary: 32bit operand sizes are guaranteed by the opcode check AMDGPU::V_ADD_I32_e64 and AMDGPU::V_ADDC_U32_e64. Therefore, we don't need any additional operand size-check-assert. Author: FarhanaAleen llvm-svn: 349529
* DebugInfo: Fix missing local imported entities after r349207David Blaikie2018-12-181-3/+3
| | | | | | Post commit review/bug reported by Pavel Labath - thanks! llvm-svn: 349528
* [SCCP] Get rid of redundant call for getPredicateInfoFor (NFC).Florian Hahn2018-12-181-1/+1
| | | | | | We can use the result fetched a few lines above. llvm-svn: 349527
* [X86] Don't use SplitOpsAndApply to create ISD::UADDSAT/ISD::USUBSAT nodes. ↵Craig Topper2018-12-181-30/+8
| | | | | | | | Let type legalization and op legalization deal with it. Now that we've switched to target independent nodes we can rely on generic infrastructure to do the legalization for us. llvm-svn: 349526
* [InstCombine] refactor isCheapToScalarize(); NFCSanjay Patel2018-12-181-33/+25
| | | | | | | | | As the FIXME indicates, this has the potential to go overboard. So I'm not sure if it's even worth keeping this vs. iteratively doing simple matches, but we might as well clean it up. llvm-svn: 349523
* [X86] Use SADDSAT/SSUBSAT instead of ADDS/SUBSNikita Popov2018-12-186-34/+48
| | | | | | | | | | Migrate the X86 backend from X86ISD opcodes ADDS and SUBS to generic ISD opcodes SADDSAT and SSUBSAT. This also improves codegen for @llvm.sadd.sat() and @llvm.ssub.sat() intrinsics. This is a followup to D55787 and part of PR40056. Differential Revision: https://reviews.llvm.org/D55833 llvm-svn: 349520
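As a reminder of what signed saturating arithmetic means, here is a minimal scalar sketch: the mathematically exact result is clamped to the representable range instead of wrapping. The function names and the 8-bit default are illustrative choices, not part of the patch.

```python
def signed_range(bits: int):
    """Smallest and largest values of a two's-complement integer."""
    return -(1 << (bits - 1)), (1 << (bits - 1)) - 1

def sadd_sat(a: int, b: int, bits: int = 8) -> int:
    lo, hi = signed_range(bits)
    return max(lo, min(hi, a + b))  # clamp instead of wrapping

def ssub_sat(a: int, b: int, bits: int = 8) -> int:
    lo, hi = signed_range(bits)
    return max(lo, min(hi, a - b))
```

For 8-bit lanes, `sadd_sat(100, 100)` is 127 and `ssub_sat(-100, 100)` is -128.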
* [X86] Create PSUBUS from (add (umax X, C), -C)Craig Topper2018-12-181-0/+44
| | | | | | | | | | InstCombine seems to canonicalize our PSUB pattern into a max with the constant and an add with an inverse of the constant. This patch recognizes this pattern and turns it into PSUBUS. Future work could improve undef element handling. Fixes some of PR40053 Differential Revision: https://reviews.llvm.org/D55780 llvm-svn: 349519
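The equivalence behind this combine can be checked on 8-bit lanes: taking the unsigned max with C and then adding -C (with wraparound) gives the same value as an unsigned saturating subtract of C. The helper names below are hypothetical.

```python
BITS = 8
MASK = (1 << BITS) - 1

def usub_sat(x: int, c: int) -> int:
    """Unsigned saturating subtract: clamps at zero instead of wrapping."""
    return max(x - c, 0)

def umax_add_form(x: int, c: int) -> int:
    """(add (umax x, c), -c) with wrapping 8-bit arithmetic."""
    neg_c = (-c) & MASK
    return (max(x, c) + neg_c) & MASK

def forms_agree() -> bool:
    return all(
        usub_sat(x, c) == umax_add_form(x, c)
        for x in range(1 << BITS)
        for c in range(1 << BITS)
    )
```

When `x >= c` the max is `x` and the add yields `x - c`; otherwise the max is `c` and the add yields zero.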
* Buildfix for r345516 (Clang compilation failing).Alexandre Ganea2018-12-181-1/+1
| | | | llvm-svn: 349518
* [llvm-symbolizer] Omit stderr output when symbolizing a crashAlexandre Ganea2018-12-181-3/+11
| | | | | | Differential revision: https://reviews.llvm.org/D55723 llvm-svn: 349516
* Add FMF management to common fp intrinsics in GlobalIselMichael Berg2018-12-181-22/+50
| | | | | | | | | | | | Summary: This is the initial code change to facilitate managing FMF flags from Instructions to MI wrt Intrinsics in Global Isel. Eventually the GlobalObserver interface will be added as well, where FMF additions can be tracked for the builder and CSE. Reviewers: aditya_nandakumar, bogner Reviewed By: bogner Subscribers: rovka, kristof.beyls, javed.absar Differential Revision: https://reviews.llvm.org/D55668 llvm-svn: 349514
* [LoopVectorize] Rename pass options. NFC.Michael Kruse2018-12-183-16/+20
| | | | | | | | | | | | | | | | | | | | | | Rename: NoUnrolling to InterleaveOnlyWhenForced and AlwaysVectorize to !VectorizeOnlyWhenForced Contrary to what the name 'AlwaysVectorize' suggests, it does not unconditionally vectorize all loops, but applies a cost model to determine whether vectorization is profitable to all loops. Hence, passing false will disable the cost model, except when a loop is marked with llvm.loop.vectorize.enable. The 'OnlyWhenForced' suffix (suggested by @hfinkel in D55716) better matches this behavior. Similarly, 'NoUnrolling' disables the profitability cost model for interleaving (a term to distinguish it from unrolling by the LoopUnrollPass); rename it for consistency. Differential Revision: https://reviews.llvm.org/D55785 llvm-svn: 349513
* [X86][SSE] Don't use 'sign bit select' vXi8 ROTL lowering for constant ↵Simon Pilgrim2018-12-181-0/+3
| | | | | | | | rotation amounts Noticed by @spatel on D55747 - we get much better codegen if we use the regular shift expansion. llvm-svn: 349510
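The message does not spell out the "regular shift expansion"; a common reading, assumed here, is a rotate built from two shifts and an OR. A per-lane sketch for 8-bit elements, with a hypothetical function name:

```python
def rotl8(x: int, amt: int) -> int:
    """Rotate an 8-bit value left by amt (taken modulo 8)."""
    amt &= 7
    if amt == 0:
        return x & 0xFF
    # Bits shifted out on the left re-enter on the right.
    return ((x << amt) | (x >> (8 - amt))) & 0xFF
```

With a constant rotation amount both shift counts are known up front, which is presumably what makes this form cheaper than a bit-by-bit select sequence.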
* [LoopUnroll] Honor '#pragma unroll' even with -fno-unroll-loops.Michael Kruse2018-12-182-33/+51
| | | | | | | | | | | | | | | | | | | | | | When using clang with `-fno-unroll-loops` (implicitly added with `-O1`), the LoopUnrollPass is not added to the (legacy) pass pipeline. This also means that it will not process any loop metadata such as llvm.loop.unroll.enable (which is generated by #pragma unroll), and WarnMissedTransformationsPass emits a warning that a forced transformation has not been applied (see https://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20181210/610833.html). Such explicit transformations should take precedence over disabling heuristics. This patch unconditionally adds LoopUnrollPass to the optimizing pipeline (that is, it is still not added with `-O0`), but passes a flag indicating whether automatic unrolling is dis-/enabled. This is the same approach as LoopVectorize uses. The new pass manager's pipeline builder has no option to disable unrolling, hence the problem does not apply. Differential Revision: https://reviews.llvm.org/D55716 llvm-svn: 349509
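The policy described here (the pass always runs, and a flag decides whether unforced loops are considered) can be sketched as a small decision function. This is a deliberate simplification with hypothetical names: it ignores legality checks and the details of the real heuristics.

```python
def should_attempt_unroll(
    forced_by_pragma: bool,
    only_when_forced: bool,
    heuristic_says_unroll: bool,
) -> bool:
    """Decide whether a loop is a candidate for unrolling."""
    if forced_by_pragma:
        return True   # explicit requests win over disabled heuristics
    if only_when_forced:
        return False  # automatic unrolling is switched off
    return heuristic_says_unroll
```

Under this model, `-fno-unroll-loops` corresponds to `only_when_forced=True`: a loop carrying `#pragma unroll` is still processed, while every other loop is left alone.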
* [X86][SSE] Don't use 'sign bit select' vXi8 ROTL lowering for splat rotation ↵Simon Pilgrim2018-12-181-3/+4
| | | | | | | | amounts Noticed by @spatel on D55747 - we get much better codegen if we use the regular shift expansion. llvm-svn: 349500
* [MIPS GlobalISel] Select G_SDIV, G_UDIV, G_SREM and G_UREMPetar Avramovic2018-12-184-9/+46
| | | | | | | | | | | | Add support for s64 libcalls for G_SDIV, G_UDIV, G_SREM and G_UREM and use integer type of correct size when creating arguments for CLI.lowerCall. Select G_SDIV, G_UDIV, G_SREM and G_UREM for types s8, s16, s32 and s64 on MIPS32. Differential Revision: https://reviews.llvm.org/D55651 llvm-svn: 349499
* [X86] Use UADDSAT/USUBSAT instead of ADDUS/SUBUSNikita Popov2018-12-185-38/+75
| | | | | | | | | | | | | Replace the X86ISD opcodes ADDUS and SUBUS with generic ISD opcodes UADDSAT and USUBSAT. As a side-effect, this also makes codegen for the @llvm.uadd.sat and @llvm.usub.sat intrinsics reasonable. This only replaces use in the X86 backend, and does not move any of the ADDUS/SUBUS X86 specific combines into generic codegen. Differential Revision: https://reviews.llvm.org/D55787 llvm-svn: 349481