path: root/llvm/lib
Commit message (Collapse)AuthorAgeFilesLines
...
* [DAGCombiner] generalize binop-of-splats scalarizationSanjay Patel2019-04-231-46/+38
| | | | | | | | | | | | | | If we only match build vectors, we can miss some patterns that use shuffles as seen in the affected tests. Note that the underlying calls within getSplatSourceVector() have the potential for compile-time explosion because of exponential recursion looking through binop opcodes, but currently the list of supported opcodes is very limited. Both of those problems should be addressed in follow-up patches. llvm-svn: 358984
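The scalarization described above rests on a simple identity: a lane-wise binop of two splats equals the splat of the scalar binop. A minimal sketch with plain arrays follows; `Vec4`, `splat`, `addVec`, and `addOfSplats` are illustrative names, not DAGCombiner APIs.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Illustrative 4-lane vector; unsigned lanes keep overflow well defined.
using Vec4 = std::array<uint32_t, 4>;

// Broadcast one scalar into every lane.
Vec4 splat(uint32_t S) {
  Vec4 V;
  V.fill(S);
  return V;
}

// Lane-wise vector add: four scalar additions.
Vec4 addVec(const Vec4 &A, const Vec4 &B) {
  Vec4 R{};
  for (std::size_t I = 0; I < R.size(); ++I)
    R[I] = A[I] + B[I];
  return R;
}

// Scalarized form: one scalar addition, then a single splat.
Vec4 addOfSplats(uint32_t X, uint32_t Y) { return splat(X + Y); }
```

Both forms produce the same vector, so the second can replace the first regardless of whether the splat came from a build vector or a shuffle.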
* AMDGPU: Fix LCSSA phi lowering in SILowerI1CopiesNicolai Haehnle2019-04-231-1/+8
| | | | | | | | | | | | | | | | | | | | | | Summary: When an LCSSA phi survives through instruction selection, the pass ends up removing that phi entirely because it is dominated by the logic that does the lanemask merging. This then used to trigger an assertion when processing a dependent phi instruction. Change-Id: Id4949719f8298062fe476a25718acccc109113b6 Reviewers: llvm-commits Subscribers: kzhuravl, jvesely, wdng, yaxunl, t-tye, tpr, dstuttard, rtaylor, arsenm Tags: #llvm Differential Revision: https://reviews.llvm.org/D60999 llvm-svn: 358983
* [CallSite removal] move InlineCost to CallBase usageFedor Sergeev2019-04-236-114/+113
| | | | | | | | | | | Converting InlineCost interface and its internals into CallBase usage. Inliners themselves are still not converted. Reviewed By: reames Tags: #llvm Differential Revision: https://reviews.llvm.org/D60636 llvm-svn: 358982
* [ARM] Update check for CBZ in IfcvtDavid Green2019-04-233-43/+59
| | | | | | | | | | | The check for creating CBZ in constant island pass recently obtained the ability to search backwards to find a Cmp instruction. The code in IfCvt should mirror this to allow more conversions to the smaller form. The common code has been pulled out into a separate function to be shared between the two places. Differential Revision: https://reviews.llvm.org/D60090 llvm-svn: 358977
* [ARM] Don't replicate instructions in Ifcvt at minsizeDavid Green2019-04-231-0/+9
| | | | | | | | | | Ifcvt can replicate instructions as it converts them to be predicated. This stops that from happening on thumb2 targets at minsize where an extra IT instruction is likely needed. Differential Revision: https://reviews.llvm.org/D60089 llvm-svn: 358974
* Fix MSVC "32-bit shift implicitly converted to 64 bits" warning. NFCI.Simon Pilgrim2019-04-231-2/+2
| | | | llvm-svn: 358970
* Fix MSVC "32-bit shift implicitly converted to 64 bits" warning. NFCI.Simon Pilgrim2019-04-231-2/+2
| | | | llvm-svn: 358969
* [DAGCombiner] Combine OR as ADD when no common bits are setBjorn Pettersson2019-04-231-16/+40
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: The DAGCombiner is rewriting (canonicalizing) an ISD::ADD with no common bits set in the operands as an ISD::OR node. This could sometimes result in "missing out" on some combines that are normally performed for ADD. More specifically, this could happen if we have already rewritten an ADD into an OR, and later (after legalizations or combines) we expose patterns that could have been optimized if we had seen the OR as an ADD (e.g. reassociations based on ADD). To make the DAG combiner less sensitive to whether ADD or OR is used for these "no common bits set" ADD/OR operations, we now also apply most of the ADD combines to an OR operation when value tracking indicates that the operands have no common bits set. Reviewers: spatel, RKSimon, craig.topper, kparzysz Reviewed By: spatel Subscribers: arsenm, rampitec, lebedev.ri, jvesely, nhaehnle, hiraditya, javed.absar, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D59758 llvm-svn: 358965
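The equivalence this commit relies on can be checked on plain integers: when two operands share no set bits, addition never produces a carry, so `A + B` and `A | B` agree. The sketch below is a scalar stand-in only. `noCommonBits` and `orAsAdd` are hypothetical helpers, whereas the real combiner works on known-bits information from value tracking rather than concrete values.

```cpp
#include <cassert>
#include <cstdint>

// True when no bit position is set in both operands, so adding the two
// values can never generate a carry.
constexpr bool noCommonBits(uint32_t A, uint32_t B) { return (A & B) == 0; }

// Hypothetical helper: treat `A | B` as an addition only when that is safe.
// Returns true and writes the sum when the operands are disjoint.
bool orAsAdd(uint32_t A, uint32_t B, uint32_t &Sum) {
  if (!noCommonBits(A, B))
    return false;
  Sum = A + B; // equal to A | B because no carries occur
  return true;
}
```

For example, `0xF0` and `0x0F` are disjoint, so the OR can be treated as an ADD, while `3` and `1` share bit 0 and cannot.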
* [AArch64] Add support for MTE intrinsicsJaved Absar2019-04-234-22/+78
| | | | | | | | | | | This patch provides intrinsics support for Memory Tagging Extension (MTE), which was introduced with the Armv8.5-a architecture. The intrinsics are described in detail in the latest ACLE Q1 2019 documentation: https://developer.arm.com/docs/101028/latest Reviewed by: David Spickett Differential Revision: https://reviews.llvm.org/D60486 llvm-svn: 358963
* [ARM][FIX] Add missing f16.lane.vldN/vstN loweringDiogo N. Sampaio2019-04-231-0/+2
| | | | | | | | | | | | | | | | | | Summary: Add missing D and Q lane VLDSTLane lowering for fp16 elements. Reviewers: efriedma, kosarev, SjoerdMeijer, ostannard Reviewed By: efriedma Subscribers: javed.absar, kristof.beyls, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60874 llvm-svn: 358962
* [llvm-mc] - Properly set the address align field of the compressed sections.George Rimar2019-04-231-2/+6
| | | | | | | | | | | | | | | | | | | | | The specification for compressed sections (https://docs.oracle.com/cd/E37838_01/html/E36783/section_compression.html) says that the sh_addralign field of the section header for a compressed section reflects the requirements of the compressed section. Currently, llvm-mc always writes the uncompressed section alignment to sh_addralign, which is not correct. A zlib-style section contains an Elfxx_Chdr header, so we should use either 4 or 8 depending on the target (the uncompressed section alignment is stored in the ch_addralign field of the compression header). GNU assembler version 2.31.1 also has this issue, but it was already fixed in 2.32.51. This is how the problem was found while debugging https://bugs.llvm.org/show_bug.cgi?id=40482. Differential revision: https://reviews.llvm.org/D60965 llvm-svn: 358960
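The 4-versus-8 choice follows from the layout of the compression header itself. The sketch below mirrors the field layout of the ELF `Elf32_Chdr` and `Elf64_Chdr` structures using fixed-width types; the struct names `Chdr32`/`Chdr64` and the helper `compressedSectionAlign` are illustrative and are not llvm-mc code.

```cpp
#include <cassert>
#include <cstdint>

// Field layout of the 32-bit compression header (all 4-byte words).
struct Chdr32 {
  uint32_t ch_type;
  uint32_t ch_size;
  uint32_t ch_addralign; // alignment of the *uncompressed* data
};

// Field layout of the 64-bit compression header (contains 8-byte fields).
struct Chdr64 {
  uint32_t ch_type;
  uint32_t ch_reserved;
  uint64_t ch_size;
  uint64_t ch_addralign; // alignment of the *uncompressed* data
};

static_assert(sizeof(Chdr32) == 12, "three 4-byte words");
static_assert(sizeof(Chdr64) == 24, "two 4-byte words and two 8-byte words");

// Hypothetical helper: the value sh_addralign should carry for a compressed
// section, i.e. the alignment the header itself needs.
constexpr uint64_t compressedSectionAlign(bool Is64Bit) {
  return Is64Bit ? 8 : 4;
}
```

The original alignment of the section contents is not lost; it travels in `ch_addralign` and is restored when the section is decompressed.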
* [LSR] Limit the recursion for setup costDavid Green2019-04-231-11/+14
| | | | | | | | | | | | | | In some circumstances we can end up with setup costs that are very complex to compute, even though the scevs are not very complex to create. This can also lead to setupcosts that are calculated to be exactly -1, which LSR treats as an invalid cost. This patch puts a limit on the recursion depth for setup cost to prevent them taking too long. Thanks to @reames for the report and test case. Differential Revision: https://reviews.llvm.org/D60944 llvm-svn: 358958
* [WebAssembly] Bail out of fastisel earlier when computing PIC addressesSam Clegg2019-04-231-11/+6
| | | | | | | | | | | This change partially reverts https://reviews.llvm.org/D54647 in favor of bailing out during computeAddress instead. This catches the condition earlier and handles more cases. Differential Revision: https://reviews.llvm.org/D60986 llvm-svn: 358948
* Revert "Use const DebugLoc&"Chandler Carruth2019-04-231-2/+2
| | | | | | | | | | | | | | | | This reverts r358910 (git commit 2b744665308fc8d30a3baecb4947f2bd81aa7d30) While this patch *seems* trivial and safe and correct, it is not. The copies are actually load bearing copies. You can observe this with MSan or other ways of checking for use-after-destroy, but otherwise this may result in ... difficult to debug inexplicable behavior. I suspect the issue is that the debug location is used after the original reference to it is removed. The metadata backing it gets destroyed as its last reference goes away, and then we reference it later through these const references. llvm-svn: 358940
* DebugInfo: Emit only one kind of accelerated access/name tableDavid Blaikie2019-04-223-3/+8
| | | | | | | | | | | | | | | Currently to opt in to debug_names in DWARFv5, the IR must contain 'nameTableKind: Default' which also enables debug_pubnames. Instead, only allow one of {debug_names, apple_names, debug_pubnames, debug_gnu_pubnames}. nameTableKind: Default gives debug_names in DWARFv5 and greater, debug_pubnames in v4 and earlier - and apple_names when tuning for lldb on MachO. nameTableKind: GNU always gives gnu_pubnames llvm-svn: 358931
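The selection rules in this message can be read as a small decision function. The sketch below is one possible reading, not the DwarfDebug implementation: the enum and function names are hypothetical, other `nameTableKind` values are left out, and checking the lldb-on-MachO case before the DWARF version is an assumption the message does not spell out.

```cpp
#include <cassert>

// Hypothetical stand-ins for the two IR settings discussed in the message.
enum class NameTableKind { Default, GNU };

// The four mutually exclusive tables named in the message.
enum class AccelTable { DebugNames, DebugPubnames, AppleNames, GnuPubnames };

// Pick exactly one table kind.
AccelTable pickAccelTable(NameTableKind Kind, unsigned DwarfVersion,
                          bool TuneForLldbOnMachO) {
  if (Kind == NameTableKind::GNU)
    return AccelTable::GnuPubnames; // GNU always gives gnu_pubnames
  if (TuneForLldbOnMachO)
    return AccelTable::AppleNames;  // assumed to take precedence for Default
  return DwarfVersion >= 5 ? AccelTable::DebugNames
                           : AccelTable::DebugPubnames;
}
```

Whatever the exact precedence, the point of the change is that the function returns a single kind, so debug_names and debug_pubnames are never both emitted.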
* [SelectionDAG] move splat util functions up from x86 loweringSanjay Patel2019-04-222-56/+53
| | | | | | | | | | This was supposed to be NFC, but the change in SDLoc definitions causes instruction scheduling changes. There's nothing x86-specific in this code, and it can likely be used from DAGCombiner's simplifyVBinOp(). llvm-svn: 358930
* [AMDGPU] Fix an issue in `op_sel_hi` skipping.Michael Liao2019-04-221-7/+16
| | | | | | | | | | | | | | | | | Summary: - Only apply packed literal `op_sel_hi` skipping on operands requiring packed literals. Even if an instruction is `packed`, it may have an operand requiring a non-packed literal, such as `v_dot2_f32_f16`. Reviewers: rampitec, arsenm, kzhuravl Subscribers: jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60978 llvm-svn: 358922
* [InstCombine] Eliminate stores to constant memoryPhilip Reames2019-04-222-0/+24
| | | | | | | | | | | | If we have a store to a piece of memory which is known constant, then we know the store must be storing back the same value. As a result, the store (or memset, or memmove) must either be down a dead path, or a noop. In either case, it is valid to simply remove the store. The motivating case for this involves a memmove to a buffer which is constant down a path which is dynamically dead. Note that I'm choosing to implement the less aggressive of two possible semantics here. We could simply say that the store *is undefined*, and prune the path. Consensus in the review was that the more aggressive form might be a good follow on change at a later date. Differential Revision: https://reviews.llvm.org/D60659 llvm-svn: 358919
* [InstSimplify] Move masked.gather w/no active lanes handling to InstSimplify from InstCombinePhilip Reames2019-04-222-6/+2
| | | | | | In the process, use the existing masked.load combine which is slightly stronger, and handles a mix of zero and undef elements in the mask. llvm-svn: 358913
* Use const DebugLoc&Matt Arsenault2019-04-221-2/+2
| | | | llvm-svn: 358910
* AMDGPU: Skip debug instructions in assertMatt Arsenault2019-04-221-2/+7
| | | | | | | | | | These are inserted after branch relaxation, and for some reason it's decided to put them in the long branch expansion block. It's probably not great to rely on the source block address, so this should probably be switched to being PC relative instead of relying on the block address llvm-svn: 358909
* [IPSCCP] Add missing `AssumptionCacheTracker` dependencyJustin Bogner2019-04-221-0/+1
| | | | | | | | | Back in August, r340525 introduced a dependency on the assumption cache tracker in the ipsccp pass, but that commit missed a call to INITIALIZE_PASS_DEPENDENCY, which leaves the assumption cache improperly registered if SCCP is the only thing that pulls it in. llvm-svn: 358903
* [LPM/BPI] Preserve BPI through trivial loop pass pipeline (e.g. LCSSA, LoopSimplify)Philip Reames2019-04-222-0/+13
| | | | | | | | | | | | | | Currently, we do not expose BPI to loop passes at all. In the old pass manager, we appear to have been ignoring the fact that LCSSA and/or LoopSimplify didn't preserve BPI, and making it available to the following loop passes anyways. In the new one, it's invalidated before running any loop pass if either LCSSA or LoopSimplify actually make changes. If they don't make changes, then BPI is valid and available. So, we go ahead and teach LCSSA and LoopSimplify how to preserve BPI for consistency between old and new pass managers. This patch avoids an invalidation between the two requires in the following trivial pass pipeline: opt -passes="requires<branch-prob>,loop(no-op-loop),requires<branch-prob>" (when the input file is one which requires either LCSSA or LoopSimplify to canonicalize the loops) Differential Revision: https://reviews.llvm.org/D60790 llvm-svn: 358901
* [PGO/SamplePGO][NFC] Move the function updateProfWeight from InstructionWei Mi2019-04-222-43/+44
| | | | | | | | | | | | | | | | | | | | | | to CallInst. The issue was raised here: https://reviews.llvm.org/D60903#1472783 The function Instruction::updateProfWeight is only used for CallInst in profile update. From the current interface, it is very easy to think that the function can also be used for branch instructions. However, a branch instruction doesn't need the scaling the function provides for branch_weights and VP (value profile); in addition, scaling may introduce inaccuracy in branch probability. The patch moves the function updateProfWeight from the Instruction class to CallInst to remove the confusion. The patch also changes the scaling of branch_weights from a loop to a block because we know that ProfileData for branch_weights of CallInst will only have two operands at most. Differential Revision: https://reviews.llvm.org/D60911 llvm-svn: 358900
* AMDGPU/GlobalISel: Fix non-power-of-2 G_EXTRACT sourcesMatt Arsenault2019-04-221-1/+3
| | | | llvm-svn: 358894
* GlobalISel: Legalize scalar G_EXTRACT sourcesMatt Arsenault2019-04-221-0/+7
| | | | llvm-svn: 358892
* llvm-undname: Fix an assert-on-invalid, found by oss-fuzzNico Weber2019-04-221-1/+1
| | | | llvm-svn: 358891
* AMDGPU: Fix not checking for copy when looking at copy srcMatt Arsenault2019-04-221-1/+6
| | | | | | | Effectively reverts r356956. The check for isFullCopy was excessive, but there still needs to be a check that this is a copy. llvm-svn: 358890
* [AMDGPU][MC] Corrected parsing of SP3 'neg' modifierDmitry Preobrazhensky2019-04-221-24/+58
| | | | | | | | | | See bug 41156: https://bugs.llvm.org/show_bug.cgi?id=41156 Reviewers: artem.tamazov, arsenm Differential Revision: https://reviews.llvm.org/D60624 llvm-svn: 358888
* [TargetLowering][AMDGPU][X86] Improve SimplifyDemandedBits bitcast handlingSimon Pilgrim2019-04-222-12/+50
| | | | | | | | | | | | This patch adds support for BigBitWidth -> SmallBitWidth bitcasts, splitting the DemandedBits/Elts accordingly. The AMDGPU backend needed an extra (srl (and x, c1 << c2), c2) -> (and (srl x, c2), c1) combine to encourage BFE creation. I investigated putting this in DAGCombine, but it caused a lot of noise on other targets - some improvements, some regressions. The X86 changes are all definite wins. Differential Revision: https://reviews.llvm.org/D60462 llvm-svn: 358887
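The extra AMDGPU combine mentioned above swaps the order of a mask and a logical right shift. The sketch below checks that the two forms agree on 32-bit unsigned values; the function names are illustrative, and the shift amount is assumed to be smaller than the bit width.

```cpp
#include <cassert>
#include <cstdint>

// Original form: mask with the pre-shifted constant, then shift down.
// Precondition (assumed): C2 < 32.
constexpr uint32_t maskThenShift(uint32_t X, uint32_t C1, unsigned C2) {
  return (X & (C1 << C2)) >> C2;
}

// Combined form: shift first, then mask with the unshifted constant.
// This is the shape of a bitfield extract (shift plus low mask).
constexpr uint32_t shiftThenMask(uint32_t X, uint32_t C1, unsigned C2) {
  return (X >> C2) & C1;
}

static_assert(maskThenShift(0xDEADBEEFu, 0xFFu, 8) == 0xBEu, "");
static_assert(shiftThenMask(0xDEADBEEFu, 0xFFu, 8) == 0xBEu, "");
```

Any high bits of `C1` that `C1 << C2` would shift out correspond to bits that the logical shift of `X` has already cleared, which is why the two forms agree even for wide masks.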
* [DAGCombiner] make variable name less ambiguous; NFCSanjay Patel2019-04-221-4/+4
| | | | llvm-svn: 358886
* [DAGCombiner] prepare shuffle-of-splat to handle more patterns; NFCSanjay Patel2019-04-221-11/+16
| | | | llvm-svn: 358884
* [LLVM-C] Add accessors to the default floating-point metadata nodeRobert Widmann2019-04-221-0/+12
| | | | | | | | | | | | | | | | Summary: Add a getter and setter pair for floating-point accuracy metadata. Reviewers: whitequark, deadalnix Reviewed By: whitequark Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60527 llvm-svn: 358883
* [NewPM] Add Option handling for SimpleLoopUnswitchSerguei Katkov2019-04-222-1/+39
| | | | | | | | | | | This patch enables passing options to SimpleLoopUnswitch via the passes pipeline. Reviewers: chandlerc, fedor.sergeev, leonardchan, philip.pfaffe Reviewed By: fedor.sergeev Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D60676 llvm-svn: 358880
* Revert "[ConstantRange] Rename make{Guaranteed -> Exact}NoWrapRegion() NFC"Nikita Popov2019-04-224-15/+16
| | | | | | | | | | This reverts commit 7bf4d7c07f2fac862ef34c82ad0fef6513452445. After thinking about this more, this isn't right: the range is not exact in the same sense as makeExactICmpRegion(). This needs a separate function. llvm-svn: 358876
* [ConstantRange] Rename make{Guaranteed -> Exact}NoWrapRegion() NFCNikita Popov2019-04-224-16/+15
| | | | | | | | Following D60632 makeGuaranteedNoWrapRegion() always returns an exact nowrap region. Rename the function accordingly. This is in line with the naming of makeExactICmpRegion(). llvm-svn: 358875
* [X86] Reject 512-bit types in getRegForInlineAsmConstraint when AVX512 is not enabled. Same for 256 bit and AVX.Craig Topper2019-04-221-2/+5
| | | | | | llvm-svn: 358872
* [JITLink] Remove a lot of redundant 'JITLink_' prefixes. NFC.Lang Hames2019-04-228-18/+22
| | | | llvm-svn: 358869
* [JITLink] Fix section start address calculation in eh-frame recorder.Lang Hames2019-04-221-0/+3
| | | | | | | | | | Section atoms are not sorted, so we need to scan the whole section to find the start address. No test case: Found by inspection, and any reproduction would depend on pointer ordering. llvm-svn: 358865
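Because the atoms are unsorted, the first atom's address cannot be taken as the section start; a full scan for the minimum is needed. A minimal sketch follows, where `Atom` and `sectionStart` are hypothetical stand-ins rather than JITLink types.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a section atom; only the address matters here.
struct Atom {
  uint64_t Address;
};

// Scan every atom and return the lowest address. Taking Atoms.front()
// would only be correct if the container were sorted by address.
uint64_t sectionStart(const std::vector<Atom> &Atoms) {
  assert(!Atoms.empty() && "section has no atoms");
  auto It = std::min_element(
      Atoms.begin(), Atoms.end(),
      [](const Atom &A, const Atom &B) { return A.Address < B.Address; });
  return It->Address;
}
```

The scan is linear in the number of atoms, which is the price of not depending on container order.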
* llvm-undname: Fix hex escapes in wchar_t, char16_t, char32_t stringsNico Weber2019-04-211-3/+3
| | | | | | | | | | | | | | | | | llvm-undname used to put '\x' in front of every pair of nibbles, but u"\xD7\xFF" produces a string with 6 bytes: \xD7 \0 \xFF \0 (and \0\0). Correct for a single character (plus terminating \0) is u"\xD7FF" instead. Now, wchar_t, char16_t, and char32_t strings roundtrip from source to clang-cl (and cl.exe) and then llvm-undname. (...at least as long as it's not a string like L"\xD7FF" L"foo" which gets demangled as L"\xD7FFfoo", where the compiler then considers the "f" as part of the hex escape. That seems ok.) Also add a comment saying that the "almost-valid" char32_t string I added in my last commit is actually produced by compilers. llvm-svn: 358857
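The byte counts quoted in this message can be verified at compile time, and the corrected output amounts to one escape per code unit. In the sketch below, `escapeCodeUnit` is a hypothetical helper; padding to four hex digits is an assumption for illustration and is not taken from the llvm-undname source.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Two separate escapes make two char16_t code units plus the terminator.
static_assert(sizeof(u"\xD7\xFF") == 3 * sizeof(char16_t), "");
// One four-digit escape makes a single code unit plus the terminator.
static_assert(sizeof(u"\xD7FF") == 2 * sizeof(char16_t), "");

// Hypothetical helper: print one code unit as a single hex escape.
std::string escapeCodeUnit(char16_t C) {
  char Buf[8]; // "\xHHHH" is 6 characters, plus the terminating '\0'
  std::snprintf(Buf, sizeof(Buf), "\\x%04X", static_cast<unsigned>(C));
  return Buf;
}
```

With 2-byte `char16_t`, the two `static_assert`s correspond to the 6-byte and 4-byte strings described in the message.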
* llvm-undname: Fix stack overflow on almost-validNico Weber2019-04-211-3/+3
| | | | | | | | | | | | | | | | | If an unsigned with all 4 bytes non-0 was passed to outputHex(), there were two off-by-ones in it: - Both MaxPos and Pos left space for the final \0, which left the buffer one byte too small. Set MaxPos to 16 instead of 15 to fix. - The `assert(Pos >= 0);` was after a `Pos--`, move it up one line. Since valid Unicode codepoints are <= 0x10ffff, this could never really happen in practice. Found by oss-fuzz. llvm-svn: 358856
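The sizing argument can be illustrated with a back-to-front hex writer: four bytes at four characters each (`\xNN`) need 16 character slots, and the bounds check has to run before the position is decremented. This is a sketch under those assumptions, not the actual outputHex(); `hexEscapes` is a hypothetical function, and it returns a std::string instead of storing a terminating `\0` in the buffer.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Write C as a sequence of "\xNN" escapes, most significant byte first,
// omitting leading zero bytes (a zero input still yields "\x00").
std::string hexEscapes(uint32_t C) {
  static const char Digits[] = "0123456789ABCDEF";
  const int MaxPos = 16; // 4 bytes * 4 characters per escape
  char Buf[MaxPos];
  int Pos = MaxPos;      // one past the last slot; filled back to front
  do {
    // Check *before* decrementing, so a full buffer is still accepted.
    assert(Pos >= 4 && "not enough room for another escape");
    Buf[--Pos] = Digits[C & 0xF];
    Buf[--Pos] = Digits[(C >> 4) & 0xF];
    Buf[--Pos] = 'x';
    Buf[--Pos] = '\\';
    C >>= 8;
  } while (C != 0);
  return std::string(Buf + Pos, Buf + MaxPos);
}
```

An input with all four bytes non-zero fills the buffer exactly, which is the case the two off-by-one errors mishandled.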
* [ConstantRange] Add saturating add/sub methodsNikita Popov2019-04-211-0/+36
| | | | | | | | | | | | | | | | Add support for uadd_sat and friends to ConstantRange, so we can handle uadd.sat and friends in LVI. The implementation is forwarding to the corresponding APInt methods with appropriate bounds. One thing worth pointing out here is that the handling of wrapping ranges is not maximally accurate. A simple example is that adding 0 to a wrapped range will return a full range, rather than the original wrapped range. The tests also only check that the non-wrapping envelope is correct and minimal. Differential Revision: https://reviews.llvm.org/D60946 llvm-svn: 358855
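For a non-wrapping unsigned range, saturating addition is monotonic in both operands, so the result range can be obtained by applying the operation to the bounds. The sketch below uses 8-bit values and inclusive bounds for brevity; `SatRange8` and both `uaddSat` overloads are illustrative and differ from ConstantRange, which is built on APInt and half-open ranges.

```cpp
#include <cassert>
#include <cstdint>

// Scalar unsigned saturating add on 8-bit values.
uint8_t uaddSat(uint8_t A, uint8_t B) {
  unsigned Sum = unsigned(A) + unsigned(B);
  return Sum > 0xFF ? uint8_t(0xFF) : uint8_t(Sum);
}

// Illustrative non-wrapping range with inclusive bounds, Lo <= Hi.
struct SatRange8 {
  uint8_t Lo;
  uint8_t Hi;
};

// Range version: forward both bounds to the scalar operation.
SatRange8 uaddSat(SatRange8 A, SatRange8 B) {
  return SatRange8{uaddSat(A.Lo, B.Lo), uaddSat(A.Hi, B.Hi)};
}
```

Adding [10, 20] to [100, 250] gives [110, 255]: the upper bound saturates while the lower bound does not. Wrapped ranges are not handled by this sketch, in line with the accuracy caveat in the message.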
* [ConstantRange] Add getNonEmpty() constructorNikita Popov2019-04-213-64/+19
| | | | | | | | | | | | | | ConstantRanges have an annoying special case: If upper and lower are the same, it can be either an empty or a full set. When constructing constant ranges nearly always a full set is intended, but this still requires an explicit check in many places. This revision adds a getNonEmpty() constructor that disambiguates this case: If upper and lower are the same, a full set is created. Differential Revision: https://reviews.llvm.org/D60947 llvm-svn: 358854
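The ambiguity arises because a half-open range [Lower, Upper) with equal bounds could mean either "nothing" or "everything". The sketch below is an 8-bit illustration with a hypothetical `HalfOpenRange8` type: it reserves equal bounds at the maximum value for the full set and at zero for the empty set, and its `getNonEmpty` resolves any other equal-bounds request to the full set.

```cpp
#include <cassert>
#include <cstdint>

// Illustrative half-open range [Lower, Upper) over 8-bit values.
struct HalfOpenRange8 {
  uint8_t Lower;
  uint8_t Upper;

  static HalfOpenRange8 full() { return {0xFF, 0xFF}; }
  static HalfOpenRange8 empty() { return {0, 0}; }

  // Equal bounds are ambiguous; this constructor picks the full set.
  static HalfOpenRange8 getNonEmpty(uint8_t L, uint8_t U) {
    if (L == U)
      return full();
    return {L, U};
  }

  bool isFull() const { return Lower == Upper && Lower == 0xFF; }
  bool isEmpty() const { return Lower == Upper && Lower == 0; }
};
```

Callers that previously wrote an explicit `if (L == U)` check before constructing a range can call `getNonEmpty(L, U)` directly.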
* llvm-undname: Fix stack overflow on invalid found by oss-fuzzNico Weber2019-04-211-1/+1
| | | | llvm-svn: 358852
* [ARM] Rewrite isLegalT2AddressImmediateDavid Green2019-04-211-29/+24
| | | | | | | | | | | | | | | | | | | | | This does two main things: first, it adds at least some basic addressing modes for i64 types, and second, it treats floats and doubles sensibly when there is no fpu. The floating point change can help codesize in some cases, especially with D60294. Most backends seem to not consider the exact VT in isLegalAddressingMode, instead switching on type size. That is now what this does when the target does not have an fpu (as the float data will be loaded using LDR's). i64's currently use the address range of an LDRD (even though they may be legalised and loaded with an LDR). This is at least better than marking them all as illegal addressing modes. I have not attempted to do much with vectors yet. That will need changing once MVE is added. Differential Revision: https://reviews.llvm.org/D60677 llvm-svn: 358845
* [X86] Add the rounding control operand to the printing for some scalar FMA instructions.Craig Topper2019-04-211-1/+1
| | | | | | llvm-svn: 358844
* [CachePruning] Simplify comparatorFangrui Song2019-04-211-9/+2
| | | | llvm-svn: 358843
* [X86] Don't form masked vfpclass instruction from and+vfpclass unless the fpclass only has a single use.Craig Topper2019-04-211-28/+36
| | | | | | llvm-svn: 358841
* [JITLink] Remove an overly strict error check in JITLink's eh-frame parser.Lang Hames2019-04-212-13/+4
| | | | | | | | The error check required FDEs to refer to the most recent CIE, but the eh-frame spec allows them to refer to any previously seen CIE. This patch removes the offending check. llvm-svn: 358840
* [JITLink] Factor basic common GOT and stub creation code into its own class.Lang Hames2019-04-212-72/+130
| | | | llvm-svn: 358838