summaryrefslogtreecommitdiffstats
path: root/llvm/lib
Commit message (Collapse)AuthorAgeFilesLines
...
* [X86] Mark vector STRICT_FP_ROUND as Legal instead of Custom.Craig Topper2019-11-201-3/+9
| | | | | | | | | The Custom handler doesn't do anything for these nodes anyway. SelectionDAGISel won't mutate them if they are Legal or Custom. X86 has custom code for mutating them due to missing isel patterns. When the isel patterns are added Legal will be the right answer. So go ahead a change it now since that's where we'll end up.
* Move widenable branch formation into makeGuardControlFlowExplicit helperPhilip Reames2019-11-203-17/+16
| | | | This is mostly NFC, but I removed the setting of the guard's calling convention onto the WC call. Why? Because it was untested, and was producing an ill defined output as the declaration's convention wasn't been changed leaving a mismatch which is UB.
* [AMDGPU] Keep consistent check of legal addressing mode.Michael Liao2019-11-202-14/+11
| | | | | | | | | | | | | | Summary: - Add test cases for GFX10, which has narrower offset range compared to GFX9. Reviewers: rampitec, arsenm Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70473
* [SelectionDAG][X86] Mutate strictFP nodes to non-strict in ↵Craig Topper2019-11-202-1/+8
| | | | | | | | | | DoInstructionSelection when the node is marked Expand rather than when it is not Legal. This allows operations that are marked Custom, but have some type combinations that are legal to get past this code. Add custom mutation code to X86's Select function for the nodes that don't have isel patterns yet.
* A fix of the bug introduced by previous lowering in asm patch.Xiangling Liao2019-11-201-2/+1
| | | | Differential Revision: https://reviews.llvm.org/D70243
* [AIX][XCOFF] Add support for generating assembly code for one-byte mergable ↵Xing Xue2019-11-202-3/+22
| | | | | | | | | | | | | | | | | | strings This patch adds support for generating assembly code for one-byte mergeable strings. Generating assembly code for multi-byte mergeable strings and the `XCOFF` object code for mergeable strings will be supported later. Reviewers: hubert.reinterpretcast, jasonliu, daltenty, sfertile, DiggerLin, Xiangling_L Reviewed by: daltenty Subscribers: wuzish, nemanjai, hiraditya, kbarton, jsji, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70310
* [AIX] Lowering jump table, constant pool and block address in asmXiangling Liao2019-11-204-7/+48
| | | | | | | | | | This patch lowering jump table, constant pool and block address in assembly. 1. On AIX, jump table index is always relative; 2. Put CPI and JTI into ReadOnlySection until we support unique data sections; 3. Create the temp symbol for block address symbol; 4. Update MIR testcases and add related assembly part; Differential Revision: https://reviews.llvm.org/D70243
* Recommit "[DWARF] Add an api to get "interpreted" location lists"Pavel Labath2019-11-204-1/+64
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This recommits 089c0f581492cd6e2a3d2927be3fbf60ea2d7e62, which was reverted due to failing tests on big endian machines. It includes a fix which I believe (I don't have BE machine) should fix this issue. The fix consists of correcting the invocation DWARFYAML::EmitDebugSections, which was missing one (default) function arguments, and so didn't actually force the little-endian mode. The original commit message follows. Summary: This patch adds DWARFDie::getLocations, which returns the location expressions for a given attribute (typically DW_AT_location). It handles both "inline" locations and references to the external location list sections (currently only of the DW_FORM_sec_offset type). It is implemented on top of DWARFUnit::findLoclistFromOffset, which is also added in this patch. I tried to make their signatures similar to the equivalent range list functionality. The actual location list interpretation logic is in DWARFLocationTable::visitAbsoluteLocationList. This part is not equivalent to the range list code, but this deviation is motivated by a desire to reuse the same location list parsing code within lldb. The functionality is tested via a c++ unit test of the DWARFDie API. Reviewers: dblaikie, JDevlieghere, SouraVX Subscribers: mgorny, hiraditya, cmtice, probinson, llvm-commits, aprantl Tags: #llvm Differential Revision: https://reviews.llvm.org/D70394
* [AMDGPU][GFX10] Disabled v_movrel*[sdwa|dpp] opcodes in codegenDmitry Preobrazhensky2019-11-202-0/+27
| | | | | | | | These opcodes use indirect register addressing so they need special handling by codegen (currently missing). Reviewers: vpykhtin, arsenm, rampitec Differential Revision: https://reviews.llvm.org/D70400
* [mips] Define mem_simm# operands using tblgen `foreach` loop. NFCSimon Atanasyan2019-11-201-29/+5
|
* [SelectionDAG] Combine U{ADD,SUB}O diamonds into {ADD,SUB}CARRYDavid Zarzycki2019-11-201-0/+99
| | | | | | | | | | | | | | | | | Summary: Convert (uaddo (uaddo x, y), carryIn) into addcarry x, y, carryIn if-and-only-if the carry flags of the first two uaddo are merged via OR or XOR. Work remaining: match ADD, etc. Reviewers: craig.topper, RKSimon, spatel, niravd, jonpa, uweigand, deadalnix, nikic, lebedev.ri, dmgreen, chfast Reviewed By: lebedev.ri Subscribers: chfast, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70079
* Revert "[DWARF] Add an api to get "interpreted" location lists"Pavel Labath2019-11-204-64/+1
| | | | | | | The test fails on big endian machines. This reverts commit 089c0f581492cd6e2a3d2927be3fbf60ea2d7e62 and the subsequent attempt to fix in 82dc32e2d456c75d08bc9ffe97def409ee5a03cd.
* [ARM][MVE] Select vqabsAnna Welker2019-11-201-0/+35
| | | | | | | | Adds a pattern to ARMInstrMVE.td to use a VQABS instruction if an equivalent multi-instruction construct is found. Differential revision: https://reviews.llvm.org/D70181
* [mips] Put conditions when we need to expand memory operand into a separate ↵Simon Atanasyan2019-11-201-29/+36
| | | | | | | | function. NFC `expandMemInst` expects instruction with 3 or 4 operands and the last operand requires expanding. It's redundant to scan all operands in a loop. We can check the last operands.
* [mips] Make MipsAsmParser::isEvaluated static function. NFCSimon Atanasyan2019-11-201-21/+20
|
* [AMDGPU][DPP] Corrected DPP combinerDmitry Preobrazhensky2019-11-201-6/+9
| | | | | | | | Added a check to make sure that the selected dpp opcode is supported by target. Reviewers: vpykhtin, arsenm, rampitec Differential Revision: https://reviews.llvm.org/D70402
* [DWARF] Add an api to get "interpreted" location listsPavel Labath2019-11-204-1/+64
| | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This patch adds DWARFDie::getLocations, which returns the location expressions for a given attribute (typically DW_AT_location). It handles both "inline" locations and references to the external location list sections (currently only of the DW_FORM_sec_offset type). It is implemented on top of DWARFUnit::findLoclistFromOffset, which is also added in this patch. I tried to make their signatures similar to the equivalent range list functionality. The actual location list interpretation logic is in DWARFLocationTable::visitAbsoluteLocationList. This part is not equivalent to the range list code, but this deviation is motivated by a desire to reuse the same location list parsing code within lldb. The functionality is tested via a c++ unit test of the DWARFDie API. Reviewers: dblaikie, JDevlieghere, SouraVX Subscribers: mgorny, hiraditya, cmtice, probinson, llvm-commits, aprantl Tags: #llvm Differential Revision: https://reviews.llvm.org/D70394
* [DebugInfo] Remove the DIFlagArgumentNotModified debug info flagDjordje Todorovic2019-11-201-1/+1
| | | | | | | Due to changes in D68206, we remove the DIFlagArgumentNotModified and its usage. Differential Revision: https://reviews.llvm.org/D68207
* Move floating point related entities to namespace levelSerge Pavlov2019-11-204-67/+82
| | | | | | | | | | | | | | | | | | This is recommit of commit e6584b2b7b2d, which was reverted in 30e7ee3c4bac together with af57dbf12e54. Original message is below. Enumerations that describe rounding mode and exception behavior were defined inside ConstrainedFPIntrinsic. It makes sense to use the same definitions to represent the same properties in other cases, not only in constrained intrinsics. It was however inconvenient as required to include constrained intrinsics definitions even if they were not needed. Also using long scope prefix reduced readability. This change moves these definitioins to the namespace llvm::fp. No functional changes. Differential Revision: https://reviews.llvm.org/D69552
* [AMDGPU] add support for hostcall buffer pointer as hidden kernel argumentSameer Sahasrabuddhe2019-11-204-2/+23
| | | | | | | | | | | Hostcall is a service that allows a kernel to submit requests to the host using shared buffers, and block until a response is received. This will eventually replace the shared buffer currently used for printf, and repurposes the same hidden kernel argument. This change introduces a new ValueKind in the HSA metadata to represent the hostcall buffer. Differential Revision: https://reviews.llvm.org/D70038
* [ExecutionEngine] Add a missing break to avoid warningsMartin Storsjö2019-11-201-0/+1
| | | | This fixes buildbot errors since dc3ee330891c2.
* ExecutionEngine: add preliminary support for COFF ARM64Adam Kallai2019-11-203-1/+371
| | | | Differential Revision: https://reviews.llvm.org/D69434
* [FEnv] File with properties of constrained intrinsicsSerge Pavlov2019-11-207-460/+99
| | | | | | | | | | | | | | | | | | Summary In several places we need to enumerate all constrained intrinsics or IR nodes that should be represented by them. It is easy to miss some of the cases. To make working with these intrinsics more convenient and robust, this change introduces file containing definitions of all constrained intrinsics and some of their properties. This file can be included to generate constrained intrinsics processing code. Reviewers: kpn, andrew.w.kaylor, cameron.mcinally, uweigand Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69887
* AMDGPU/GlobalISel: Legalize FDIV64Austin Kerbow2019-11-192-0/+87
| | | | | | | | | | | | Reviewers: arsenm Reviewed By: arsenm Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, rovka, dstuttard, tpr, t-tye, hiraditya, Petar.Avramovic, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70403
* [musttail] Don't forward AL on Win64Reid Kleckner2019-11-191-2/+2
| | | | | | | | | | | | | | | | AL is only used for varargs on SysV platforms. Don't forward it on Windows. This allows control flow guard to set up an extra hidden parameter in RAX, as described in PR44049. This also has the effect of freeing up RAX for use in virtual member pointer thunks, which may also be a nice little code size improvement on Win64. Fixes PR44049 Reviewers: ajpaverd, efriedma, hans Differential Revision: https://reviews.llvm.org/D70413
* [LTO][Legacy] Add API for passing LLVM options separatelyFrancis Visoiu Mistrih2019-11-191-6/+3
| | | | | | | | | | | | In order to correctly pass options to LLVM, including options containing spaces which are used as delimiters for multiple options in lto_codegen_debug_options, add a new API: lto_codegen_debug_options_array. Unfortunately, tools/lto has no testing infrastructure yet, so there are no tests associated with this patch. Differential Revision: https://reviews.llvm.org/D70463
* [LegalizeDAG][X86] Enable STRICT_FP_TO_SINT/UINT to be promotedCraig Topper2019-11-192-20/+33
| | | | Differential Revision: https://reviews.llvm.org/D70220
* [X86] Add custom type legalization and lowering for scalar ↵Craig Topper2019-11-192-30/+119
| | | | | | | | | | | | | | | STRICT_FP_TO_SINT/UINT This is a first pass at Custom lowering for these operations. I also updated some of the vector code where it was obviously easy and straightforward. More work needed in follow up. This enables these operations to be handled with X87 where special rounding control adjustments are needed to perform a truncate. Still need to fix Promotion in the target independent code in LegalizeDAG. llrint/llround split into separate test file because we can't make a strict libcall properly yet either and we need to do that when i64 isn't a legal type. This does not include any isel support. So we still rely on the mutation in SelectionDAGIsel to remove the strict from this stuff later. Except for the X87 stuff which goes through custom nodes that already had chains. Differential Revision: https://reviews.llvm.org/D70214
* [ARC] Add InitializePasses header to fix ARC build.Pete Couperus2019-11-192-0/+2
|
* [GuardWidening] Remove WidenFrequentBranches transformPhilip Reames2019-11-191-68/+6
| | | | This code has never been enabled. While it is tested, it's complicating some refactoring. If we decide to re-implement this, doing it in SimplifyCFG would probably make more sense anyways.
* [NFC] Factor out utilities for manipulating widenable branchesPhilip Reames2019-11-194-13/+37
| | | | | | With the widenable condition construct, we have the ability to reason about branches which can be 'widened' (i.e. made to fail more often). We've got a couple o transforms which leverage this. This patch just cleans up the API a bit. This is prep work for generalizing our definition of a widenable branch slightly. At the moment "br i1 (and A, wc()), ..." is considered widenable, but oddly, neither "br i1 (and wc(), B), ..." or "br i1 wc(), ..." is. That clearly needs addressed, so first, let's centralize the code in one place.
* [LoopPred] Generalize profitability check to handle unswitch outputPhilip Reames2019-11-191-1/+12
| | | | Unswitch (and other loop transforms) like to generate loop exit blocks with unconditional successors, and phi nodes (LCSSA, or simple multiple exiting blocks sharing an exit). Generalize the "likely very rare exit" check slightly to handle this form.
* [ValueTracking] Add a basic version of isKnownNonInfinity and use it to ↵Benjamin Kramer2019-11-191-4/+69
| | | | detect more NoNaNs
* The patch is the compiler error specific on the compile error on CMVCdiggerlin2019-11-191-3/+5
| | | | | | | | | | | | | | | | | | | | | | | SUMMARY: CMVC has a compiler error on the const uint64_t OffsetToRaw = is64Bit() ? toSection64(Sec)->FileOffsetToRawData : toSection32(Sec)->FileOffsetToRawData; while gcc compiler do not have the problem. I have to change the code to uint64_t OffsetToRaw; if (is64Bit()) OffsetToRaw = toSection64(Sec)->FileOffsetToRawData; else OffsetToRaw = toSection32(Sec)->FileOffsetToRawData; Reviewers: Sean Fertile Subscribers: rupprecht, seiyai,hiraditya Differential Revision: https://reviews.llvm.org/D70255
* [DebugInfo] Describe size of spilled values in call site paramsVedant Kumar2019-11-191-1/+5
| | | | | | | | A call site parameter description of a memory operand needs to unambiguously convey the size of the operand to prevent incorrect entry value evaluation. Thanks for David Stenberg for pointing this issue out!
* llvm/ObjCARC: Eliminate inlined AutoreleaseRV callsDuncan P. N. Exon Smith2019-11-191-71/+145
| | | | | | | | | | | | | | | | | | | | | | | | | Pair up inlined AutoreleaseRV calls with their matching RetainRV or ClaimRV. - RetainRV cancels out AutoreleaseRV. Delete both instructions. - ClaimRV is a peephole for RetainRV+Release. Delete AutoreleaseRV and replace ClaimRV with Release. This avoids problems where more aggressive inlining triggers memory regressions. This patch is happy to skip over non-callable instructions and non-ARC intrinsics looking for the pair. It is likely sound to also skip over opaque function calls, but that's harder to reason about, and it's not relevant to the goal here: if there's an opaque function call splitting up a pair, it's very unlikely that a handshake would have happened dynamically without inlining. Note that this patch also subsumes the previous logic that looked backwards from ReleaseRV. https://reviews.llvm.org/D70370 rdar://problem/46509586
* [SLP] fix miscompile on min/max reductions with extra uses (PR43948) (2nd try)Sanjay Patel2019-11-191-10/+25
| | | | | | | | | | | | | | | | | | | | | | The 1st attempt was reverted because it revealed an existing bug where we could produce invalid IR (use of value before definition). That should be fixed with: rG39de82ecc9c2 The bug manifests as replacing a reduction operand with an undef value. The problem appears to be limited to cases where a min/max reduction has extra uses of the compare operand to the select. In the general case, we are tracking "ExternallyUsedValues" and an "IgnoreList" of the reduction operations, but those may not apply to the final compare+select in a min/max reduction. For that, we use replaceAllUsesWith (RAUW) to ensure that the new vectorized reduction values are transferred to all subsequent users. Differential Revision: https://reviews.llvm.org/D70148
* MTE: add more unchecked instructions.Evgenii Stepanov2019-11-191-3/+29
| | | | | | | | | | | | | | Summary: In particular, 1- and 2-byte loads and stores ignore the pointer tag when using SP as the base register. Reviewers: pcc, ostannard Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70341
* [ARM] MVE interleaving load and stores.David Green2019-11-193-41/+107
| | | | | | | | | | | | | | | | | | | | Now that we have the intrinsics, we can add VLD2/4 and VST2/4 lowering for MVE. This works the same way as Neon, recognising the load/shuffles combination and converting them into intrinsics in a pre-isel pass, which just calls getMaxSupportedInterleaveFactor, lowerInterleavedLoad and lowerInterleavedStore. The main difference to Neon is that we do not have a VLD3 instruction. Otherwise most of the code works very similarly, with just some minor differences in the form of the intrinsics to work around. VLD3 is disabled by making isLegalInterleavedAccessType return false for those cases. We may need some other future adjustments, such as VLD4 take up half the available registers so should maybe cost more. This patch should get the basics in though. Differential Revision: https://reviews.llvm.org/D69392
* implement printing out raw section data of xcoff objectfile for llvm-objdumpdiggerlin2019-11-191-8/+17
| | | | | | | | | | | SUMMARY: implement printing out raw section data of xcoff objectfile for llvm-objdump and option -D --disassemble-all option for llvm-objdump Reviewers: Sean Fertile Subscribers: rupprecht, seiyai,hiraditya Differential Revision: https://reviews.llvm.org/D70255
* Work on cleaning up denormal mode handlingMatt Arsenault2019-11-192-3/+17
| | | | | | | | | | | | | | | | | | | | | | Cleanup handling of the denormal-fp-math attribute. Consolidate places checking the allowed names in one place. This is in preparation for introducing FP type specific variants of the denormal-fp-mode attribute. AMDGPU will switch to using this in place of the current hacky use of subtarget features for the denormal mode. Introduce a new header for dealing with FP modes. The constrained intrinsic classes define related enums that should also be moved into this header for uses in other contexts. The verifier could use a check to make sure the denorm-fp-mode attribute is sane, but there currently isn't one. Currently, DAGCombiner incorrectly asssumes non-IEEE behavior by default in the one current user. Clang must be taught to start emitting this attribute by default to avoid regressions when this is switched to assume ieee behavior if the attribute isn't present.
* [AIX][XCOFF] Write Function descriptors and TOC base to data sectionjasonliu2019-11-191-6/+13
| | | | | | This patch implements writing function descriptors and TOC base into data section, and also add function descriptors(both csect and label) and TOC base symbols to the symbol table.
* [SLP] fix insertion point for min/max reductionSanjay Patel2019-11-191-2/+17
| | | | | | As discussed in D70148 (and caused a revert of the original commit): if we insert at the select, then we can produce invalid IR because the replacement for the compare may have uses before the select.
* [ARM,MVE] Add intrinsics for scalar shifts.Simon Tatham2019-11-192-8/+44
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This fills in the small family of MVE intrinsics that have nothing to do with vectors: they implement bit-shift operations on 32- or 64-bit values held in one or two general-purpose registers. Most of these shift operations saturate if shifting left, and round to nearest if shifting right, although LSLL and ASRL behave like ordinary shifts. When these instructions take a variable shift count in a register, they pay attention to its sign, so that (for example) LSLL or UQRSHLL will shift left if given a positive number but right if given a negative one. That makes even LSLL and ASRL different enough from standard LLVM IR shift semantics that I couldn't see any better alternative than to simply model the whole family as a set of MVE-specific IR intrinsics. (The //immediate// forms of LSLL and ASRL, on the other hand, do behave exactly like a standard IR shift of a 64-bit value. In fact, those forms don't have ACLE intrinsics defined at all, because you can just write an ordinary C shift operation if you want one of those.) The 64-bit shifts have to be instruction-selected in C++, because they deliver two output values. But the 32-bit ones are simple enough that I could write a DAG isel pattern directly into each Instruction record. Reviewers: ostannard, MarkMurrayARM, dmgreen Reviewed By: dmgreen Subscribers: kristof.beyls, hiraditya, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D70319
* AMDGPU: Refactor treatment of denormal modeMatt Arsenault2019-11-1918-78/+141
| | | | | | | | | | | Start moving towards treating this as a property of the calling convention, and not the subtarget. The default denormal mode should not be part of the subtarget, and be moved into a separate function attribute. This patch is still NFC. The denormal mode remains as a subtarget feature for now, but make the necessary changes to switch to using an attribute.
* AMDGPU: Be explicit about denormal mode in MIR testsMatt Arsenault2019-11-191-10/+16
| | | | | | | Start checking the machine function in GlobalISel instead of the target directly. This temporarily breaks fcanonicalize selection in GlobalISel.
* DAG: Add function context to isFMAFasterThanFMulAndFAddMatt Arsenault2019-11-1918-26/+56
| | | | | | | | AMDGPU needs to know the FP mode for the function to answer this correctly when this is removed from the subtarget. AArch64 had to make this more complicated by using this from an IR hook, so add an IR typed overload.
* [AMDGPU] Tune inlining parameters for AMDGPU target (part 2)dfukalov2019-11-193-3/+3
| | | | | | | | | | | | | | | | | Summary: Most of IR instructions got better code size estimations after commit 47a5c36b. So default parameters values should be updated to improve inlining and unrolling for the target. Reviewers: rampitec, arsenm Reviewed By: rampitec Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, zzheng, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70391
* [ThinLTO] Simplify code. NFCevgeny2019-11-191-5/+4
|
* [X86][SSE] Remove XFormVExtractWithShuffleIntoLoad to prevent legalization ↵Simon Pilgrim2019-11-191-122/+2
| | | | | | | | infinite loops (PR43971) As detailed in PR43971/D70267, the use of XFormVExtractWithShuffleIntoLoad causes issues where we end up in infinite loops of extract(targetshuffle(vecload)) -> extract(shuffle(vecload)) -> extract(vecload) -> extract(targetshuffle(vecload)), there are just too many legalization checks at every stage that we can't guarantee that extract(shuffle(vecload)) -> scalarload can occur. At the moment we see a number of minor regressions as we don't fold extract(shuffle(vecload)) -> scalarload before legal ops, these can be addressed in future patches and extension of X86ISelLowering's combineExtractWithShuffle.
OpenPOWER on IntegriCloud