summaryrefslogtreecommitdiffstats
path: root/llvm/lib/CodeGen
Commit message (Collapse)AuthorAgeFilesLines
...
* Revert rr340111 "[GISel]: Add Legalization/lowering code for bit counting ↵Reid Kleckner2018-08-201-121/+1
| | | | | | | | operations" It causes LegalizerHelperTest.LowerBitCountingCTTZ1 to fail. llvm-svn: 340186
* [SelectionDAG] Reuse the Op's VT. NFCI.Simon Pilgrim2018-08-201-2/+2
| | | | llvm-svn: 340173
* [SelectionDAG] Add partial sign-bit support to ComputeNumSignBits for ↵Simon Pilgrim2018-08-201-2/+16
| | | | | | | | | | BITCAST nodes Only adds support to the existing 'large element' scalar/vector to 'small element' vector bitcasts. Handle the case where the sign bit extends to only part of the small elements. llvm-svn: 340169
* [SelectionDAG] Add basic demanded elements support to ComputeNumSignBits for ↵Simon Pilgrim2018-08-191-1/+8
| | | | | | | | | | BITCAST nodes Only adds support to the existing 'large element' scalar/vector to 'small element' vector bitcasts. The next step would be to support cases where the large elements aren't all sign bits, and determine the small element equivalent based on the demanded elements. llvm-svn: 340143
* [DebugInfo] In FastISel, convert llvm.dbg.label to DBG_LABEL MI.Hsiangkai Wang2018-08-181-0/+12
| | | | | | | | Convert llvm.dbg.label(!label_metadata) to DBG_LABEL !label_metadata. Differential Revision: https://reviews.llvm.org/D50622 llvm-svn: 340122
* [DAGCombiner] Allow divide by constant optimization on opaque constants.Craig Topper2018-08-181-2/+2
| | | | | | | | | | | | | | | | | Summary: I believe this restores the behavior we had before r339147. Fixes PR38622. Reviewers: RKSimon, chandlerc, spatel Reviewed By: chandlerc Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D50936 llvm-svn: 340120
* [GISel]: Add Legalization/lowering code for bit counting operationsAditya Nandakumar2018-08-181-1/+121
| | | | | | | | | | https://reviews.llvm.org/D48847#inline-448257 Ported legalization expansions for CTLZ/CTTZ from DAG to GISel. Reviewed by rtereshin. llvm-svn: 340111
* DAG: Fix isKnownNeverNaN for basic non-sNaN casesMatt Arsenault2018-08-171-12/+7
| | | | | | | fadd/fsub/fmul need to worry about infinities as well as fdiv. llvm-svn: 340085
* [DebugInfo] Generate DWARF debug information for labels. (Fix leak problems)Hsiangkai Wang2018-08-1716-149/+407
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There are two forms for label debug information in DWARF format. 1. Labels in a non-inlined function: DW_TAG_label DW_AT_name DW_AT_decl_file DW_AT_decl_line DW_AT_low_pc 2. Labels in an inlined function: DW_TAG_label DW_AT_abstract_origin DW_AT_low_pc We will collect label information from DBG_LABEL. Before every DBG_LABEL, we will generate a temporary symbol to denote the location of the label. The symbol could be used to get DW_AT_low_pc afterwards. So, we create a mapping between 'inlined label' and DBG_LABEL MachineInstr in DebugHandlerBase. The DBG_LABEL in the mapping is used to query the symbol before it. The AbstractLabels in DwarfCompileUnit is used to process labels in inlined functions. We also keep a mapping between scope and labels in DwarfFile to help to generate correct tree structure of DIEs. It also generates label debug information under global isel. Differential Revision: https://reviews.llvm.org/D45556 llvm-svn: 340039
* [AtomicExpandPass] Widen partword atomicrmw or/xor/and before tryExpandAtomicRMWAlex Bradbury2018-08-171-4/+48
| | | | | | | | | | | | | | | | | | | This patch performs a widening transformation of bitwise atomicrmw {or,xor,and} and applies it prior to tryExpandAtomicRMW. This operates similarly to convertCmpXchgToIntegerType. For these operations, the i8/i16 atomicrmw can be implemented in terms of the 32-bit atomicrmw by appropriately manipulating the operands. There is no functional change for the handling of partword or/xor, but the transformation for partword 'and' is new. The advantage of performing this transformation early is that the same code-path can be used regardless of the approach used to expand the atomicrmw (AtomicExpansionKind). i.e. the same logic is used for AtomicExpansionKind::CmpXchg and can also be used by the intrinsic-based expansion in D47882. Differential Revision: https://reviews.llvm.org/D48129 llvm-svn: 340027
* [DAGCombiner] extractShiftForRotate - fix out of range shift issueSimon Pilgrim2018-08-171-2/+2
| | | | | | | | | Don't just check for negative shift amounts. Fixes OSS Fuzz #9935 https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=9935 llvm-svn: 340015
* [DAGCombine] Improve (sra (sra x, c1), c2) -> (sra x, (add c1, c2)) foldingSimon Pilgrim2018-08-171-17/+16
| | | | | | | | Add support for cases where only some c1+c2 results exceed the max bitshift, clamping accordingly. Differential Revision: https://reviews.llvm.org/D35722 llvm-svn: 340010
* Fix "control reaches end of non-void function" -Wreturn-type warning. NFCI.Simon Pilgrim2018-08-171-0/+1
| | | | llvm-svn: 340006
* [MISC]Fix wrong usage of std::equal()Chen Zheng2018-08-171-1/+1
| | | | | | Differential Revision: https://reviews.llvm.org/D49958 llvm-svn: 340000
* Revert r339977: [GISel]: Add Opcodes for a few LLVM IntrinsicsChandler Carruth2018-08-171-10/+0
| | | | | | This is breaking ~all the bots. llvm-svn: 339982
* [GISel]: Add Opcodes for a few LLVM IntrinsicsAditya Nandakumar2018-08-171-0/+10
| | | | | | | | | | | https://reviews.llvm.org/D50401 Add opcodes for llvm.intrinsic.trunc, round, and update the IRTranslator for the same. Reviewed by: dsanders. llvm-svn: 339977
* DebugInfo: Remove command line (& target-based) disabling of pubnames in ↵David Blaikie2018-08-163-14/+2
| | | | | | | | | favor of metadata Now that Clang disables NVPTX pubnames via metadata there's no need for this fallback to target detection in the backend. llvm-svn: 339970
* [x86/MIR] Implement support for pre- and post-instruction symbols, asChandler Carruth2018-08-166-29/+210
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | well as MIR parsing support for `MCSymbol` `MachineOperand`s. The only real way to test pre- and post-instruction symbol support is to use them in operands, so I ended up implementing that within the patch as well. I can split out the operand support if folks really want but it doesn't really seem worth it. The functional implementation of pre- and post-instruction symbols is now *completely trivial*. Two tiny bits of code in the (misnamed) AsmPrinter. It should be completely target independent as well. We emit these exactly the same way as we emit basic block labels. Most of the code here is to give full dumping, MIR printing, and MIR parsing support so that we can write useful tests. The MIR parsing of MC symbol operands still isn't 100%, as it forces the symbols to be non-temporary and non-local symbols with names. However, those names often can encode most (if not all) of the special semantics desired, and unnamed symbols seem especially annoying to serialize and de-serialize. While this isn't perfect or full support, it seems plenty to write tests that exercise usage of these kinds of operands. The MIR support for pre-and post-instruction symbols was quite straightforward. I chose to print them out in an as-if-operand syntax similar to debug locations as this seemed the cleanest way and let me use nice introducer tokens rather than inventing more magic punctuation like we use for memoperands. However, supporting MIR-based parsing of these symbols caused me to change the design of the symbol support to allow setting arbitrary symbols. Without this, I don't see any reasonable way to test things with MIR. Differential Revision: https://reviews.llvm.org/D50833 llvm-svn: 339962
* [DAGCombiner] Don't reassociate operations that have the vector reduction ↵Craig Topper2018-08-161-9/+13
| | | | | | | | | | | | | | flag set. When nodes are reassociated the vector-reduction flag gets lost. The test case is here is what would happen if you had a sum of absolute differences loop that started with a non-zero but contant sum and that loop was unrolled. The vectorizer will generate a constant vector for the initial value. And DAGCombiner reassociate tries to move it down the addition tree erasing the vector-reduction flag. Interestingly this moves constants the opposite direction of the reassociate IR pass. I've chosen to just punt on the reassociate, but I suppose we could maybe preserve the flag if both nodes have it set. Differential Revision: https://reviews.llvm.org/D50827 llvm-svn: 339946
* [MI] Change the array of `MachineMemOperand` pointers to beChandler Carruth2018-08-1612-162/+192
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | a generically extensible collection of extra info attached to a `MachineInstr`. The primary change here is cleaning up the APIs used for setting and manipulating the `MachineMemOperand` pointer arrays so chat we can change how they are allocated. Then we introduce an extra info object that using the trailing object pattern to attach some number of MMOs but also other extra info. The design of this is specifically so that this extra info has a fixed necessary cost (the header tracking what extra info is included) and everything else can be tail allocated. This pattern works especially well with a `BumpPtrAllocator` which we use here. I've also added the basic scaffolding for putting interesting pointers into this, namely pre- and post-instruction symbols. These aren't used anywhere yet, they're just there to ensure I've actually gotten the data structure types correct. I'll flesh out support for these in a subsequent patch (MIR dumping, parsing, the works). Finally, I've included an optimization where we store any single pointer inline in the `MachineInstr` to avoid the allocation overhead. This is expected to be the overwhelmingly most common case and so should avoid any memory usage growth due to slightly less clever / dense allocation when dealing with >1 MMO. This did require several ergonomic improvements to the `PointerSumType` to reasonably support the various usage models. This also has a side effect of freeing up 8 bits within the `MachineInstr` which could be repurposed for something else. The suggested direction here came largely from Hal Finkel. I hope it was worth it. ;] It does hopefully clear a path for subsequent extensions w/o nearly as much leg work. Lots of thanks to Reid and Justin for careful reviews and ideas about how to do all of this. Differential Revision: https://reviews.llvm.org/D50701 llvm-svn: 339940
* DebugInfo: Add metadata support for disabling DWARF pub sectionsDavid Blaikie2018-08-165-38/+68
| | | | | | | | | | | | | | | | | | | | | | | In cases where the debugger load time is a worthwhile tradeoff (or less costly - such as loading from a DWP instead of a variety of DWOs (possibly over a high-latency/distributed filesystem)) against object file size, it can be reasonable to disable pubnames and corresponding gdb-index creation in the linker. A backend-flag version of this was implemented for NVPTX in D44385/r327994 - which was fine for NVPTX which wouldn't mix-and-match CUs. Now that it's going to be a user-facing option (likely powered by "-gno-pubnames", the same as GCC) it should be encoded in the DICompileUnit so it can vary per-CU. After this, likely the NVPTX support should be migrated to the metadata & the previous flag implementation should be removed. Reviewers: aprantl Differential Revision: https://reviews.llvm.org/D50213 llvm-svn: 339939
* [MachineVerifier] Check if predecessor is jointly dominated by undefsKrzysztof Parzyszek2018-08-161-1/+11
| | | | | | | | Each use of a value should be jointly dominated by the union of defs and undefs. It can happen that it will only be jointly dominated by undefs, and that is still legal. Make sure that the verifier is aware of that. llvm-svn: 339924
* [SelectionDAG] Improve the legalisation lowering of UMULO.Eli Friedman2018-08-161-17/+48
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There is no way in the universe, that doing a full-width division in software will be faster than doing overflowing multiplication in software in the first place, especially given that this same full-width multiplication needs to be done anyway. This patch replaces the previous implementation with a direct lowering into an overflowing multiplication algorithm based on half-width operations. Correctness of the algorithm was verified by exhaustively checking the output of this algorithm for overflowing multiplication of 16 bit integers against an obviously correct widening multiplication. Baring any oversights introduced by porting the algorithm to DAG, confidence in correctness of this algorithm is extremely high. Following table shows the change in both t = runtime and s = space. The change is expressed as a multiplier of original, so anything under 1 is “better” and anything above 1 is worse. +-------+-----------+-----------+-------------+-------------+ | Arch | u64*u64 t | u64*u64 s | u128*u128 t | u128*u128 s | +-------+-----------+-----------+-------------+-------------+ | X64 | - | - | ~0.5 | ~0.64 | | i686 | ~0.5 | ~0.6666 | ~0.05 | ~0.9 | | armv7 | - | ~0.75 | - | ~1.4 | +-------+-----------+-----------+-------------+-------------+ Performance numbers have been collected by running overflowing multiplication in a loop under `perf` on two x86_64 (one Intel Haswell, other AMD Ryzen) based machines. Size numbers have been collected by looking at the size of function containing an overflowing multiply in a loop. All in all, it can be seen that both performance and size has improved except in the case of armv7 where code size has regressed for 128-bit multiply. u128*u128 overflowing multiply on 32-bit platforms seem to benefit from this change a lot, taking only 5% of the time compared to original algorithm to calculate the same thing. The final benefit of this change is that LLVM is now capable of lowering the overflowing unsigned multiply for integers of any bit-width as long as the target is capable of lowering regular multiplication for the same bit-width. Previously, 128-bit overflowing multiply was the widest possible. Patch by Simonas Kazlauskas! Differential Revision: https://reviews.llvm.org/D50310 llvm-svn: 339922
* [RegisterCoalescer] Shrink to uses if needed after removeCopyByCommutingDefKrzysztof Parzyszek2018-08-161-24/+54
| | | | llvm-svn: 339912
* [TargetLowering] Add support for non-uniform vectors to BuildSDIVSimon Pilgrim2018-08-161-10/+24
| | | | | | | | | | This patch refactors the existing TargetLowering::BuildSDIV base implementation to support non-uniform constant vector denominators. This is the last patch necessary to close PR36545 Differential Revision: https://reviews.llvm.org/D50765 llvm-svn: 339908
* [TargetLowering] Refactor BuildSDIV in preparation for D50765. NFCI.Simon Pilgrim2018-08-161-24/+36
| | | | | | Pull out magic factor calculators into a helper function, use 0/+1/-1 multiplication factor to (optionally) add/sub the numerator. llvm-svn: 339898
* [CodeGenPrepare] Add BothExtension type to PromotedInstsGuozhi Wei2018-08-151-7/+49
| | | | | | | | | | | | This patch fixes PR38125. Instruction extension types are recorded in PromotedInsts, it can be used later in function canGetThrough. If an instruction has two users with different extension types, it will be inserted into PromotedInsts two times in function promoteOperandForOther. The second one overwrites the first one, and the final extension type is wrong, later causes problem in canGetThrough. This patch changes the simple bool extension type to 2-bit enum type, add a BothExtension type in addition to zero/sign extension. When an user sees BothExtension for an instruction, it actually knows nothing about how that instruction is extended. Differential Revision: https://reviews.llvm.org/D49512 llvm-svn: 339822
* DAG: Use getObjectOffset helperMatt Arsenault2018-08-151-4/+1
| | | | llvm-svn: 339813
* DAG: Try to custom lower when promoting float operandsMatt Arsenault2018-08-151-0/+5
| | | | | | | For some reason this wasn't done for floats like integers. llvm-svn: 339811
* [RegisterCoalescer] Ensure that both registers have subranges if one doesKrzysztof Parzyszek2018-08-151-1/+4
| | | | llvm-svn: 339792
* [RegisterCoalescer] Reset VNInfo def when copying segments overKrzysztof Parzyszek2018-08-151-2/+6
| | | | llvm-svn: 339788
* [RegAlloc] Check that subreg liveness tracking applies to given virtual regKrzysztof Parzyszek2018-08-151-1/+1
| | | | | | | | Subregister liveness applies selectively to register classes with certain properties. Make sure that when it's enabled, it applies to a given virtual register (in virtual register rewriter). llvm-svn: 339784
* [TargetLowering] Minor cleanup of TargetLowering::BuildSDIV. NFCI.Simon Pilgrim2018-08-151-21/+20
| | | | | | Pull out some types to match layout in TargetLowering::BuildUDIV. Early step towards adding non-uniform vector support. llvm-svn: 339763
* [TargetLowering] Minor refactor to TargetLowering::BuildUDIV to merge ↵Simon Pilgrim2018-08-151-41/+31
| | | | | | | | scalar/vector magic value collection. NFCI. Use the same ISD::matchUnaryPredicate pattern that was used in D50392. llvm-svn: 339758
* [DagCombiner] Don't bother adding to the work list if TLI.BuildSDIVPow2 ↵Simon Pilgrim2018-08-151-4/+6
| | | | | | | | failed. NFCI. Matches the code in BuildSDIV/BuildUDIV llvm-svn: 339757
* [TargetLowering] Add support for non-uniform vectors to BuildExactSDIVSimon Pilgrim2018-08-151-12/+24
| | | | | | | | This patch refactors the existing BuildExactSDIV implementation to support non-uniform constant vector denominators. Differential Revision: https://reviews.llvm.org/D50392 llvm-svn: 339756
* [SDAG] Remove the reliance on MI's allocation strategy forChandler Carruth2018-08-146-40/+45
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | `MachineMemOperand` pointers attached to `MachineSDNodes` and instead have the `SelectionDAG` fully manage the memory for this array. Prior to this change, the memory management was deeply confusing here -- The way the MI was built relied on the `SelectionDAG` allocating memory for these arrays of pointers using the `MachineFunction`'s allocator so that the raw pointer to the array could be blindly copied into an eventual `MachineInstr`. This creates a hard coupling between how `MachineInstr`s allocate their array of `MachineMemOperand` pointers and how the `MachineSDNode` does. This change is motivated in large part by a change I am making to how `MachineFunction` allocates these pointers, but it seems like a layering improvement as well. This would run the risk of increasing allocations overall, but I've implemented an optimization that should avoid that by storing a single `MachineMemOperand` pointer directly instead of allocating anything. This is expected to be a net win because the vast majority of uses of these only need a single pointer. As a side-effect, this makes the API for updating a `MachineSDNode` and a `MachineInstr` reasonably different which seems nice to avoid unexpected coupling of these two layers. We can map between them, but we shouldn't be *surprised* at where that occurs. =] Differential Revision: https://reviews.llvm.org/D50680 llvm-svn: 339740
* [FPEnv] Scalarize StrictFP vector operationsCameron McInally2018-08-142-0/+50
| | | | | | | | Add a helper function to scalarize constrained FP operations as needed. Differential Revision: https://reviews.llvm.org/D50720 llvm-svn: 339735
* [ARM] Make PerformSHLSimplify add nodes to the DAG worklist correctly.Eli Friedman2018-08-141-2/+3
| | | | | | | | | | | | | | | | | | | | | Intentionally excluding nodes from the DAGCombine worklist is likely to lead to weird optimizations and infinite loops, so it's generally a bad idea. To avoid the infinite loops, fix DAGCombine to use the isDesirableToCommuteWithShift target hook before performing the transforms in question, and implement the target hook in the ARM backend disable the transforms in question. Fixes https://bugs.llvm.org/show_bug.cgi?id=38530 . (I don't have a reduced testcase for that bug. But we should have sufficient test coverage for PerformSHLSimplify given that we're not playing weird tricks with the worklist. I can try to bugpoint it if necessary, though.) Differential Revision: https://reviews.llvm.org/D50667 llvm-svn: 339734
* [DebugInfoMetadata] Added DIFlags interface in DIBasicType.Adrian Prantl2018-08-141-0/+5
| | | | | | | | | | | Flags in DIBasicType will be used to pass attributes used in DW_TAG_base_type, such as DW_AT_endianity. Patch by Chirag Patel! Differential Revision: https://reviews.llvm.org/D49610 llvm-svn: 339714
* Revert "[DebugInfo] Generate DWARF debug information for labels. (Fix leak ↵Bruno Cardoso Lopes2018-08-1416-407/+149
| | | | | | | | | | | | problems)" This reverts commit cb8c5e417d55141f3f079a8a876e786f44308336 / r339676. This causing a test to fail in http://green.lab.llvm.org/green/job/clang-stage1-configure-RA/48406/ LLVM :: DebugInfo/Generic/debug-label.ll llvm-svn: 339700
* [DAG] Avoid redundant chain transversal in store merge cycle check. NFCI.Nirav Dave2018-08-141-1/+2
| | | | | | Patch by Henric Karlsson. llvm-svn: 339688
* [DebugInfo] Generate DWARF debug information for labels. (Fix leak problems)Hsiangkai Wang2018-08-1416-149/+407
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There are two forms for label debug information in DWARF format. 1. Labels in a non-inlined function: DW_TAG_label DW_AT_name DW_AT_decl_file DW_AT_decl_line DW_AT_low_pc 2. Labels in an inlined function: DW_TAG_label DW_AT_abstract_origin DW_AT_low_pc We will collect label information from DBG_LABEL. Before every DBG_LABEL, we will generate a temporary symbol to denote the location of the label. The symbol could be used to get DW_AT_low_pc afterwards. So, we create a mapping between 'inlined label' and DBG_LABEL MachineInstr in DebugHandlerBase. The DBG_LABEL in the mapping is used to query the symbol before it. The AbstractLabels in DwarfCompileUnit is used to process labels in inlined functions. We also keep a mapping between scope and labels in DwarfFile to help to generate correct tree structure of DIEs. It also generates label debug information under global isel. Differential Revision: https://reviews.llvm.org/D45556 llvm-svn: 339676
* [GlobalISel][IRTranslator] Fix a bug in handling repeating struct types ↵Amara Emerson2018-08-141-0/+2
| | | | | | | | during argument lowering. Differential Revision: https://reviews.llvm.org/D49442 llvm-svn: 339674
* [CodeGen] Fix assert in SelectionDAG::computeKnownBitsScott Linder2018-08-131-2/+2
| | | | | | | | | | Fix SelectionDAG::computeKnownBits asserting when handling EXTRACT_SUBVECTOR when zero extending the demanded elements mask if it is already as long as the source vector. Differential Revision: https://reviews.llvm.org/D49574 llvm-svn: 339600
* [DAGCombiner] simplifyDivRem - add comment describing divide by undef/zero ↵Simon Pilgrim2018-08-131-0/+5
| | | | | | combine. NFC. llvm-svn: 339561
* [CGP] Fix GEP issue with out of range APInt constant values not fitting in ↵Simon Pilgrim2018-08-131-2/+7
| | | | | | | | int64_t Test case reduced from https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=7173 llvm-svn: 339556
* [SelectionDAG] In PromoteFloatOp_BITCAST, insert a bitcast after the ↵Craig Topper2018-08-131-8/+11
| | | | | | | | fp_to_fp16 in case the result type isn't a scalar integer. This is another variation of PR38533. In this case, the result type of the bitcast is legal and 16-bits wide, but not a scalar integer. So we need to emit the convert to i16 and then bitcast it to the true result type. This new bitcast will be further type legalized if necessary. llvm-svn: 339536
* [SelectionDAG] In PromoteIntRes_BITCAST, when the input is TypePromoteFloat, ↵Craig Topper2018-08-131-2/+2
| | | | | | | | | | make sure the output type is scalar. For vectors, use a store and load of temporary. Previously if the result type was a vector, we emitted a FP_TO_FP16 with a vector result type which isn't valid. This is basically the opposite case of the root cause of PR38533. llvm-svn: 339535
* Restore correct x86_64 EH encodings in kernel code modelLei Liu2018-08-131-9/+14
| | | | | | | | | | | | | Fixes PR37524. The exception handling encodings for x86_64 in kernel code model has been changed with r309884. Restore it to correct ones. These encodings include PersonalityEncoding, LSDAEncoding and TTypeEncoding. Differential Revision: https://reviews.llvm.org/D50490 llvm-svn: 339534
OpenPOWER on IntegriCloud