path: root/llvm/lib/CodeGen
Commit message | Author | Age | Files | Lines
...
* Address post commit review comments on revision 366727. | Sean Fertile | 2019-07-30 | 1 | -5/+5
    Addresses a number of comments made on D64652 after committing:
    - Reorders function decls in the TargetLoweringObjectFileXCOFF class.
    - Fix comment in MCSectionXCOFF to include description of external reference csects.
    - Convert several llvm_unreachables to report_fatal_error
    - Convert several dyn_casts to casts as they are expected not to fail.
    - Avoid copying DataLayout object.

    llvm-svn: 367324
* [DAGCombine] narrowInsertExtractVectorBinOp - early out for binops that change value type. NFCI. | Simon Pilgrim | 2019-07-29 | 1 | -0/+4
    This is implicit in the value type checks in getSubVectorSrc - this just makes it upfront and obvious.

    llvm-svn: 367220
* [DAGCombine] narrowInsertExtractVectorBinOp - early out for illegal op. NFCI. | Simon Pilgrim | 2019-07-27 | 1 | -1/+4
    If the subvector binop is illegal then early-out and avoid the subvector searches.

    llvm-svn: 367181
* [TargetLowering] SimplifyMultipleUseDemandedBits - add BITCAST pass through support (Reapplied) | Simon Pilgrim | 2019-07-27 | 1 | -2/+59
    This allows us to peek through BITCASTs, attempt to simplify the source operand, and then bitcast back.

    This reapplies rL367091 which was reverted at rL367118 - we were inconsistently peeking through the bitcasts to the source value.

    Fixes PR42777

    llvm-svn: 367174
* [SelectionDAG] Check for any recursion depth greater than or equal to limit instead of just equal to the limit. | Simon Pilgrim | 2019-07-27 | 2 | -5/+5
    If anything called the recursive isKnownNeverNaN/computeKnownBits/ComputeNumSignBits/SimplifyDemandedBits/SimplifyMultipleUseDemandedBits with an incorrect depth then we could continue to recurse if we'd already exceeded the depth limit.

    This replaces the limit check (Depth == 6) with a (Depth >= 6) to make sure that we don't circumvent it.

    This causes a couple of regressions as a mixture of calls (SimplifyMultipleUseDemandedBits + combineX86ShufflesRecursively) were calling with depths that were already over the limit. I've fixed SimplifyMultipleUseDemandedBits to not do this. combineX86ShufflesRecursively is trickier as we get a lot of regressions if we reduce its own limit from 8 to 6 (it also starts at Depth == 1 instead of Depth == 0 like the others...) - I'll see what I can do in future patches.

    llvm-svn: 367171
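The difference between the two guards described above can be sketched in isolation. This is a minimal stand-alone illustration, not LLVM code: `MaxDepth`, `HardCap`, `walkEq` and `walkGe` are hypothetical names, and each function only counts how many levels it descends before its guard fires. The extra `HardCap` exists purely so that the equality variant terminates in the demonstration.

```cpp
// Hypothetical recursion limit, mirroring the value 6 quoted in the commit message.
constexpr unsigned MaxDepth = 6;
// Safety cap so the demonstration terminates even when the guard is skipped.
constexpr unsigned HardCap = 32;

// Guard written as an equality test: a caller that starts past the limit
// never hits Depth == MaxDepth and keeps recursing until the safety cap.
unsigned walkEq(unsigned Depth) {
  if (Depth == MaxDepth || Depth >= HardCap)
    return 0;
  return 1 + walkEq(Depth + 1);
}

// Guard written as >=: any depth at or beyond the limit stops immediately.
unsigned walkGe(unsigned Depth) {
  if (Depth >= MaxDepth)
    return 0;
  return 1 + walkGe(Depth + 1);
}
```

Starting from depth 0 both variants behave identically; they only diverge when a caller passes in a depth that is already over the limit, which is the situation the commit message describes.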
* [TargetLowering] Add depth limit to SimplifyMultipleUseDemandedBits | Simon Pilgrim | 2019-07-27 | 1 | -0/+3
    We're getting reports of massive compile time increases because SimplifyMultipleUseDemandedBits was losing track of the depth and not exiting early. No repro yet, but consider this a pre-emptive commit.

    llvm-svn: 367169
* [AArch64][GlobalISel] Implement narrowing of G_SEXT. | Amara Emerson | 2019-07-26 | 1 | -0/+20
    We need this to narrow a sext to s128.

    Differential Revision: https://reviews.llvm.org/D65357

    llvm-svn: 367164
* [PowerPC][AIX] Add lowering of MCSymbol MachineOperand. | Sean Fertile | 2019-07-26 | 1 | -0/+3
    Adds machine operand lowering for MCSymbolSDNodes to the PowerPC backend. This is needed to produce call instructions in assembly for AIX because the callee operand is a MCSymbolSDNode. The test is XFAIL'ed for asserts due to a (valid) assertion in PEI that the AIX ABI isn't supported yet.

    Differential Revision: https://reviews.llvm.org/D63738

    llvm-svn: 367133
* Revert r367091, it caused PR42777. | Nico Weber | 2019-07-26 | 1 | -59/+2
    llvm-svn: 367118
* [SelectionDAG] GetDemandedBits - update SIGN_EXTEND_INREG op to just call SimplifyMultipleUseDemandedBits. | Simon Pilgrim | 2019-07-26 | 1 | -9/+1
    llvm-svn: 367098
* [TargetLowering] SimplifyMultipleUseDemandedBits - add SIGN_EXTEND_INREG support. | Simon Pilgrim | 2019-07-26 | 1 | -0/+7
    llvm-svn: 367096
* [SelectionDAG] GetDemandedBits - update OR/XOR ops to just call SimplifyMultipleUseDemandedBits. | Simon Pilgrim | 2019-07-26 | 1 | -6/+2
    Eventually all of these will be moved over, but we create nodes in GetDemandedBits recursion at the moment which causes regressions when we try to remove them all.

    llvm-svn: 367092
* [TargetLowering] SimplifyMultipleUseDemandedBits - add BITCAST pass through support. | Simon Pilgrim | 2019-07-26 | 1 | -2/+59
    This allows us to peek through BITCASTs, attempt to simplify the source operand, and then bitcast back.

    llvm-svn: 367091
* Some case error for: detected memory leaks | Kang Zhang | 2019-07-26 | 1 | -25/+0
    llvm-svn: 367083
* [PowerPC] Do the Simple Early Return in block-placement pass to optimize the blocks | Kang Zhang | 2019-07-26 | 1 | -0/+25
    Summary:
    The `block-placement` pass can create unconditional-branch patterns to which the simple early return can be applied. But the `early-ret` pass runs before `block-placement`, and we don't want to run it again. This patch does the simple early return at the end of `block-placement` to optimize the blocks.

    Below is an example:
    ```
    BB:                 |  BB:
      XOR 3, 3, 4       |    XOR 3, 3, 4
      B TBB             |    B ChainBB
      ...               |    ...
    ChainBB:            |  ChainBB:
      B TBB             |    ADD 3, 3, 4
      ...               |    BLR
    TBB:                |
      ADD 3, 3, 4       |
      BLR               |
    ```

    Reviewed By: efriedma

    Differential Revision: https://reviews.llvm.org/D63972

    llvm-svn: 367080
* Reland: [Remarks] Add support for serializing metadata for every remark streamer | Francis Visoiu Mistrih | 2019-07-26 | 1 | -48/+13
    This allows every serializer format to implement metaSerializer() and return the corresponding meta serializer.

    Original llvm-svn: 366946
    Reverted llvm-svn: 367004

    This fixes the unit tests on Windows bots.

    llvm-svn: 367078
* [CodeGen] Don't resolve the stack protector frame accesses until PEI | Francis Visoiu Mistrih | 2019-07-25 | 1 | -0/+8
    Currently, stack protector loads and stores are resolved during LocalStackSlotAllocation (if the pass needs to run). When this is the case, the base register assigned to the frame access is going to be one of the vregs created during LocalStackSlotAllocation. This means that we are keeping a pointer to the stack protector slot, and we're using this pointer to load and store to it.

    In case register pressure goes up, we may end up spilling this pointer to the stack, which can be a security concern.

    Instead, leave it to PEI to resolve the frame accesses. In order to do that, we make all stack protector accesses go through frame index operands, then PEI will resolve this using an offset from sp/fp/bp.

    Differential Revision: https://reviews.llvm.org/D64759

    llvm-svn: 367068
* Revert rL366946 : [Remarks] Add support for serializing metadata for every remark streamer | Simon Pilgrim | 2019-07-25 | 1 | -13/+48
    This allows every serializer format to implement metaSerializer() and return the corresponding meta serializer.
    ........
    Fix windows build bots
    http://lab.llvm.org:8011/builders/llvm-clang-x86_64-win-fast
    http://lab.llvm.org:8011/builders/llvm-clang-lld-x86_64-scei-ps4-windows10pro-fast
    http://lab.llvm.org:8011/builders/llvm-clang-x86_64-expensive-checks-win

    llvm-svn: 367004
* [Codegen] (X & (C l>>/<< Y)) ==/!= 0 --> ((X <</l>> Y) & C) ==/!= 0 fold | Roman Lebedev | 2019-07-24 | 1 | -0/+79
    Summary:
    This was originally reported in D62818.

    https://rise4fun.com/Alive/oPH

    InstCombine does the opposite fold, in the hope that the `C l>>/<< Y` expression will be hoisted out of a loop if `Y` is invariant and `X` is not. But as can be seen from the diffs here, if it didn't get hoisted, the produced assembly is almost universally worse.

    Much like with my recent "hoist add/sub by/from const" patches, we should get an almost universal win if we hoist the constant: there is almost always an "and/test by imm" instruction, but "shift of imm" not so much, so we may avoid having to materialize the immediate, and thus need one less register. And since we now shift not by a constant, but by something else, the live-range of that something else may reduce.

    Special care is needed not to disturb the x86 `BT` / Hexagon `tstbit` instruction pattern, and not to get into an endless combine loop.

    Reviewers: RKSimon, efriedma, t.p.northover, craig.topper, spatel, arsenm

    Reviewed By: spatel

    Subscribers: hiraditya, MaskRay, wuzish, xbolva00, nikic, nemanjai, jvesely, wdng, nhaehnle, javed.absar, tpr, kristof.beyls, jsji, llvm-commits

    Tags: #llvm

    Differential Revision: https://reviews.llvm.org/D62871

    llvm-svn: 366955
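The equivalence behind this fold can be checked exhaustively on a small bit width. The sketch below is an illustration only, assuming 8-bit values and in-range shift amounts; the function names are hypothetical and none of this is the DAG combine itself. It compares `(X & (C l>> Y)) != 0` against `((X << Y) & C) != 0`, and the mirrored `<<` / `l>>` pair, for every input.

```cpp
#include <cstdint>

// Original form: (X & (C l>> Y)) != 0.
bool srlBefore(uint8_t X, uint8_t C, unsigned Y) {
  return (X & static_cast<uint8_t>(C >> Y)) != 0;
}

// Folded form: ((X << Y) & C) != 0.
bool srlAfter(uint8_t X, uint8_t C, unsigned Y) {
  return (static_cast<uint8_t>(X << Y) & C) != 0;
}

// Original form: (X & (C << Y)) != 0.
bool shlBefore(uint8_t X, uint8_t C, unsigned Y) {
  return (X & static_cast<uint8_t>(C << Y)) != 0;
}

// Folded form: ((X l>> Y) & C) != 0.
bool shlAfter(uint8_t X, uint8_t C, unsigned Y) {
  return (static_cast<uint8_t>(X >> Y) & C) != 0;
}

// Compare both pairs for every 8-bit X and C and every shift amount below 8.
bool foldsAgree() {
  for (unsigned X = 0; X < 256; ++X)
    for (unsigned C = 0; C < 256; ++C)
      for (unsigned Y = 0; Y < 8; ++Y) {
        uint8_t X8 = static_cast<uint8_t>(X);
        uint8_t C8 = static_cast<uint8_t>(C);
        if (srlBefore(X8, C8, Y) != srlAfter(X8, C8, Y))
          return false;
        if (shlBefore(X8, C8, Y) != shlAfter(X8, C8, Y))
          return false;
      }
  return true;
}
```

In the folded forms the constant `C` is an operand of the `and` rather than of the shift, which is what lets a target use an "and/test by imm" instruction as the message argues.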
* [GlobalISel] Support for inlining memcpy, memset and memmove calls. | Amara Emerson | 2019-07-24 | 1 | -0/+505
    This introduces a new family of combiner helper routines that re-use the target specific cost model from SelectionDAG, and generate inline implementations of the memcpy family of intrinsics.

    The combines are only enabled at optimization levels higher than -O0, and give very substantial performance improvements.

    Differential Revision: https://reviews.llvm.org/D65167

    llvm-svn: 366951
* [Remarks] Add support for serializing metadata for every remark streamer | Francis Visoiu Mistrih | 2019-07-24 | 1 | -48/+13
    This allows every serializer format to implement metaSerializer() and return the corresponding meta serializer.

    llvm-svn: 366946
* [AArch64][GlobalISel] Fix a crash during s128 G_ICMP legalization due to r366317. | Amara Emerson | 2019-07-24 | 1 | -4/+4
    r366317 added a legalization for s128 G_ICMP narrow scalar which tried to hard code the result type of the new legalized G_SELECT. Change this to instead use the type of the original G_ICMP result and allow the target to legalize it if necessary later.

    llvm-svn: 366943
* [Remarks][NFC] Rename remarks::Serializer to remarks::RemarkSerializer | Francis Visoiu Mistrih | 2019-07-24 | 1 | -3/+3
    llvm-svn: 366939
* Fix signed/unsigned comparison warning. NFCI. | Simon Pilgrim | 2019-07-24 | 1 | -1/+1
    llvm-svn: 366935
* [DAGCombine] matchBinOpReduction - add partial reduction matching | Simon Pilgrim | 2019-07-24 | 1 | -7/+32
    This patch adds support for recognizing cases where a larger vector type is being used to reduce just the elements in the lower subvector:

    e.g. <8 x i32> reduction pattern in a <16 x i32> vector:
    <4,5,6,7,u,u,u,u,u,u,u,u,u,u,u,u>
    <2,3,u,u,u,u,u,u,u,u,u,u,u,u,u,u>
    <1,u,u,u,u,u,u,u,u,u,u,u,u,u,u,u>

    matchBinOpReduction returns the lower extracted subvector in such cases, assuming isExtractSubvectorCheap accepts the extraction.

    I've only enabled it for X86 reduction sums so far. I intend to enable it for the bitop/minmax cases in future patches, and eventually I think it's worth turning it on all the time. This is mainly just a case of ensuring calls to matchBinOpReduction don't make assumptions on the vector width based on the original vector extraction.

    Fixes the x86 partial reduction sum cases in PR33758 and PR42023.

    Differential Revision: https://reviews.llvm.org/D65047

    llvm-svn: 366933
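The three shuffle masks above describe a log2-step "pyramid" that sums only the lower 8 lanes of a 16-lane vector. The following is a rough scalar model of that pattern, not the matcher itself: `Vec16`, `Undef`, `reduceStage` and `reduceLower8` are hypothetical names, and undef lanes are filled with an arbitrary value to show that they never reach lane 0.

```cpp
#include <array>
#include <cstdint>

using Vec16 = std::array<uint32_t, 16>;

// Arbitrary filler standing in for the undef ("u") lanes of the shuffle masks.
constexpr uint32_t Undef = 0xDEADBEEFu;

// One reduction stage: shuffle lanes [Offset, 2*Offset) down to [0, Offset),
// leave every other lane undefined, then add the shuffled vector to the input.
Vec16 reduceStage(const Vec16 &V, unsigned Offset) {
  Vec16 Out{};
  for (unsigned I = 0; I < 16; ++I) {
    uint32_t Shuffled = (I < Offset) ? V[I + Offset] : Undef;
    Out[I] = V[I] + Shuffled;
  }
  return Out;
}

// Sum of the lower 8 lanes using the <4,5,6,7,...>, <2,3,...>, <1,...> stages.
uint32_t reduceLower8(Vec16 V) {
  for (unsigned Offset = 4; Offset >= 1; Offset /= 2)
    V = reduceStage(V, Offset);
  return V[0];
}
```

Because lanes 8 to 15 are never read by any stage, the same result could be computed on the extracted lower <8 x i32> subvector, which is the observation the patch relies on.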
* [SelectionDAG] makeEquivalentMemoryOrdering - early out for equal chains (PR42727) | Simon Pilgrim | 2019-07-24 | 1 | -1/+1
    If we are already using the same chain for the old/new memory ops then just return.

    Fixes PR42727 which had getLoad() reusing an existing node.

    llvm-svn: 366922
* [SDAG] convert (sub x, 1) to (add x, -1) in ctpop expansion; NFC | Sanjay Patel | 2019-07-24 | 1 | -3/+3
    We canonicalize to the add form, so create that directly for efficiency.

    llvm-svn: 366914
* [TargetLowering] SimplifyMultipleUseDemandedBits - add VECTOR_SHUFFLE support. | Simon Pilgrim | 2019-07-23 | 1 | -0/+23
    If all the demanded elts are from one operand and are inline, then we can use the operand directly.

    The changes are mainly from SSE41 targets, which have blendvpd but not cmpgtq, allowing the v2i64 comparison to be simplified as we only need the signbit from alternate v4i32 elements.

    llvm-svn: 366817
* [TargetLowering] Add SimplifyMultipleUseDemandedBits | Simon Pilgrim | 2019-07-23 | 1 | -1/+128
    This patch introduces the DAG version of SimplifyMultipleUseDemandedBits, which attempts to peek through ops (mainly and/or/xor so far) that don't contribute to the demandedbits/elts of a node - which means we can do this even in cases where we have multiple uses of an op, which normally requires us to demand all bits/elts. The intention is to remove a similar instruction - SelectionDAG::GetDemandedBits - once SimplifyMultipleUseDemandedBits has matured.

    The InstCombine version of SimplifyMultipleUseDemandedBits can constant fold which I haven't added here yet, and so far I've only wired this up to some basic binops (and/or/xor/add/sub/mul) to demonstrate its use.

    We do see a couple of regressions that need to be addressed:

    AMDGPU unsigned dot product codegen retains an AND mask (for ZERO_EXTEND) that it previously removed (but otherwise the dotproduct codegen is a lot better).

    X86/AVX2 has poor handling of vector ANY_EXTEND/ANY_EXTEND_VECTOR_INREG - it prematurely gets converted to ZERO_EXTEND_VECTOR_INREG.

    The code owners have confirmed it's OK for these cases to be fixed up in future patches.

    Differential Revision: https://reviews.llvm.org/D63281

    llvm-svn: 366799
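The "peek through ops that don't contribute to the demanded bits" idea can be sketched on plain integers. This is a simplified model under stated assumptions, not the TargetLowering code: the function names are hypothetical, and the known-zero / known-one masks are passed in directly rather than computed by a known-bits analysis.

```cpp
#include <cstdint>

// For (LHS | RHS): if every demanded bit of RHS is known to be zero, a user
// that only needs DemandedBits can read LHS directly, even if the `or` node
// has other users that still need the full result.
bool canUseLhsOfOr(uint32_t DemandedBits, uint32_t RhsKnownZero) {
  return (DemandedBits & ~RhsKnownZero) == 0;
}

// For (LHS & RHS): if every demanded bit of RHS is known to be one, the
// `and` cannot clear any demanded bit, so LHS can again be used directly.
bool canUseLhsOfAnd(uint32_t DemandedBits, uint32_t RhsKnownOne) {
  return (DemandedBits & ~RhsKnownOne) == 0;
}
```

For example, with `DemandedBits = 0x0000FFFF` and an `or` whose right operand is `0xABCD0000`, the low 16 bits of `LHS | RHS` equal the low 16 bits of `LHS`, so the bypass is safe for that particular user.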
* [DAGCombiner] Make ShrinkLoadReplaceStoreWithStore return an SDValue instead of an SDNode*. NFCI | Craig Topper | 2019-07-23 | 1 | -9/+8
    The function was calling getNode() on an SDValue to return and the caller turned the result back into an SDValue. So just return the original SDValue to avoid this.

    llvm-svn: 366779
* [DAGCombiner] Use SDNode::isOperandOf to simplify some code. NFCI | Craig Topper | 2019-07-23 | 1 | -7/+1
    llvm-svn: 366778
* Move variable out from debug only section. | Richard Trieu | 2019-07-23 | 1 | -2/+0
    MFI is no longer just needed for an assert. Move it out of the debug only section to allow non-assert builds to be able to find it.

    llvm-svn: 366773
* [Statepoints] Fix a bug in statepoint lowering for functions w/no-realign-stack | Philip Reames | 2019-07-22 | 1 | -1/+8
    We were silently using the ABI alignment for all of the stores generated for deopt and gc values. We'd gotten the alignment of the stack slot itself properly reduced (via MachineFrameInfo's clamping), but having the MMO on the store incorrect was enough for us to generate an aligned store to an unaligned location.

    The simplest fix would have been to just pass the alignment to the helper function, but once we do that, the helper function doesn't really help. So, inline it and directly call the MMO version of DAG.getStore with a properly constructed MMO.

    Note that there's a separate performance possibility here. Even if we *can* realign stacks, we probably don't *want to* if all of the stores are in slowpaths. But that's a later patch, if at all. :)

    llvm-svn: 366765
* Stubs out TLOF for AIX and add support for common vars in assembly output. | Sean Fertile | 2019-07-22 | 1 | -0/+55
    Stubs out a TargetLoweringObjectFileXCOFF class, implementing only SelectSectionForGlobal for common symbols. Also adds an override of EmitGlobalVariable in PPCAIXAsmPrinter which adds a number of defensive errors and adds support for emitting common globals.

    llvm-svn: 366727
* TableGen: Support physical register inputs > 255 | Matt Arsenault | 2019-07-22 | 1 | -1/+4
    This was truncating register values that didn't fit in unsigned char. Switch AMDGPU sendmsg intrinsics to using a tablegen pattern.

    llvm-svn: 366695
* Added address-space mangling for stack related intrinsics | Christudasan Devadasan | 2019-07-22 | 2 | -3/+6
    Modified the following 3 intrinsics: int_addressofreturnaddress, int_frameaddress & int_sponentry.

    Reviewed By: arsenm

    Differential Revision: https://reviews.llvm.org/D64561

    llvm-svn: 366679
* [IPRA][ARM] Make use of the "returned" parameter attribute | Oliver Stannard | 2019-07-22 | 1 | -0/+2
    ARM has code to recognise uses of the "returned" function parameter attribute which guarantee that the value passed to the function in r0 will be returned in r0 unmodified. IPRA replaces the regmask on call instructions, so needs to be told about this to avoid reverting the optimisation.

    Differential revision: https://reviews.llvm.org/D64986

    llvm-svn: 366669
* [GISel]: Attach missing range metadata while translating G_LOADs | Aditya Nandakumar | 2019-07-21 | 1 | -2/+3
    https://reviews.llvm.org/D65048

    Attach range information to G_LOAD when only defining one register.

    reviewed by: arsenm

    llvm-svn: 366656
* [Codegen][SelectionDAG] X u% C == 0 fold: non-splat vector improvements | Roman Lebedev | 2019-07-20 | 1 | -35/+132
    Summary:
    Four things here:
    1. Generalize the fold to handle non-splat divisors. Reasonably trivial.
    2. Unban power-of-two divisors. I don't see any reason why they should be illegal.
       * There is no ban in Hacker's Delight
       * I think the ban came from the same bug that caused the miscompile in the base patch - in `floor((2^W - 1) / D)` we were dividing by `D0` instead of `D`, and we **were** ensuring that `D0` is not `1`, which made sense.
    3. Unban `1` divisors. I no longer believe Hacker's Delight actually says that the fold is invalid for `D = 0`. Further considerations:
       * We know that
         * `(X u% 1) == 0` can be constant-folded to `1`,
         * `(X u% 1) != 0` can be constant-folded to `0`,
       * Also, we know that
         * `X u<= -1` can be constant-folded to `1`,
         * `X u> -1` can be constant-folded to `0`,
       * https://godbolt.org/z/7jnZJX https://rise4fun.com/Alive/oF6p
       * We know we will end up with the following: `(setule/setugt (rotr (mul N, P), K), Q)`
       * Therefore, for given new DAG nodes and comparison predicates (`ule`/`ugt`), we will still produce the correct answer if: `Q` is an all-ones constant; and both `P` and `K` are *anything* other than `undef`.
       * The fold will indeed produce `Q = all-ones`.
    4. Try to re-splat the `P` and `K` vectors - we don't care about their values for the lanes where divisor was `1`.

    Reviewers: RKSimon, hermord, craig.topper, spatel, xbolva00

    Reviewed By: RKSimon

    Subscribers: hiraditya, javed.absar, dexonsmith, llvm-commits

    Tags: #llvm

    Differential Revision: https://reviews.llvm.org/D63963

    llvm-svn: 366637
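The `(setule (rotr (mul N, P), K), Q)` shape mentioned above can be tried out on scalars. The sketch below is an assumption-laden illustration rather than the DAG code: it works on 8-bit values (`W = 8`), uses hypothetical helper names, splits `D = D0 * 2^K` with `D0` odd, takes `P` as the multiplicative inverse of `D0` modulo `2^8`, and `Q = floor((2^8 - 1) / D)`. It then compares the result with a plain `X % D == 0` for every non-zero divisor, including `1` and powers of two.

```cpp
#include <cstdint>

// Rotate an 8-bit value right by K, for 0 <= K < 8.
uint8_t rotr8(uint8_t V, unsigned K) {
  if (K == 0)
    return V;
  return static_cast<uint8_t>((V >> K) | (V << (8 - K)));
}

// Multiplicative inverse of an odd 8-bit value modulo 2^8, found by search.
uint8_t inverseMod256(uint8_t D0) {
  for (unsigned P = 1; P < 256; P += 2)
    if (static_cast<uint8_t>(D0 * P) == 1)
      return static_cast<uint8_t>(P);
  return 0; // Not reached for odd D0.
}

// Evaluate X u% D == 0 as rotr(X * P, K) u<= Q. D must be non-zero.
bool isDivisibleFolded(uint8_t X, uint8_t D) {
  unsigned K = 0;
  uint8_t D0 = D;
  while ((D0 & 1) == 0) { // Strip the power-of-two factor: D = D0 * 2^K.
    D0 >>= 1;
    ++K;
  }
  uint8_t P = inverseMod256(D0);
  uint8_t Q = static_cast<uint8_t>(255 / D);
  return rotr8(static_cast<uint8_t>(X * P), K) <= Q;
}

// Compare against the straightforward remainder for all X and all D != 0.
bool foldMatchesUrem() {
  for (unsigned D = 1; D < 256; ++D)
    for (unsigned X = 0; X < 256; ++X)
      if (isDivisibleFolded(static_cast<uint8_t>(X), static_cast<uint8_t>(D)) !=
          (X % D == 0))
        return false;
  return true;
}
```

For `D = 1` this gives `P = 1`, `K = 0` and `Q = 255` (all-ones), so the comparison is always true, matching the `X u<= -1` observation in item 3.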
* LiveIntervals: Fix handleMove asserting on BUNDLE | Matt Arsenault | 2019-07-19 | 1 | -1/+4
    The top-level BUNDLE instruction should behave as an ordinary instruction. It is supposed to have all relevant registers as implicit operands. Moving it should work as any other instruction. I believe the assert intended to avoid moving instructions inside bundles.

    llvm-svn: 366605
* Revert "Use the MachineBasicBlock symbol for a callbr target" | Nick Desaulniers | 2019-07-19 | 1 | -7/+2
    This reverts commit r366523/ccbffefccaff42b0d094c9ef0f49fc3e8c8456ea.

    Two regressions were immediately reported:
    - https://github.com/ClangBuiltLinux/linux/issues/614
    - https://github.com/ClangBuiltLinux/linux/issues/615

    Reported-by: nathanchance

    llvm-svn: 366600
* DAG: Handle dbg_value for arguments split into multiple subregs | Matt Arsenault | 2019-07-19 | 1 | -23/+52
    This was handled previously for arguments split due to not fitting in an MVT. This was dropping the register for argument registers split due to TLI::getRegisterTypeForCallingConv.

    llvm-svn: 366574
* [MachineCSE][MachinePRE] Avoid hoisting code from code regions into hot BBs. | Kai Luo | 2019-07-19 | 1 | -0/+25
    Summary:
    Current PRE hoists common computations into CMBB = DT->findNearestCommonDominator(MBB, MBB1). However, if CMBB is in a hot loop body, we might get performance degradation.

    Differential Revision: https://reviews.llvm.org/D64394

    llvm-svn: 366570
* [IPRA] Don't rely on non-exact function definitions | Oliver Stannard | 2019-07-19 | 1 | -1/+5
    If a function definition is not exact, then the linker could select a differently-compiled version of it, which could use different registers.

    https://reviews.llvm.org/D64909

    llvm-svn: 366557
* Use the MachineBasicBlock symbol for a callbr target | Bill Wendling | 2019-07-19 | 1 | -2/+7
    Summary:
    Inline asm doesn't use labels when compiled as an object file. Therefore, we shouldn't create one for the (potential) callbr destination. Instead, use the symbol for the MachineBasicBlock.

    Reviewers: nickdesaulniers, craig.topper

    Reviewed By: nickdesaulniers

    Subscribers: xbolva00, llvm-commits

    Tags: #llvm

    Differential Revision: https://reviews.llvm.org/D64888

    llvm-svn: 366523
* [GlobalISel] Translate calls to memcpy et al to G_INTRINSIC_W_SIDE_EFFECTs and legalize later. | Amara Emerson | 2019-07-19 | 2 | -42/+83
    I plan on adding memcpy optimizations in the GlobalISel pipeline, but we can't do that unless we delay lowering to actual function calls. This patch changes the translator to generate G_INTRINSIC_W_SIDE_EFFECTS for these functions, and then has each target specify, using the new custom legalizer hook for intrinsics, that it wants them expanded to a libcall.

    Differential Revision: https://reviews.llvm.org/D64895

    llvm-svn: 366516
* CodeGen: Allow !associated metadata to point to aliases. | Peter Collingbourne | 2019-07-18 | 1 | -2/+2
    This is a small extension of !associated, mostly useful for the implementation convenience of instrumentation passes that RAUW globals with aliases, such as LowerTypeTests.

    Differential Revision: https://reviews.llvm.org/D64951

    llvm-svn: 366502
* [COFF] Change a variable type to be const in the HeapAllocSite map. | Amy Huang | 2019-07-18 | 4 | -5/+7
    llvm-svn: 366479
* [DAGCombine] Pull getSubVectorSrc helper out of narrowInsertExtractVectorBinOp. NFCI. | Simon Pilgrim | 2019-07-18 | 1 | -22/+22
    NFC step towards reusing this in other EXTRACT_SUBVECTOR combines.

    llvm-svn: 366435
* Changes to display code view debug info type records in hex format | Nilanjana Basu | 2019-07-17 | 1 | -1/+1
    llvm-svn: 366390