path: root/llvm/lib/CodeGen
Commit message | Author | Age | Files | Lines
...
* [ScalarizeMaskedMemIntrin] Remove some temporary variables that are only used by a single if condition. | Craig Topper | 2018-09-27 | 1 | -14/+5
  llvm-svn: 343268
* [ScalarizeMaskedMemIntrin] Cleanup comments. NFC | Craig Topper | 2018-09-27 | 1 | -58/+49
  llvm-svn: 343267
* [ScalarizeMaskedMemIntrin] Don't emit 'icmp eq i1 %x, 1' to check mask values. | Craig Topper | 2018-09-27 | 1 | -23/+9
  That's just %x, so use that directly. Had we emitted this IR earlier, InstCombine would have removed the icmp, so I'm going to assume using the i1 directly would be considered canonical.
  llvm-svn: 343244
* Revert r343192 as an ubsan build is currently failing | Luke Cheeseman | 2018-09-27 | 6 | -15/+0
  llvm-svn: 343235
* Reapply changes reverted in r343114, lldb patch to follow shortly | Luke Cheeseman | 2018-09-27 | 6 | -0/+15
  llvm-svn: 343192
* Revert r342942 "[MachineCopyPropagation] Reimplement CopyTracker in terms of register units" | Hans Wennborg | 2018-09-27 | 1 | -58/+54
  It seems to have broken several targets, see comments on the llvm-commits thread.
  > Change the copy tracker to keep a single map of register units instead
  > of 3 maps of registers. This gives a very significant compile time
  > performance improvement to the pass. I measured a 30-40% decrease in
  > time spent in MCP on x86 and AArch64 and much more significant
  > improvements on out of tree targets with more registers.
  >
  > Differential Revision: https://reviews.llvm.org/D52374
  llvm-svn: 343189
* llvm::sort(C.begin(), C.end(), ...) -> llvm::sort(C, ...) | Fangrui Song | 2018-09-27 | 21 | -64/+55
  Summary: The convenience wrapper in STLExtras is available since rL342102.
  Reviewers: dblaikie, javed.absar, JDevlieghere, andreadb
  Subscribers: MatzeB, sanjoy, arsenm, dschuff, mehdi_amini, sdardis, nemanjai, jvesely, nhaehnle, sbc100, jgravelle-google, eraman, aheejin, kbarton, JDevlieghere, javed.absar, gbedwell, jrtc27, mgrang, atanasyan, steven_wu, george.burgess.iv, dexonsmith, kristina, jsji, llvm-commits
  Differential Revision: https://reviews.llvm.org/D52573
  llvm-svn: 343163
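A minimal sketch of what a range-based sort wrapper like the one named in the entry above looks like. The name `sortRange` and the helper `sortedDescending` are hypothetical stand-ins for illustration; they are not the STLExtras implementation.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <vector>

// Hypothetical stand-in for a range-based sort wrapper: it forwards the
// container's begin/end iterators to std::sort.
template <typename Container, typename Compare>
void sortRange(Container &&C, Compare Comp) {
  std::sort(std::begin(C), std::end(C), Comp);
}

// Returns a copy of the input sorted in descending order.
std::vector<int> sortedDescending(std::vector<int> V) {
  auto Desc = [](int A, int B) { return A > B; };
  // Iterator-pair form: std::sort(V.begin(), V.end(), Desc);
  // Range form: the container is named once.
  sortRange(V, Desc);
  assert(std::is_sorted(V.begin(), V.end(), Desc));
  return V;
}
```

The range form removes the chance of passing mismatched iterators from two different containers, which is the usual argument for this kind of mechanical cleanup.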
* [DAG] SelectionDAGLegalize::ExpandLegalINT_TO_FP - use getFPExtendOrRound helper. NFCI. | Simon Pilgrim | 2018-09-26 | 1 | -11/+1
  Handles SrcVT == DstVT as well.
  llvm-svn: 343121
* Revert r343112 as CallFrameString API change has broken lldb builds | Luke Cheeseman | 2018-09-26 | 6 | -15/+0
  llvm-svn: 343114
* [AArch64] - Return address signing dwarf support | Luke Cheeseman | 2018-09-26 | 6 | -0/+15
  - Reapply r343089 with a fix for DebugInfo/Sparc/gnu-window-save.ll
  llvm-svn: 343112
* [CodeGen] Always print register ties in MI::dump() | Francis Visoiu Mistrih | 2018-09-26 | 1 | -1/+1
  It was the case when calling MO::dump(), but MI::dump() was still depending on hasComplexRegisterTies(). The MIR output is not affected.
  llvm-svn: 343107
* Revert r343089 "[AArch64] - Return address signing dwarf support" | Hans Wennborg | 2018-09-26 | 6 | -15/+0
  This caused the DebugInfo/Sparc/gnu-window-save.ll test to fail.
  > Functions that have signed return addresses need additional dwarf support:
  > - After signing the LR, and before authenticating it, the LR register is in a state that is unusable by a debugger or unwinder
  > - To account for this a new directive, .cfi_negate_ra_state, is added
  > - This directive says the signed state of the LR register has now changed, i.e. unsigned -> signed or signed -> unsigned
  > - This directive has the same CFA code as the SPARC directive GNU_window_save (0x2d), adding a macro to account for multiply defined codes
  > - This patch matches the gcc implementation of this support: https://patchwork.ozlabs.org/patch/800271/
  >
  > Differential Revision: https://reviews.llvm.org/D50136
  llvm-svn: 343103
* [DAG] ExpandLegalINT_TO_FP - pull out repeated getValueType() call. NFCI. | Simon Pilgrim | 2018-09-26 | 1 | -9/+9
  llvm-svn: 343101
* [CodeGen] Enable tail calls for functions with NonNull attributes. | David Green | 2018-09-26 | 3 | -12/+7
  Adding NonNull as attributes to returned pointers has the unfortunate side effect of disabling tail calls. This patch ignores the NonNull attribute when we decide whether to tail merge, in the same way that we ignore the NoAlias attribute, as it has no effect on the call sequence.
  Differential Revision: https://reviews.llvm.org/D52238
  llvm-svn: 343091
* Fixes removal of dead elements from PressureDiff (PR37252). | Yury Gribov | 2018-09-26 | 1 | -2/+1
  Reviewed By: MatzeB
  Differential Revision: https://reviews.llvm.org/D51495
  llvm-svn: 343090
* [AArch64] - Return address signing dwarf support | Luke Cheeseman | 2018-09-26 | 6 | -0/+15
  Functions that have signed return addresses need additional dwarf support:
  - After signing the LR, and before authenticating it, the LR register is in a state that is unusable by a debugger or unwinder
  - To account for this a new directive, .cfi_negate_ra_state, is added
  - This directive says the signed state of the LR register has now changed, i.e. unsigned -> signed or signed -> unsigned
  - This directive has the same CFA code as the SPARC directive GNU_window_save (0x2d), adding a macro to account for multiply defined codes
  - This patch matches the gcc implementation of this support: https://patchwork.ozlabs.org/patch/800271/
  Differential Revision: https://reviews.llvm.org/D50136
  llvm-svn: 343089
* Run VerifyDAGDiverence in debug only | Mikael Nilsson | 2018-09-26 | 2 | -0/+16
  VerifyDAGDiverence costs compilation time, so avoid running it in non-debug builds.
  Differential Revision: https://reviews.llvm.org/D52454
  llvm-svn: 343086
* Silence compiler warning about unused variable introduced in r343018 | Mikael Holmen | 2018-09-26 | 1 | -1/+1
  Since the body of the "else if" contains // TODO I suppose someone will need the variable again at some point, but with -Werror the warning made it not compile at all.
  llvm-svn: 343071
* [DebugInfo] Do not generate address info for removed debug labels. | Hsiangkai Wang | 2018-09-26 | 1 | -4/+3
  In some scenarios, LLVM will remove llvm.dbg.labels in IR. For example, when the labels are in unreachable blocks, these labels will not be generated in LLVM IR. In that case, these debug labels will have address zero as their address. That is not a legal address for a debugger to set breakpoints or query sources. So, the patch inhibits the address info (DW_AT_low_pc) of removed labels.
  Fixes the build failure in BuildBot, clang-stage1-cmake-RA-incremental, on macOS.
  Differential Revision: https://reviews.llvm.org/D51908
  llvm-svn: 343062
* [DAGCombiner] Remove unnecessary check for visitSDIVLike/visitUDIVLike returning a UDIVREM or SDIVREM node. | Craig Topper | 2018-09-25 | 1 | -2/+1
  This shouldn't be possible and is a leftover from when we used to recursively call combine here.
  llvm-svn: 343049
* Unify landing pad information adding routines (NFC) | Heejin Ahn | 2018-09-25 | 3 | -34/+38
  Summary: We have `llvm::addLandingPadInfo` and `MachineFunction::addLandingPad`, both of which add landing pad information to populate `LandingPadInfo` but are called from different locations, which was confusing. This patch unifies them with one `MachineFunction::addLandingPad` function, which now has the functionality of both functions.
  Reviewers: rnk
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D52428
  llvm-svn: 343018
* [x86] avoid 256-bit andnp that requires insert/extract with AVX1 (PR37449) | Sanjay Patel | 2018-09-25 | 1 | -1/+1
  This is the final (I hope!) problem pattern mentioned in PR37749: https://bugs.llvm.org/show_bug.cgi?id=37749
  We are trying to avoid an AVX1 sinkhole caused by having 256-bit bitwise logic ops but no other 256-bit integer ops. We've already solved the simple logic ops, but 'andn' is an x86 special. I looked at alternative solutions like extending the generic DAG combine or trying to wait until the ANDNP node is created, but those are bigger patches that can over-reach. I.e., splitting to 128-bit does not look like a win in most cases with >1 256-bit op.
  The pattern matching is cluttered with bitcasts because of our i64 element canonicalization. For the affected test, we have this vector-type-legalized sequence:
    t29: v8i32 = concat_vectors t27, t28
    t30: v4i64 = bitcast t29
    t18: v8i32 = BUILD_VECTOR Constant:i32<-1>, Constant:i32<-1>, ...
    t31: v4i64 = bitcast t18
    t32: v4i64 = xor t30, t31
    t9: v8i32 = BUILD_VECTOR Constant:i32<255>, Constant:i32<255>, ...
    t34: v4i64 = bitcast t9
    t35: v4i64 = and t32, t34
    t36: v8i32 = bitcast t35
    t37: v4i32 = extract_subvector t36, Constant:i64<0>
    t38: v4i32 = extract_subvector t36, Constant:i64<4>
  Differential Revision: https://reviews.llvm.org/D52318
  llvm-svn: 343008
* [RegAllocGreedy] avoid using physreg candidates that cannot be correctly spilled | Daniil Fukalov | 2018-09-25 | 2 | -9/+49
  For the AMDGPU target, if an MBB contains an exec mask restore preamble, SplitEditor may reach a state in which it cannot insert a spill instruction. E.g. for a MIR
    bb.100:
      %1 = S_OR_SAVEEXEC_B64 %2, implicit-def $exec, implicit-def $scc, implicit $exec
  if regalloc tries to allocate a virtreg to the physreg already assigned to virtreg %1, it should insert a spill instruction before the S_OR_SAVEEXEC_B64 instruction. But that is not possible, since it can generate incorrect code in terms of the exec mask. The change makes regalloc ignore such physreg candidates.
  Reviewed By: rampitec
  Differential Revision: https://reviews.llvm.org/D52052
  llvm-svn: 343004
* Revert "[DebugInfo] Do not generate address info for removed debug labels." | Justin Bogner | 2018-09-25 | 1 | -3/+4
  The added test is failing on macOS: http://green.lab.llvm.org/green/job/clang-stage1-cmake-RA-incremental/53550/
  This reverts r342943.
  llvm-svn: 342993
* [LegalizeDAG] Prune Predecessor check in ExpandExtractFromVectorThroughStack. NFCI. | Nirav Dave | 2018-09-25 | 1 | -0/+1
  llvm-svn: 342985
* [DAGCombine] Improve Predecessor check in SimplifySelectOps. NFCI. | Nirav Dave | 2018-09-25 | 1 | -4/+36
  Reuse search space bookkeeping across multiple predecessor checks to avoid redundancy. This should cut search cost by ~4x.
  llvm-svn: 342984
* [DAGCombine] Share predecessor bookkeeping in CombineToPostIndexedLoadStore. NFCI. | Nirav Dave | 2018-09-25 | 1 | -2/+9
  llvm-svn: 342983
* [DAGCombine] Don't fold dependent loads across SELECT_CC. | Nirav Dave | 2018-09-25 | 1 | -4/+5
  DAGCombine will try to fold two loads that feed a SELECT or SELECT_CC after the select, resulting in a select of an address and a single load after. If either of the loads depends on the other, this is not legal as it could introduce cycles. However, it only checked this if the opcode was a SELECT, and not for a SELECT_CC.
  Unfortunately, the only reproducer I have for this is for our downstream target. I've tried getting it to trigger on an upstream one but haven't been successful.
  Patch thanks to Bevin Hansson.
  llvm-svn: 342980
* Use unique_ptr to hold AsmInfo,MRI,MII,STI | Fangrui Song | 2018-09-25 | 1 | -5/+5
  Reviewers: pcc, dblaikie
  Reviewed By: dblaikie
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D52389
  llvm-svn: 342945
* Use TRI->regsOverlap() in MachineBasicBlock::computeRegisterLiveness | Mikael Holmen | 2018-09-25 | 1 | -6/+4
  Summary: For the loop that used MCRegAliasIterator this should be NFC. For the loop that previously used MCSubRegIterator we should now detect more cases where the register is actually live out that we previously missed.
  Reviewers: MatzeB, arsenm
  Reviewed By: MatzeB
  Subscribers: wdng, llvm-commits
  Differential Revision: https://reviews.llvm.org/D52410
  llvm-svn: 342944
* [DebugInfo] Do not generate address info for removed debug labels. | Hsiangkai Wang | 2018-09-25 | 1 | -4/+3
  In some scenarios, LLVM will remove llvm.dbg.labels in IR. For example, when the labels are in unreachable blocks, these labels will not be generated in LLVM IR. In that case, these debug labels will have address zero as their address. That is not a legal address for a debugger to set breakpoints or query sources. So, the patch inhibits the address info (DW_AT_low_pc) of removed labels.
  Differential Revision: https://reviews.llvm.org/D51908
  llvm-svn: 342943
* [MachineCopyPropagation] Reimplement CopyTracker in terms of register units | Justin Bogner | 2018-09-25 | 1 | -54/+58
  Change the copy tracker to keep a single map of register units instead of 3 maps of registers. This gives a very significant compile time performance improvement to the pass. I measured a 30-40% decrease in time spent in MCP on x86 and AArch64 and much more significant improvements on out of tree targets with more registers.
  Differential Revision: https://reviews.llvm.org/D52374
  llvm-svn: 342942
* [MachineCopyPropagation] Rework how we manage RegMask clobbers | Justin Bogner | 2018-09-25 | 1 | -35/+23
  Instead of updating the CopyTracker's maps each time we come across a RegMask, defer checking for this kind of interference until we're actually trying to propagate a copy. This avoids the need to repeatedly iterate over maps in the cases where we don't end up doing any work.
  This is a slight compile time improvement for MachineCopyPropagation as is, but it also enables a much bigger improvement that I'll follow up with soon.
  Differential Revision: https://reviews.llvm.org/D52370
  llvm-svn: 342940
* [New PM][PassInstrumentation] IR printing support for New Pass Manager | Fedor Sergeev | 2018-09-24 | 1 | -0/+1
  Implementing -print-before-all/-print-after-all/-filter-print-func support through PassInstrumentation callbacks.
  - PrintIR routines implement printing callbacks.
  - StandardInstrumentations class provides a central place to manage all the "standard" in-tree pass instrumentations. Currently it registers PrintIR callbacks.
  Reviewers: chandlerc, paquette, philip.pfaffe
  Differential Revision: https://reviews.llvm.org/D50923
  llvm-svn: 342896
* [DAGCombiner] use UADDO to optimize saturated unsigned add | Sanjay Patel | 2018-09-24 | 1 | -0/+29
  This is a preliminary step towards solving PR14613: https://bugs.llvm.org/show_bug.cgi?id=14613
  If we have an 'add' instruction that sets flags, we can use that to eliminate an explicit compare instruction or some other instruction (cmn) that sets flags for use in the later select.
  As shown in the unchanged tests that use 'icmp ugt %x, %a', we're effectively reversing an IR icmp canonicalization that replaces a variable operand with a constant: https://rise4fun.com/Alive/V1Q
  But we're not using 'uaddo' in those cases via DAG transforms. This happens in CGP after D8889 without checking target lowering to see if the op is supported. So AArch already shows 'uaddo' codegen for the i8/i16/i32/i64 test variants with "using_cmp_sum" in the title. That's the pattern that CGP matches as an unsigned saturated add and converts to uaddo without checking target capabilities.
  This patch is gated by isOperationLegalOrCustom(ISD::UADDO, VT), so we only see AArch diffs for i32/i64 in the tests with "using_cmp_notval" in the title (unlike x86 which sees improvements for all sizes because all sizes are 'custom'). But the AArch code (like x86) looks better when translated to 'uaddo' in all cases. So someone that is involved with AArch may want to set i8/i16 to 'custom' for UADDO, so this patch will fire on those tests.
  Another possibility given the existing behavior: we could remove the legal-or-custom check altogether because we're assuming that a UADDO sequence is canonical/optimal before we ever reach here. But that seems like a bug to me. If the target doesn't have an add-with-flags op, then it's not likely that we'll get optimal DAG combining using a UADDO node. This is similar to the justification for why we don't canonicalize IR to the overflow math intrinsic sibling (llvm.uadd.with.overflow) for UADDO in the first place.
  Differential Revision: https://reviews.llvm.org/D51929
  llvm-svn: 342886
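The saturated unsigned add discussed in the entry above can be sketched at source level as follows. This is an illustration only: the function names are hypothetical, and the mapping of the two compare forms to the "using_cmp_sum" and "using_cmp_notval" test names is an assumption based on the names, not taken from the test files. `uaddo` here is a plain C++ model of an add that also reports unsigned overflow, not the DAG node itself.

```cpp
#include <cassert>
#include <cstdint>

// Assumed "using_cmp_sum" shape: compare the wrapped sum with an operand.
uint32_t satAddCmpSum(uint32_t X, uint32_t Y) {
  uint32_t Sum = X + Y; // wraps modulo 2^32 on overflow
  return Sum < X ? UINT32_MAX : Sum;
}

// Assumed "using_cmp_notval" shape: X > ~Y holds exactly when X + Y
// overflows, because ~Y == UINT32_MAX - Y.
uint32_t satAddCmpNotVal(uint32_t X, uint32_t Y) {
  return X > ~Y ? UINT32_MAX : X + Y;
}

// Plain C++ model of an add-with-overflow operation.
struct UAddOResult {
  uint32_t Sum;
  bool Overflow;
};

UAddOResult uaddo(uint32_t X, uint32_t Y) {
  uint32_t Sum = X + Y;
  return {Sum, Sum < X};
}

// Both compare forms reduce to "add, then select on the overflow bit".
uint32_t satAddViaUAddO(uint32_t X, uint32_t Y) {
  UAddOResult R = uaddo(X, Y);
  assert(R.Overflow == (X > ~Y)); // the two overflow tests agree
  return R.Overflow ? UINT32_MAX : R.Sum;
}
```

On a target whose add sets a carry flag, the last form needs no separate compare, which is the saving the commit message describes.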
* Remove debug printf leftover from r342397 | Hans Wennborg | 2018-09-24 | 1 | -2/+0
  llvm-svn: 342863
* [DAGCombiner] Remove some dead code from ConstantFoldBITCASTofBUILD_VECTOR | Craig Topper | 2018-09-24 | 1 | -9/+2
  This code handled SCALAR_TO_VECTOR being returned by the recursion, but the code that used to return SCALAR_TO_VECTOR was removed in 2015.
  llvm-svn: 342856
* [DAGCombiner] Clarify a comment. NFC | Craig Topper | 2018-09-23 | 1 | -2/+4
  This comment was misleading about why we were restricting to before legalize types. The reason given would only apply to before legalize ops. But there is a before legalize types reason that should also be listed.
  llvm-svn: 342851
* [LegalizeTypes] Fix bad indentation. NFC | Craig Topper | 2018-09-23 | 1 | -1/+1
  llvm-svn: 342850
* [DAGCombiner][x86] extend decompose of integer multiply into shift/add with negation | Sanjay Patel | 2018-09-23 | 1 | -6/+13
  This is an alternative to https://reviews.llvm.org/D37896. We can't decompose multiplies generically without a target hook to tell us when it's profitable.
  ARM and AArch64 may be able to remove some existing code that overlaps with this transform.
  This extends D52195 and may resolve PR34474: https://bugs.llvm.org/show_bug.cgi?id=34474
  (still an open question about transforming legal vector multiplies, but we could open another bug report for those)
  llvm-svn: 342844
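To illustrate the kind of rewrite the entry above refers to, here is a sketch of multiply-by-constant decomposed into shift and add/sub, including the negated constants. The exact set of patterns the patch handles is an assumption here, and the function names are hypothetical; the arithmetic identities themselves hold for wrapping unsigned arithmetic.

```cpp
#include <cassert>
#include <cstdint>

// x * (2^N + 1)  ==  (x << N) + x
uint32_t mulPow2Plus1(uint32_t X, unsigned N) {
  assert(N > 0 && N < 32);
  return (X << N) + X;
}

// x * (2^N - 1)  ==  (x << N) - x
uint32_t mulPow2Minus1(uint32_t X, unsigned N) {
  assert(N > 0 && N < 32);
  return (X << N) - X;
}

// x * -(2^N - 1)  ==  x - (x << N)
uint32_t mulNegPow2Minus1(uint32_t X, unsigned N) {
  assert(N > 0 && N < 32);
  return X - (X << N);
}

// x * -(2^N + 1)  ==  0 - ((x << N) + x)
uint32_t mulNegPow2Plus1(uint32_t X, unsigned N) {
  assert(N > 0 && N < 32);
  return 0u - ((X << N) + X);
}
```

Whether such a sequence beats a hardware multiply depends on the target, which is why the commit message insists on a target hook for profitability.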
* [DAGCombiner] Simplify some code in visitBITCAST. NFCI | Craig Topper | 2018-09-22 | 1 | -9/+3
  llvm-svn: 342826
* [DAGCombiner] Rewrite r331896 in a different way to address a FIXME. NFCI | Craig Topper | 2018-09-22 | 1 | -11/+14
  llvm-svn: 342809
* [MachineCopyPropagation] Refactor copy tracking into a class. NFC | Justin Bogner | 2018-09-21 | 1 | -99/+133
  This is a bit easier to follow than handling the copy and src maps directly in the pass, and will make upcoming changes to how this is done easier to follow.
  llvm-svn: 342703
* [MachineCopyPropagation] Minor clang-formatting. NFC | Justin Bogner | 2018-09-21 | 1 | -37/+37
  llvm-svn: 342700
* Add the ability to register callbacks for removal and insertion of MachineInstrs | Aditya Nandakumar | 2018-09-20 | 2 | -1/+17
  https://reviews.llvm.org/D52127
  This patch adds the ability to watch for insertions/deletions of MachineInstructions similar to MachineRegisterInfo.
  llvm-svn: 342696
* [MachineOutliner][NFC] Don't add MBBs with a size < 2 to the search space | Jessica Paquette | 2018-09-20 | 1 | -1/+5
  The suffix tree won't ever consider sequences with a length less than 2. Therefore, we really ought to not even consider them in the first place.
  Also add a FIXME explaining that this should be defined in terms of the size in B of an outlined call versus the size in B of the MBB.
  llvm-svn: 342688
* [RegAllocGreedy] Fix crash in tryLocalSplit | Walter Lee | 2018-09-20 | 1 | -1/+5
  tryLocalSplit only handles a single use block, but an interval may have multiple use blocks. So don't crash in that case. This fixes PR38795.
  Differential revision: https://reviews.llvm.org/D52277
  llvm-svn: 342682
* [MachineOutliner][NFC] Move debug info emission to createOutlinedFunction | Jessica Paquette | 2018-09-20 | 1 | -35/+23
  When you create an outlined function, you know everything you need to know to decide if debug info should be created. If we emit debug info in createOutlinedFunction, then we don't need to keep track of every IR function we create.
  llvm-svn: 342677
* [SelectionDAG] replace duplicated peekThroughBitcast helper functions; NFCI | Sanjay Patel | 2018-09-20 | 2 | -38/+31
  x86 had 2 versions of peekThroughBitcast. DAGCombiner had 1. Plus, it had a 1-off implementation for the one-use variant.
  Move the x86 versions of the code to SelectionDAG, so we don't have different copies of the code. No functional change intended.
  I'm putting this next to isBitwiseNot() because I am planning to use it in there. Another option is next to the helpers in the ISD namespace (e.g., ISD::isConstantSplatVector()). But if there's no good reason for those to be there, I'd prefer to pull other helpers over to SelectionDAG in follow-up steps.
  Differential Revision: https://reviews.llvm.org/D52285
  llvm-svn: 342669
* [DWARF] - Emit the correct value for DW_AT_addr_base. | George Rimar | 2018-09-20 | 5 | -10/+30
  Currently, we emit DW_AT_addr_base that points to the beginning of the .debug_addr section. That is not correct for the DWARF5 case because the address table contains the header and the attribute should point to the first entry following the header.
  This is currently the reason why LLDB does not work with such executables correctly. This patch fixes the issue.
  Differential revision: https://reviews.llvm.org/D52168
  llvm-svn: 342635
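The offset adjustment described in the entry above can be sketched as follows, assuming the DWARF v5 address table header layout of unit_length, version, address_size and segment_selector_size. The enum and the helper names are hypothetical and are not part of LLVM's API.

```cpp
#include <cassert>
#include <cstdint>

enum class DwarfFormat { DWARF32, DWARF64 };

// Size in bytes of a DWARF v5 .debug_addr table header:
// unit_length (4 bytes, or 12 in the 64-bit format), version (2),
// address_size (1), segment_selector_size (1).
constexpr uint64_t addrTableHeaderSize(DwarfFormat F) {
  return (F == DwarfFormat::DWARF64 ? 12 : 4) + 2 + 1 + 1;
}

// Hypothetical helper: the value DW_AT_addr_base should hold for a table
// whose header starts at TableStart within .debug_addr. It skips the
// header so the attribute points at the first address entry.
constexpr uint64_t addrBase(uint64_t TableStart, DwarfFormat F) {
  return TableStart + addrTableHeaderSize(F);
}
```

For a single 32-bit-format table at the start of the section this gives 8 rather than 0, which is the difference between pointing at the header and pointing at the first entry.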