path: root/llvm/lib/CodeGen
Commit message | Author | Age | Files | Lines
* [LegalizeVectorTypes][X86][ARM][AArch64][PowerPC] Don't use SplitVecOp_TruncateHelper for FP_TO_SINT/UINT. | Craig Topper | 2018-11-26 | 1 | -7/+2
    SplitVecOp_TruncateHelper tries to promote the result type while splitting FP_TO_SINT/UINT. It then concatenates the result and introduces a truncate to the original result type. But it does this without inserting the AssertZExt/AssertSExt that the regular result type promotion would insert. Nor does it turn FP_TO_UINT into FP_TO_SINT the way normal result type promotion for these operations does. This is bad on X86, which doesn't support FP_TO_UINT until AVX512.
    This patch disables the use of SplitVecOp_TruncateHelper for these operations and just lets normal promotion handle it. I've tweaked a couple of things in X86ISelLowering to avoid a few obvious regressions there. I believe all the changes on X86 are improvements. The other targets look neutral.
    Differential Revision: https://reviews.llvm.org/D54906
    llvm-svn: 347593
* [SelectionDAG] Teach BaseIndexOffset::match to unwrap the base after looking through an add/or | Craig Topper | 2018-11-26 | 1 | -3/+3
    We might find a target specific node that needs to be unwrapped after we look through an add/or. Otherwise we get inconsistent results if one pointer is just X86WrapperRIP and the other is (add X86WrapperRIP, C).
    Differential Revision: https://reviews.llvm.org/D54818
    llvm-svn: 347591
* [CodeGen] Support custom format of stack maps | Than McIntosh | 2018-11-26 | 1 | -5/+22
    Summary: Add a hook to the GCMetadataPrinter for emitting stack maps in custom format. The hook will be called at stack map generation time. The default stack map format is used if there is no hook. For this to be useful a few data structures and accessors are exposed from the StackMaps class, so the custom printer can access the stack map data.
    This patch authored by Cherry Zhang <cherryyz@google.com>.
    Reviewers: thanm, apilipenko, reames
    Reviewed By: reames
    Subscribers: reames, apilipenko, nemanjai, javed.absar, kbarton, jsji, llvm-commits
    Differential Revision: https://reviews.llvm.org/D53892
    llvm-svn: 347584
* Delete dead code introduced in r347354. | Erich Keane | 2018-11-26 | 1 | -4/+0
    ParentTy is never used other than an assignment, and since it is a pointer, there is no side effect. Some versions of GCC notice and warn on this.
    Change-Id: I37dc1a18c7b58040419afb803621de13d8904a8f
    llvm-svn: 347581
* [CodeGen] Take SPAdj into account for STATEPOINT liveness args | Than McIntosh | 2018-11-26 | 1 | -1/+1
    Summary: STATEPOINT records its args' locations on stack relative to SP. If the SP is changed, take that into account.
    This patch authored by Cherry Zhang <cherryyz@google.com>.
    Reviewers: thanm, reames
    Reviewed By: reames
    Subscribers: reames, llvm-commits
    Differential Revision: https://reviews.llvm.org/D53603
    llvm-svn: 347569
* Remove an unnecessary file; NFC. | Aaron Ballman | 2018-11-26 | 2 | -18/+0
    This source file has not been needed since r346522 and was triggering diagnostics in MSVC about an object file which exports no public symbols (LNK4221).
    llvm-svn: 347565
* [ARM GlobalISel] Support G_CTLZ and G_CTLZ_ZERO_UNDEF | Diana Picus | 2018-11-26 | 1 | -9/+12
    We can now select CLZ via the TableGen'erated code, so support G_CTLZ and G_CTLZ_ZERO_UNDEF throughout the pipeline for types <= s32.

    Legalizer: If the CLZ instruction is available, use it for both G_CTLZ and G_CTLZ_ZERO_UNDEF. Otherwise, use a libcall for G_CTLZ_ZERO_UNDEF and lower G_CTLZ in terms of it. In order to achieve this we need to add support to the LegalizerHelper for the legalization of G_CTLZ_ZERO_UNDEF for s32 as a libcall (__clzsi2). We also need to allow lowering of G_CTLZ in terms of G_CTLZ_ZERO_UNDEF if that is supported as a libcall, as opposed to just if it is Legal or Custom. Due to a minor refactoring of the helper function in charge of this, we will also allow the same behaviour for G_CTTZ and G_CTPOP. This is not going to be a problem in practice since we don't yet have support for treating G_CTTZ and G_CTPOP as libcalls (not even in DAGISel).

    Reg bank select: Map G_CTLZ to GPR. G_CTLZ_ZERO_UNDEF should not make it to this point.

    Instruction select: Nothing to do.

    llvm-svn: 347545
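The relationship between G_CTLZ and G_CTLZ_ZERO_UNDEF used by the legalizer change above can be sketched in scalar C++. This is only an illustration, not the LegalizerHelper code: `ctlzZeroUndef32` and `ctlz32` are hypothetical names, and the loop merely stands in for the CLZ instruction or the `__clzsi2` libcall.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for G_CTLZ_ZERO_UNDEF on s32: the result is only
// meaningful for non-zero inputs, so the precondition is asserted.
inline unsigned ctlzZeroUndef32(std::uint32_t X) {
  assert(X != 0 && "result is undefined for zero");
  unsigned Count = 0;
  // Shift left until the top bit is set, counting the leading zeros.
  while ((X & 0x80000000u) == 0) {
    X <<= 1;
    ++Count;
  }
  return Count;
}

// G_CTLZ must also be defined for zero: return the bit width in that case
// and defer to the zero-undef variant otherwise.
inline unsigned ctlz32(std::uint32_t X) {
  return X == 0 ? 32u : ctlzZeroUndef32(X);
}
```

The only work the zero-defined form adds is the compare-and-select on zero, which is why it can be lowered in terms of the zero-undef form whenever that form is available in any way, including as a libcall.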
* Fix typo in comment. NFC | Diana Picus | 2018-11-26 | 1 | -1/+1
    llvm-svn: 347544
* [x86] limit transform for select-of-fp-constants | Sanjay Patel | 2018-11-25 | 1 | -0/+3
    This should likely be adjusted to limit this transform further, but these diffs should be clear wins. If we have blendv/conditional move, then we should assume those are cheap ops. The loads become independent of the compare, so those can be speculated before we need to use the values in the blend/mov.
    llvm-svn: 347526
* [SelectionDAG] move constant or splat functions to common location | Sanjay Patel | 2018-11-25 | 2 | -39/+28
    rL347502 moved the null sibling, so we should group all of these together. I'm not sure why these aren't methods of the SDValue class itself, but that's another patch if that's possible.
    llvm-svn: 347523
* [DAG] consolidate shift simplifications | Sanjay Patel | 2018-11-23 | 2 | -74/+58
    ...and use them to avoid creating obviously undef values as discussed in the post-commit thread for r347478. The diffs in vector div/rem show that we were missing real optimizations by creating bogus shift nodes.
    llvm-svn: 347502
* Revert r347490 as it breaks address sanitizer builds | Luke Cheeseman | 2018-11-23 | 6 | -15/+0
    llvm-svn: 347499
* Revert r343341 | Luke Cheeseman | 2018-11-23 | 6 | -0/+15
    - Cannot reproduce the build failure locally and the build logs have been deleted.
    llvm-svn: 347490
* [LegalizeVectorTypes] Don't use SplitVecOp_TruncateHelper if we're heading towards scalarizing the type. | Craig Topper | 2018-11-23 | 1 | -0/+9
    This code takes a truncate, fp_to_int, or int_to_fp with a legal result type and an input type that needs to be split, and enlarges the elements in the result type before doing the split. It then inserts a follow-up truncate or fp_round after concatenating the two halves back together. But if the input type of the original op is being split on its way to ultimately being scalarized, we're just going to end up building a vector from scalars and then truncating or rounding it in the vector register. It seems silly to enlarge the result element type of the operation only to end up with scalar code, and then to build a vector with large elements only to make the elements smaller again in the vector register. It seems better to just try to get away with producing smaller result types in the scalarized code.
    The X86 test case that changes is a pretty contrived test case that exists because of a bug we used to have in our AVG matching code. I think the code is better now, but it's not realistic anyway.
    llvm-svn: 347482
* [LegalizeVectorTypes] Have SplitVecOp_TruncateHelper fall back to SplitVecOp_UnaryOp if splitting the output type would be a legal type. | Craig Topper | 2018-11-22 | 1 | -1/+7
    SplitVecOp_TruncateHelper tries to introduce a multilevel truncate to avoid scalarization. But if splitting the result type would still be a legal type we don't need to do that. The comment block at the top of the function implied that this was already implemented. I looked back through the history and it doesn't look to have ever been checked.
    llvm-svn: 347479
* [DAGCombiner] form 'not' ops ahead of shifts (PR39657) | Sanjay Patel | 2018-11-22 | 1 | -0/+21
    We fail to canonicalize IR this way (prefer 'not' ops to arbitrary 'xor'), but that would not matter without this patch because DAGCombiner was reversing that transform. I think we need this transform in the backend regardless of what happens in IR to catch cases where the shift-xor is formed late from GEP or other ops.

    https://rise4fun.com/Alive/NC1

    Name: shl
    Pre: (-1 << C2) == C1
    %shl = shl i8 %x, C2
    %r = xor i8 %shl, C1
    =>
    %not = xor i8 %x, -1
    %r = shl i8 %not, C2

    Name: shr
    Pre: (-1 u>> C2) == C1
    %sh = lshr i8 %x, C2
    %r = xor i8 %sh, C1
    =>
    %not = xor i8 %x, -1
    %r = lshr i8 %not, C2

    https://bugs.llvm.org/show_bug.cgi?id=39657
    llvm-svn: 347478
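The two Alive rules above can also be checked by brute force for i8, since there are only 256 values and 8 shift amounts. The following is a minimal sketch of such a check, not part of the patch; the function names are made up for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Exhaustively checks, for i8, that xor-after-shift with the matching mask
// equals shift-after-not:
//   (x << C2) ^ (-1 << C2)   ==  (~x) << C2
//   (x u>> C2) ^ (-1 u>> C2) ==  (~x) u>> C2
inline bool notShiftIdentityHoldsForI8() {
  for (unsigned C2 = 0; C2 < 8; ++C2) {
    // C1 as required by the preconditions of the two rules.
    const auto ShlMask = static_cast<std::uint8_t>(0xFFu << C2);
    const auto ShrMask = static_cast<std::uint8_t>(0xFFu >> C2);
    for (unsigned V = 0; V < 256; ++V) {
      const auto X = static_cast<std::uint8_t>(V);
      const auto NotX = static_cast<std::uint8_t>(~X);
      // Every intermediate is truncated back to 8 bits to model i8.
      const auto ShlBefore = static_cast<std::uint8_t>((X << C2) ^ ShlMask);
      const auto ShlAfter = static_cast<std::uint8_t>(NotX << C2);
      const auto ShrBefore = static_cast<std::uint8_t>((X >> C2) ^ ShrMask);
      const auto ShrAfter = static_cast<std::uint8_t>(NotX >> C2);
      if (ShlBefore != ShlAfter || ShrBefore != ShrAfter)
        return false;
    }
  }
  return true;
}

// Usage: abort (in builds with assertions enabled) if the identity fails.
inline void checkNotShiftIdentity() { assert(notShiftIdentityHoldsForI8()); }
```

The check passes because the mask C1 has ones exactly where the shift leaves bits of x, so the xor flips the same bits that a 'not' applied before the shift would have flipped.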
* [mingw] Use unmangled name after the $ in the section name | Reid Kleckner | 2018-11-21 | 1 | -2/+3
    GCC does it this way, and we have to be consistent. This includes stdcall and fastcall functions with suffixes. I confirmed that a fastcall function named "foo" ends up in ".text$foo", not ".text$@foo@8".
    Based on a patch by Andrew Yohn!
    Fixes PR39218.
    Differential Revision: https://reviews.llvm.org/D54762
    llvm-svn: 347431
* [DAGCombiner] refactor select-of-FP-constants transform | Sanjay Patel | 2018-11-21 | 1 | -53/+60
    This transform needs to be limited. We are converting to a constant pool load very early, and we are turning loads that are independent of the select condition (and therefore speculatable) into a dependent non-speculatable load. We may also be transferring a condition code from an FP register to integer to create that dependent load.
    llvm-svn: 347424
* [DAGCombiner] reduce code duplication; NFC | Sanjay Patel | 2018-11-21 | 1 | -33/+30
    llvm-svn: 347410
* [TargetLowering] SimplifyDemandedBits - only reduce known bits for integer constants | Simon Pilgrim | 2018-11-21 | 1 | -1/+3
    Avoids fuzzing crash found by Mikael Holmén.
    llvm-svn: 347393
* Test commit: Delete trailing space in comment | Nikita Popov | 2018-11-21 | 1 | -1/+1
    llvm-svn: 347385
* [DAGCombiner] look through bitcasts when trying to narrow vector binops | Sanjay Patel | 2018-11-20 | 1 | -13/+14
    This is another step in vector narrowing - a follow-up to D53784 (and hoping to eventually squash potential regressions seen in D51553).
    The x86 test diffs are wins, but the AArch64 diff is probably not. That problem already exists independent of this patch (see PR39722), but it went unnoticed in the previous patch because there were no regression tests that showed the possibility.
    The x86 diff in i64-mem-copy.ll is close. Given the frequency throttling concerns with using wider vector ops, an extra extract to reduce vector width is the right trade-off at this level of codegen.
    Differential Revision: https://reviews.llvm.org/D54392
    llvm-svn: 347356
* [CodeView] Add support for ref-qualified member functions. | Zachary Turner | 2018-11-20 | 2 | -21/+49
    When you have a member function with a ref-qualifier, for example:

    struct Foo {
      void Func() &;
      void Func2() &&;
    };

    clang-cl was not emitting this information. Doing so is a bit awkward, because it's not a property of the LF_MFUNCTION type, which is what you'd expect. Instead, it's a property of the this pointer which is actually an LF_POINTER. This record has an attributes bitmask on it, and our handling of this bitmask was all wrong. We had some parts of the bitmask defined incorrectly, but importantly for this bug, we didn't know about these extra 2 bits that represent the ref qualifier at all.
    Differential Revision: https://reviews.llvm.org/D54667
    llvm-svn: 347354
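For readers unfamiliar with ref-qualifiers, here is a small language-level sketch of what the two qualifiers in the example above mean. It says nothing about the CodeView encoding itself; the overload set and helper names are invented for illustration.

```cpp
#include <cassert>
#include <string>
#include <utility>

struct Foo {
  // Selected when the object expression is an lvalue.
  std::string Func() & { return "lvalue"; }
  // Selected when the object expression is an rvalue.
  std::string Func() && { return "rvalue"; }
};

inline std::string callOnLvalue() {
  Foo F;
  return F.Func(); // F is a named object, so the & overload is chosen
}

inline std::string callOnTemporary() {
  return Foo{}.Func(); // a temporary is an rvalue, so the && overload is chosen
}

inline std::string callOnMoved() {
  Foo F;
  return std::move(F).Func(); // std::move yields an rvalue reference
}

// Usage: the qualifier decides which overload is selected.
inline void checkRefQualifiers() {
  assert(callOnLvalue() == "lvalue");
  assert(callOnTemporary() == "rvalue");
}
```

Because the qualifier constrains the implicit object parameter rather than the function's signature proper, it is plausible that a debug format would attach it to the `this` pointer record, as the commit message describes.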
* [CodeView] Mark this pointers as const. | Zachary Turner | 2018-11-20 | 1 | -0/+3
    This is for compatibility with MSVC, which also marks this pointers as being const-qualified.
    Fixes llvm.org/pr36526
    Differential Revision: https://reviews.llvm.org/D54736
    llvm-svn: 347353
* [DAGCombine] Add calls to SimplifyDemandedVectorElts from visitINSERT_SUBVECTOR (PR37989) | Simon Pilgrim | 2018-11-20 | 2 | -1/+5
    This uncovered an off-by-one typo in SimplifyDemandedVectorElts's INSERT_SUBVECTOR handling as its bounds check was bailing on safe indices.
    llvm-svn: 347313
* [TargetLowering] Improve SimplifyDemandedVectorElts/SimplifyDemandedBits support | Simon Pilgrim | 2018-11-20 | 1 | -0/+17
    For bitcast nodes from larger element types, add the ability for SimplifyDemandedVectorElts to call SimplifyDemandedBits by merging the elts mask to a bits mask.
    I've raised https://bugs.llvm.org/show_bug.cgi?id=39689 to deal with the few places where SimplifyDemandedBits's lack of vector handling is a problem.
    Differential Revision: https://reviews.llvm.org/D54679
    llvm-svn: 347301
* [SelectionDAG] Compute known bits and num sign bits for live out vector registers. Use it to add AssertZExt/AssertSExt in the live in basic blocks | Craig Topper | 2018-11-20 | 2 | -4/+4
    Summary: We already support this for scalars, but it was explicitly disabled for vectors. In the updated test cases this allows us to see that the upper bits are zero and so use fewer multiply instructions to emulate a 64 bit multiply.
    This should help with this ispc issue that a coworker pointed me to: https://github.com/ispc/ispc/issues/1362
    Reviewers: spatel, efriedma, RKSimon, arsenm
    Reviewed By: spatel
    Subscribers: wdng, llvm-commits
    Differential Revision: https://reviews.llvm.org/D54725
    llvm-svn: 347287
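Why known-zero upper bits save multiplies can be seen in a scalar sketch of 64-bit multiply emulation. This is an illustration under assumed names (`mul64General`, `mul64UpperBitsZero`), not the code the backend emits: it only models building the low 64 bits of a product from 32x32->64 multiplies.

```cpp
#include <cassert>
#include <cstdint>

// Low 64 bits of A * B built from 32x32->64 multiplies: three multiplies in
// the general case (lo*lo plus the two cross terms).
inline std::uint64_t mul64General(std::uint64_t A, std::uint64_t B) {
  const std::uint64_t ALo = A & 0xFFFFFFFFu, AHi = A >> 32;
  const std::uint64_t BLo = B & 0xFFFFFFFFu, BHi = B >> 32;
  // The cross terms may wrap, but the shift discards the wrapped bits anyway.
  const std::uint64_t Cross = ALo * BHi + AHi * BLo;
  return ALo * BLo + (Cross << 32);
}

// If the upper 32 bits of both operands are known to be zero, both cross
// terms vanish and a single multiply is enough.
inline std::uint64_t mul64UpperBitsZero(std::uint64_t A, std::uint64_t B) {
  return (A & 0xFFFFFFFFu) * (B & 0xFFFFFFFFu);
}

// Usage: both forms agree whenever the upper halves really are zero.
inline void checkMulEmulation() {
  assert(mul64UpperBitsZero(7u, 6u) == mul64General(7u, 6u));
}
```

Propagating known bits across basic blocks is what lets the combiner prove the precondition of the cheaper form when the zero-extension happened in a different block.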
* [DAGCombiner] reduce code duplication in visitXOR; NFC | Sanjay Patel | 2018-11-20 | 1 | -32/+29
    llvm-svn: 347278
* Implement computeKnownBits for scalar_to_vector | Stanislav Mekhanoshin | 2018-11-19 | 1 | -0/+13
    Differential Revision: https://reviews.llvm.org/D54728
    llvm-svn: 347274
* [DAGCombine] SimplifyNodeWithTwoResults - ensure same legalization for LO/HI operands (PR21207) | Simon Pilgrim | 2018-11-19 | 1 | -8/+6
    Consistently use (!LegalOperations || isOperationLegalOrCustom) for all node pairs.
    Differential Revision: https://reviews.llvm.org/D53478
    llvm-svn: 347255
* Fix Wdocumentation warning. NFCI. | Simon Pilgrim | 2018-11-19 | 1 | -1/+1
    llvm-svn: 347253
* Fix unused function warning. | Simon Pilgrim | 2018-11-19 | 1 | -6/+0
    llvm-svn: 347252
* [TargetLowering] expandFP_TO_UINT - improve fp16 support | Simon Pilgrim | 2018-11-19 | 1 | -10/+18
    As discussed on D53794, for float types with ranges smaller than the destination integer type, we should be able to just use a regular FP_TO_SINT opcode.
    I thought we'd need to provide MSA test cases for very small integer types as well (fp16 -> i8 etc.), but it turns out that promotion will kick in so they're unnecessary.
    Differential Revision: https://reviews.llvm.org/D54703
    llvm-svn: 347251
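The range argument above can be sketched in scalar C++. This is a hedged illustration, not the expandFP_TO_UINT implementation: the function names are hypothetical, `float` stands in for the source type because standard C++ has no portable fp16, and inputs are assumed to be non-negative and below 2^32.

```cpp
#include <cassert>
#include <cstdint>

// Largest finite IEEE half-precision value; far below 2^31.
constexpr float MaxFP16 = 65504.0f;

// General unsigned conversion built from a signed one: values at or above
// 2^31 are moved into signed range first and the top bit is restored after.
// Assumes 0 <= X < 2^32.
inline std::uint32_t fpToUIntViaSInt(float X) {
  constexpr float TwoPow31 = 2147483648.0f;
  if (X < TwoPow31)
    return static_cast<std::uint32_t>(static_cast<std::int32_t>(X));
  const auto Low = static_cast<std::int32_t>(X - TwoPow31);
  return static_cast<std::uint32_t>(Low) ^ 0x80000000u;
}

// Shortcut: if the source type cannot represent a value outside the signed
// range of the destination, the plain signed conversion is already correct.
inline std::uint32_t fp16RangeToUInt(float X) {
  assert(X >= 0.0f && X <= MaxFP16 && "value must fit the fp16 range");
  return static_cast<std::uint32_t>(static_cast<std::int32_t>(X));
}
```

For an fp16 source the compare, subtract, and xor of the general form are dead weight, since no finite fp16 value can reach 2^31.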
* Add missing stream operator for Polynomial class to fix debug builds. | Simon Pilgrim | 2018-11-19 | 1 | -0/+7
    llvm-svn: 347249
* [InterleavedLoadCombine] Fix warnings | Martin Elshuber | 2018-11-19 | 1 | -6/+1
    * remove unused function
    * fix compare
    llvm-svn: 347241
* [DebugInfo] DISubprogram flags get their own flags word. NFC. | Paul Robinson | 2018-11-19 | 1 | -2/+3
    This will hold flags specific to subprograms. In the future we could potentially free up scarce bits in DIFlags by moving subprogram-specific flags from there to the new flags word.
    This patch does not change IR/bitcode formats, that will be done in a follow-up.
    Differential Revision: https://reviews.llvm.org/D54597
    llvm-svn: 347239
* [InterleavedLoadCombine] Fix warning unused variable | Martin Elshuber | 2018-11-19 | 1 | -2/+0
    Differential Revision: https://reviews.llvm.org/D52653
    llvm-svn: 347229
* [SelectionDAG] simplify vector select with undef operand(s) | Sanjay Patel | 2018-11-19 | 2 | -5/+12
    llvm-svn: 347227
* [InterleavedLoadCombine] Remove unused include. NFC. | Benjamin Kramer | 2018-11-19 | 1 | -1/+0
    llvm-svn: 347226
* [SelectionDAG] simplify select FP with undef condition | Sanjay Patel | 2018-11-19 | 1 | -1/+1
    llvm-svn: 347212
* [SelectionDAG] add simplifySelect() to reduce code duplication; NFC | Sanjay Patel | 2018-11-19 | 2 | -36/+27
    This should be extended to handle FP and vectors in follow-up patches.
    llvm-svn: 347210
* Subject: [PATCH] [CodeGen] Add pass to combine interleaved loads. | Martin Elshuber | 2018-11-19 | 3 | -0/+1368
    This patch defines an interleaved-load-combine pass. The pass searches for ShuffleVector instructions that represent interleaved loads. Matches are converted such that they will be captured by the InterleavedAccessPass.
    The pass extends LLVM's capabilities to use target specific instruction selection of interleaved load patterns (e.g.: ld4 on AArch64 architectures).
    Differential Revision: https://reviews.llvm.org/D52653
    llvm-svn: 347208
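As background for the commit above, the memory access pattern that an interleaved load such as ld4 implements can be sketched in scalar C++. This shows only the data movement, not how the pass matches ShuffleVector instructions; `deinterleave` is a made-up helper.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Splits memory laid out as a0 b0 c0 d0 a1 b1 c1 d1 ... into Factor
// separate lanes (a*, b*, c*, d* for Factor == 4). This is the effect an
// interleaved load with the given factor has on consecutive elements.
inline std::vector<std::vector<int>> deinterleave(const std::vector<int> &Data,
                                                  std::size_t Factor) {
  assert(Factor > 0 && "interleave factor must be positive");
  std::vector<std::vector<int>> Lanes(Factor);
  for (std::size_t I = 0; I < Data.size(); ++I)
    Lanes[I % Factor].push_back(Data[I]); // element I belongs to lane I mod Factor
  return Lanes;
}
```

In IR each lane typically appears as a shufflevector with a strided mask over the loaded data, which is the kind of pattern the pass is described as searching for.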
* Fix disturbing warning - NFCI | Serge Guelton | 2018-11-19 | 1 | -1/+1
    llvm-svn: 347186
* [ProfileSummary] Standardize methods and fix comment | Vedant Kumar | 2018-11-19 | 1 | -1/+1
    Every Analysis pass has a get method that returns a reference to the Result of the Analysis, for example, BlockFrequencyInfo &BlockFrequencyInfoWrapperPass::getBFI(). I believe that ProfileSummaryInfo::getPSI() is the only exception to that, as it was returning a pointer.
    Another change is renaming isHotBB and isColdBB to isHotBlock and isColdBlock, respectively. Most methods use BB in argument or variable names, while method names usually refer to Basic Blocks as Blocks instead of BB. For example, Function::getEntryBlock, Loop::getExitBlock, etc.
    I also fixed one of the comments.
    Patch by Rodrigo Caetano Rocha!
    Differential Revision: https://reviews.llvm.org/D54669
    llvm-svn: 347182
* [DAG] add undef simplifications for select nodes | Sanjay Patel | 2018-11-18 | 2 | -14/+29
    Sadly, this duplicates (twice) the logic from InstSimplify. There might be some way to at least share the DAG versions of the code, but copying the folds seems to be the standard method to ensure that we don't miss these folds.
    Unlike in IR, we don't run DAGCombiner to fixpoint, so there's no way to ensure that we do these kinds of simplifications unless the code is repeated at node creation time and during combines.
    There were other tests that would become worthless with this improvement that I changed as pre-commits:
    rL347161
    rL347164
    rL347165
    rL347166
    rL347167
    I'm not sure how to salvage the remaining tests (diffs in this patch). So the x86 tests verify that the new code is working as intended. The AMDGPU test is actually similar to my motivating case: we have some undef value that has survived to machine IR in an x86 test, and then it gets folded in some weird way, or we crash if we don't transfer the undef flag. But we would have been better off never getting to that point by doing these simplifications.
    This will lead back to PR32023 someday... https://bugs.llvm.org/show_bug.cgi?id=32023
    llvm-svn: 347170
* [SelectionDAG] simplify code; NFC | Sanjay Patel | 2018-11-18 | 1 | -6/+5
    llvm-svn: 347160
* Use llvm::copy. NFC | Fangrui Song | 2018-11-17 | 5 | -9/+7
    llvm-svn: 347126
* DAG combiner: fold (select, C, X, undef) -> X | Stanislav Mekhanoshin | 2018-11-16 | 1 | -0/+6
    Differential Revision: https://reviews.llvm.org/D54646
    llvm-svn: 347110
* [LegalizeVectorOps] After custom legalizing an extending load or a truncating store, make sure the custom code is also legal. | Craig Topper | 2018-11-16 | 1 | -2/+10
    For example, on X86 we emit a sign_extend_vector_inreg from LowerLoad and without sse4.1 this node will need further legalization. Previously this sign_extend_vector_inreg was being custom lowered during DAG legalization instead of vector op legalization. Unfortunately, this doesn't seem to matter for the output of any existing lit tests.
    llvm-svn: 347094
* [codeview] Expose -gcodeview-ghash for global type hashing | Reid Kleckner | 2018-11-16 | 2 | -3/+8
    Summary: Experience has shown that the functionality is useful. It makes linking optimized clang with debug info for me a lot faster, 20s to 13s. The type merging phase of PDB writing goes from 10s to 3s.
    This removes the LLVM cl::opt and replaces it with a metadata flag.
    After this change, users can do the following to use ghash:
    - add -gcodeview-ghash to compiler flags
    - replace /DEBUG with /DEBUG:GHASH in linker flags
    Reviewers: zturner, hans, thakis, takuto.ikuta
    Subscribers: aprantl, hiraditya, JDevlieghere, llvm-commits
    Differential Revision: https://reviews.llvm.org/D54370
    llvm-svn: 347072