summaryrefslogtreecommitdiffstats
path: root/llvm/lib/CodeGen
Commit message (Collapse)AuthorAgeFilesLines
...
* [DAGCombiner] (insert_vector_elt (vector_shuffle X, Y), (extract_vector_elt ↵Amaury Sechet2019-08-291-4/+43
| | | | | | | | | | | | | | | | X, N), IdxC) -> (vector_shuffle X, Y) Summary: This is beneficial when the shuffle is only used once and end up being generated in a few places when some node is combined into a shuffle. Reviewers: craig.topper, efriedma, RKSimon, lebedev.ri Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66718 llvm-svn: 370326
* LegalizeSetCCCondCode - Reduce scope of NeedSwap to fix cppcheck warning. NFCI.Simon Pilgrim2019-08-291-1/+1
| | | | | | No need for this to be defined outside the only switch case its used in. llvm-svn: 370320
* [X86] Make inline assembly 'x' and 'v' constraints work for f128.Craig Topper2019-08-291-2/+6
| | | | | | | | | Including a type legalizer fix to make bitcast operand promotion work correctly when getSoftenedFloat returns f128 instead of i128. Fixes PR43157 llvm-svn: 370293
* [RISCV] Avoid generating AssertZext for LP64 ABI when lowering floating LibCallShiva Chen2019-08-282-8/+66
| | | | | | | | | | | | | | | | | | | | | | | | | The patch fixed the issue that RV64 didn't clear the upper bits when return complex floating value with lp64 ABI. float _Complex complex_add(float _Complex a, float _Complex b) { return a + b; } RealResult = zero_extend(RealA + RealB) ImageResult = ImageA + ImageB Return (RealResult | (ImageResult << 32)) The patch introduces shouldExtendTypeInLibCall target hook to suppress the AssertZext generation when lowering floating LibCall. Thanks to Eli's comments from the Bugzilla https://bugs.llvm.org/show_bug.cgi?id=42820 Differential Revision: https://reviews.llvm.org/D65497 llvm-svn: 370275
* [FPEnv] Add fptosi and fptoui constrained intrinsics.Kevin P. Neal2019-08-2810-14/+141
| | | | | | | | | | | | | | | | | This implements constrained floating point intrinsics for FP to signed and unsigned integers. Quoting from D32319: The purpose of the constrained intrinsics is to force the optimizer to respect the restrictions that will be necessary to support things like the STDC FENV_ACCESS ON pragma without interfering with optimizations when these restrictions are not needed. Reviewed by: Andrew Kaylor, Craig Topper, Hal Finkel, Cameron McInally, Roman Lebedev, Kit Barton Approved by: Craig Topper Differential Revision: http://reviews.llvm.org/D63782 llvm-svn: 370228
* [AArch64][GlobalISel] Fall back when translating musttail callsJessica Paquette2019-08-281-0/+1
| | | | | | | | | | These are currently translated as normal functions calls in AArch64. Until we have proper tail call lowering, we shouldn't translate these. Differential Revision: https://reviews.llvm.org/D66842 llvm-svn: 370225
* [AMDGPU] Adjust number of SGPRs available in Calling ConventionRyan Taylor2019-08-281-14/+4
| | | | | | | | | This reduces the number of SGPRs due to some concerns about running out of SGPRs if you make all the SGPRs that aren't reserved available for the calling convention. Change-Id: Idb4ca4dc72f5b6808cb524ff7270915a8de5b4c1 llvm-svn: 370215
* [DAGCombine] Fix cppcheck shadow variable warning. NFCI.Simon Pilgrim2019-08-281-4/+4
| | | | | | We already have an outer Ops variable. llvm-svn: 370197
* [TargetLowering] Add buildLegalVectorShuffle facility to help build legal ↵Amaury Sechet2019-08-282-68/+66
| | | | | | | | | | | | | | | | shuffles Summary: There are at least 2 ways to express the same shuffle. Various pieces of code explicit check for both option, but other places do not when they would benefit from doing it. This patches refactor the codebase to use buildLegalVectorShuffle in order to make that behavior more consistent. Reviewers: craig.topper, efriedma, RKSimon, lebedev.ri Subscribers: javed.absar, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66804 llvm-svn: 370190
* [DAGCombine] Remove LoadedSlice::Cost default 'ForCodeSize' constructor ↵Simon Pilgrim2019-08-281-3/+3
| | | | | | | | arguments. NFCI. These were always being passed in and it allowed me to add the explicit tag to stop a cppcheck warning about 1 argument constructors. llvm-svn: 370189
* [GlobalISel] Replace hard coded dynamic alloca handling with G_DYN_STACKALLOC.Amara Emerson2019-08-273-26/+64
| | | | | | | | | | | This change moves the actual stack pointer manipulation into the legalizer, available to targets via lower(). The codegen is slightly different because we're using explicit masks instead of G_PTRMASK, and using G_SUB rather than adding a negative amount via G_GEP. Differential Revision: https://reviews.llvm.org/D66678 llvm-svn: 370104
* DAG: computeNumSignBits for MULMatt Arsenault2019-08-271-0/+12
| | | | | | | | | | Copied directly from the IR version. Most of the testcases I've added for this are somewhat problematic because they really end up testing the yet to be implemented version for MUL_I24/MUL_U24. llvm-svn: 370099
* [DAGCombiner] cancel fnegs from multiplied operands of FMASanjay Patel2019-08-271-15/+29
| | | | | | | | | | | | | | | | | | (-X) * (-Y) + Z --> X * Y + Z This is a missing optimization that shows up as a potential regression in D66050, so we should solve it first. We appear to be partly missing this fold in IR as well. We do handle the simpler case already: (-X) * (-Y) --> X * Y And it might be beneficial to make the constraint less conservative (eg, if both operands are cheap, but not necessarily cheaper), but that causes infinite looping for the existing fmul transform. Differential Revision: https://reviews.llvm.org/D66755 llvm-svn: 370071
* Revert "[CodeGen] Do the Simple Early Return in block-placement pass to ↵Jinsong Ji2019-08-271-40/+0
| | | | | | | | | | | optimize the blocks" This reverts commit b3d258fc44b588f06eb35f8e4b9a6d1fc859acec. @skatkov is reporting crash in D63972#1646303 Contacted @ZhangKang, and revert the commit on behalf of him. llvm-svn: 370069
* [GlobalISel] Factor narrowScalar for G_ASHR and G_LSHR. NFCPetar Avramovic2019-08-271-27/+11
| | | | | | | | | Main difference is in the way Hi for Long shift (HiL) is made. G_LSHR fills HiL with zeros, while G_ASHR fills HiL with sign bit value. Differential Revision: https://reviews.llvm.org/D66589 llvm-svn: 370064
* [GlobalISel] Fix narrowScalar for shifts to match algorithm from SDAGPetar Avramovic2019-08-271-10/+10
| | | | | | | | | Fix typos. Use Hi and Lo prefixes for Or instead of LHS and RHS to match names of surrounding variables. Differential Revision: https://reviews.llvm.org/D66587 llvm-svn: 370062
* [DAGCombiner] Add node to the worklist in topological order in ↵Amaury Sechet2019-08-271-4/+4
| | | | | | | | | | | | | | | | parallelizeChainedStores Summary: As per title. Reviewers: craig.topper, efriedma, RKSimon, lebedev.ri Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66659 llvm-svn: 370056
* [DAGCombiner] Add node to the worklist in topological order after ↵Amaury Sechet2019-08-271-1/+1
| | | | | | | | | | | | | | | | relegalization. Summary: As per title. Reviewers: craig.topper, efriedma, RKSimon, lebedev.ri Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66702 llvm-svn: 370040
* [SelectionDAGBuilder] Hide existence of ConstantDataVector vector from ↵Craig Topper2019-08-271-6/+5
| | | | | | | | | | | | | | | visitGetElementPtr. ConstantDataVector is a specialized verison of ConstantVector that stores data in a packed array of bits instead of as individual pointers to other Constants. But we really shouldn't expose that if we can void it. And we should handle regular ConstantVector equally well. This removes a dyn_cast to ConstantDataVector and just calls getSplatValue directly on a Constant* if the type is a vector. llvm-svn: 370018
* [SelectionDAGBuilder] Fix typo in comment. NFCCraig Topper2019-08-271-1/+1
| | | | llvm-svn: 370017
* Revert r369927 - [DAGCombiner] Remove a bunch of redundant AddToWorklist calls.Richard Trieu2019-08-271-20/+121
| | | | | | | This change causes instrumented builds of Clang to have a fatal error in the backend. https://reviews.llvm.org/D66537 has the details. llvm-svn: 370006
* Debug Info: Support for DW_AT_export_symbols for anonymous structsShafik Yaghmour2019-08-261-0/+3
| | | | | | | | | | | | | | | | | | | | This implements the DWARF 5 feature described in: http://dwarfstd.org/ShowIssue.php?issue=141212.1 To support recognizing anonymous structs: struct A { struct { // Anonymous struct int y; }; } a This patch adds support for the new flag in constructTypeDIE(...) and test to verify this change. Differential Revision: https://reviews.llvm.org/D66605 llvm-svn: 369969
* [DWARF] Rename getDwarf5OrGNUCallSite{Attr,Tag}, NFCVedant Kumar2019-08-263-23/+17
| | | | llvm-svn: 369967
* [DWARF] Pick the DWARF5 OP_entry_value opcode on DarwinVedant Kumar2019-08-263-8/+23
| | | | | | | Use the GNU extension for OP_entry_value consistently (i.e. whenever GNU extensions are used for TAG_call_site). llvm-svn: 369966
* [DAGCombiner][X86] Teach SimplifyVBinOp to fold VBinOp (concat X, ↵Craig Topper2019-08-261-17/+19
| | | | | | | | | | undef/constant), (concat Y, undef/constant) -> concat (VBinOp X, Y), VecC This improves the combine I included in D66504 to handle constants in the upper operands of the concat. If we can constant fold them away we can pull the concat after the bin op. This helps with chains of madd reductions on X86 from loop unrolling. The loop madd reduction pattern creates pmaddwd with half the width of the add that follows it using zeroes to fill the upper bits. If we have two of these added together we can pull the zeroes through the accumulating add and then shrink it. Differential Revision: https://reviews.llvm.org/D66680 llvm-svn: 369937
* [DAGCombiner] Remove a bunch of redundant AddToWorklist calls.Amaury Sechet2019-08-261-121/+20
| | | | | | | | | | | | | | | | | Summary: This comes as a first step toward processing the DAG nodes in topological orders. Doing so ensure that arguments of a node are combined before the node itself is combined, which exposes ore opportunities for optimization and/or reduce the amount of patterns a node has to match for. DAGCombiner adding nodes to the worklist is various places causes the nodes to be in a different order from what is expected. In addition, this is reduant because these nodes end up being added to the worklist anyways due to the machinery at line 1621. Reviewers: craig.topper, efriedma, RKSimon, lebedev.ri Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66537 llvm-svn: 369927
* [X86][DAGCombiner] Teach narrowShuffle to use concat_vectors instead of ↵Craig Topper2019-08-251-0/+31
| | | | | | | | | | | | | | | | | | | | | inserting into undef Summary: Concat_vectors is more canonical during early DAG combine. For example, its what's used by SelectionDAGBuilder when converting IR shuffles into SelectionDAG shuffles when element counts between inputs and mask don't match. We also have combines in DAGCombiner than can pull concat_vectors through a shuffle. See partitionShuffleOfConcats. So it seems like concat_vectors is a better operation to use here. I had to teach DAGCombiner's SimplifyVBinOp to also handle concat_vectors with undef. I haven't checked yet if we can remove the INSERT_SUBVECTOR version in there or not. I didn't want to mess with the other caller of getShuffleHalfVectors that's used during shuffle lowering where insert_subvector probably is what we want to produce so I've enabled this via a boolean passed to the function. Reviewers: spatel, RKSimon Reviewed By: RKSimon Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66504 llvm-svn: 369872
* [PowerPC][AIX] Adds support for writing the .data section in assembly filesXing Xue2019-08-252-2/+6
| | | | | | | | | | | | | | | | | Summary: Adds support for generating the .data section in assembly files for global variables with a non-zero initialization. The support for writing the .data section in XCOFF object files will be added in a follow-on patch. Any relocations are not included in this patch. Reviewers: hubert.reinterpretcast, sfertile, jasonliu, daltenty, Xiangling_L Reviewed by: hubert.reinterpretcast Subscribers: nemanjai, hiraditya, kbarton, MaskRay, jsji, wuzish, shchenz, DiggerLin, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66154 llvm-svn: 369869
* [SDAG] Fold umul_lohi with 0 or 1 multiplicandNikita Popov2019-08-251-0/+12
| | | | | | | | | | These can turn up during multiplication legalization. In principle these should also apply to smul_lohi, but I wasn't able to figure out how to produce those with the necessary operands. Differential Revision: https://reviews.llvm.org/D66380 llvm-svn: 369864
* Removing block comments from CodeView records in assembly files & related ↵Nilanjana Basu2019-08-251-27/+0
| | | | | | code cleanup llvm-svn: 369860
* [GlobalISel] Introduce a G_DYN_STACKALLOC opcode to represent dynamic allocas.Amara Emerson2019-08-241-0/+21
| | | | | | | | | This just adds the opcode and verifier, it will be used to replace existing dynamic alloca handling in a subsequent patch. Differential Revision: https://reviews.llvm.org/D66677 llvm-svn: 369833
* [LLVM][NFC] remove unused fieldsGuillaume Chatelet2019-08-231-2/+0
| | | | | | | | | | | | | | | | | | | Summary: Here is the commit introducing the fields https://github.com/llvm/llvm-project/commit/cf6749e4c091 It dates back from 2006 and was used by AArch64 backend. There is no more reference to these fields in the whole codebase so I think it's fine. Reviewers: courbet Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66683 llvm-svn: 369810
* [GlobalISel] Legalizer: Retry combining illegal artifacts as long as there ↵Volkan Keles2019-08-231-3/+28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | new artifacts Summary: Currently, Legalizer aborts if it’s unable to legalize artifacts. However, it’s possible to combine them after processing the rest of the instruction because the legalization is likely to generate more artifacts that allow ArtifactCombiner to combine away them. Instead, move illegal artifacts to another list called RetryList and wait until all of the instruction in InstList are legalized. After that, check if there is any new artifacts and try to combine them again if that’s the case. If not, abort. The idea is similar to D59339, but the approach is a bit different. This patch fixes the issue described above, but the legalizer still may be unable to handle some cases depending on when to legalize artifacts. So, in the long run, we probably need a different legalization strategy that handles this dependency in a better way. Reviewers: dsanders, aditya_nandakumar, qcolombet, arsenm, aemerson, paquette Reviewed By: dsanders Subscribers: jvesely, wdng, nhaehnle, rovka, javed.absar, hiraditya, Petar.Avramovic, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65894 llvm-svn: 369805
* Do a sweep of symbol internalization. NFC.Benjamin Kramer2019-08-232-11/+15
| | | | llvm-svn: 369803
* RegScavenger: Use RegisterMatt Arsenault2019-08-231-17/+17
| | | | llvm-svn: 369794
* [SelectionDAG][X86] Enable iX SimplifyDemandedBits to vXi1 ↵Craig Topper2019-08-231-3/+1
| | | | | | | | | | | | | | | | SimplifyDemandedVectorElts simplification. Add a hack to X86 to avoid a regression Patch showing the effect of enabling bool vector oversimplification. Non-VLX builds can simplify a kshift shuffle, but VLX builds simplify: insert_subvector v8i zeroinitializer, v2i --> insert_subvector v8i undef, v2i Preventing the removal of the AND to clear the upper bits of result Differential Revision: https://reviews.llvm.org/D53022 llvm-svn: 369780
* [DebugInfo] Remove invalidated locations during LiveDebugValuesJeremy Morse2019-08-231-3/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | LiveDebugValues gives variable locations to blocks, but it should also take away. There are various circumstances where a variable location is known until a loop backedge with a different location is detected. In those circumstances, where there's no agreement on the variable location, it should be undef / removed, otherwise we end up picking a location that's valid on some loop iterations but not others. However, LiveDebugValues doesn't currently do this, see the new testcase attached. Without this patch, the location of !3 is assumed to be %bar through the loop. Once it's added to the In-Locations list, it's never removed, even though the later dbg.value(0... of !3 makes the location un-knowable. This patch checks during block-location-joining to see whether any previously-present locations have been removed in a predecessor. If they have, the live-ins have changed, and the block needs reprocessing. Similarly, in transferTerminator, assign rather than |= the Out-Locations after processing a block, as we may have deleted some previously valid locations. This will mean that LiveDebugValues performs more propagation -- but that's necessary for it being correct. Differential Revision: https://reviews.llvm.org/D66599 llvm-svn: 369778
* [DAGCombine] GetNegatedExpression - add FMA\FMAD supportSimon Pilgrim2019-08-231-1/+52
| | | | | | | | If the accumulator and either of the multiply operands are negatable then we can we negate the entire expression. Differential Revision: https://reviews.llvm.org/D63141 llvm-svn: 369746
* IR. Change strip* family of functions to not look through aliases.Peter Collingbourne2019-08-222-3/+2
| | | | | | | | | | | | | | | | | | | | | | | | I noticed another instance of the issue where references to aliases were being replaced with aliasees, this time in InstCombine. In the instance that I saw it turned out to be only a QoI issue (a symbol ended up being missing from the symbol table due to the last reference to the alias being removed, preventing HWASAN from symbolizing a global reference), but it could easily have manifested as incorrect behaviour. Since this is the third such issue encountered (previously: D65118, D65314) it seems to be time to address this common error/QoI issue once and for all and make the strip* family of functions not look through aliases. Includes a test for the specific issue that I saw, but no doubt there are other similar bugs fixed here. As with D65118 this has been tested to make sure that the optimization isn't load bearing. I built Clang, Chromium for Linux, Android and Windows as well as the test-suite and there were no size regressions. Differential Revision: https://reviews.llvm.org/D66606 llvm-svn: 369697
* GlobalISel: Don't create G_UADDE with constant false carry inMatt Arsenault2019-08-221-5/+7
| | | | | | | | The x86 tests are now broken (in paticular add-scalar.ll now hits the DAG fallback) due to not handling G_UADDO. The DAG x86 backend has a custom lowering for this, so that will need to be implemented. llvm-svn: 369673
* [MachO][TLOF] Use hasLocalLinkage to determine if indirect symbol is localFrancis Visoiu Mistrih2019-08-222-8/+6
| | | | | | | | | | | | | | | | | | | | | Local symbols in the indirect symbol table contain the value `INDIRECT_SYMBOL_LOCAL` and the corresponding __pointers entry must contain the address of the target. In r349060, I added support for local symbols in the indirect symbol table, which was checking if the symbol `isDefined` && `!isExternal` to determine if the symbol is local or not. It turns out that `isDefined` will return false if the user of the symbol comes before its definition, and we'll again generate .long 0 which will be the symbol at the adress 0x0. Instead of doing that, use GlobalValue::hasLocalLinkage() to check if the symbol is local. Differential Revision: https://reviews.llvm.org/D66563 llvm-svn: 369671
* [MBP] Disable aggressive loop rotate in plain modeGuozhi Wei2019-08-221-36/+80
| | | | | | | | | | Patch https://reviews.llvm.org/D43256 introduced more aggressive loop layout optimization which depends on profile information. If profile information is not available, the statically estimated profile information(generated by BranchProbabilityInfo.cpp) is used. If user program doesn't behave as BranchProbabilityInfo.cpp expected, the layout may be worse. To be conservative this patch restores the original layout algorithm in plain mode. But user can still try the aggressive layout optimization with -force-precise-rotation-cost=true. Differential Revision: https://reviews.llvm.org/D65673 llvm-svn: 369664
* [DAGCombiner] Remove explicit call to AddToWorklist in sqrt and reciprocal ↵Amaury Sechet2019-08-221-32/+1
| | | | | | | | | | | | | | | | computations Summary: These nodes end up being processed regardless due to DAGCombiner ensuring arguments are processed. This changes the order in which nodes are processed, which fixes an issue on PowerPC. Reviewers: craig.topper, efriedma, RKSimon, lebedev.ri, mcberg2017, stefanp, hfinkel Subscribers: nemanjai, MaskRay, jsji, steven.zhang, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66548 llvm-svn: 369662
* [SlotIndexes] Add print-slotindexes to disable printing slotindexesJinsong Ji2019-08-221-2/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: When we print the IR with --print-after/before-*, SlotIndexes will be printed whenever available (We haven't freed it). This introduces some noises when we try to compare the IR among different optimizations. eg: -print-before=machine-cp will print SlotIndexes for 1st machine-cp pass, but NOT for 2nd machine-cp; -print-after=machine-cp will NOT print SlotIndexes for both machine-cp passes. So SlotIndexes in 1st pass introduce noises when differing these IRs. This patch introduces an option to hide indexes. Reviewers: stoklund, thegameg, qcolombet Reviewed By: thegameg Subscribers: hiraditya, arphaman, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66500 llvm-svn: 369650
* [TargetLowering] Remove optional arguments passing to makeLibCallShiva Chen2019-08-225-85/+157
| | | | | | | | | | The patch introduces MakeLibCallOptions struct as suggested by @efriedma on D65497. The struct contain argument flags which will pass to makeLibCall function. The patch should not has any functionality changes. Differential Revision: https://reviews.llvm.org/D65795 llvm-svn: 369622
* [COFF] Fix section name for constants larger than 64 bits on WindowsFangrui Song2019-08-221-1/+2
| | | | | | | | | | | | APIntToHexString returns wrong value ("0000000000000000ffffffffffffffff") for integer larger than 64 bits, and thus TargetLoweringObjectFileCOFF::getSectionForConstant returns same section name for all numbers larger than 64 bits. This patch tries to fix it. Differential Revision: https://reviews.llvm.org/D66458 Patch by Senran Zhang llvm-svn: 369610
* [MVT] Add v16f16 and v32f16 vectors.Craig Topper2019-08-211-0/+2
| | | | | | | | | I might look at improving PR43065 which will require being able to mark a 256 and 512 bit vector of f16 as Legal. Differential Revision: https://reviews.llvm.org/D66515 llvm-svn: 369565
* [DAGCombiner] Remove mostly redundant calls to AddToWorklistAmaury Sechet2019-08-211-2/+1
| | | | | | | | | | | | | | | | | Summary: These calls change the order in which some nodes are processed and so have an effect on codegen. The change in fixup-bw-copy.ll is due to (and (load anyext)) gets transformed into (load zext) while previously the and was removed by SimplifyDemandedBits, so the (load anyext) remained. Reviewers: craig.topper, efriedma, RKSimon, lebedev.ri Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66543 llvm-svn: 369561
* GlobalISel: Implement moreElementsVector for G_UNMERGE_VALUES sourcesMatt Arsenault2019-08-211-0/+20
| | | | | | | | This is necessary for handling <3 x s16> on AMDGPU, assuming this should be handled as 2 separate legalization actions. The alternative would be for fewerElementsVector to handle 3->2. llvm-svn: 369547
* Improving CodeView debug info type record's inline commentsNilanjana Basu2019-08-211-20/+20
| | | | llvm-svn: 369533
OpenPOWER on IntegriCloud