path: root/llvm/lib/CodeGen
Commit message | Author | Age | Files | Lines
...
* [DAGCombine] Simplify funnel shifts with undef/zero args to bitshiftsSimon Pilgrim2019-02-101-2/+41
| | | | | | | | Now that we have SimplifyDemandedBits support for funnel shifts (rL353539), we need to simplify funnel shifts back to bitshifts in cases where either argument has been folded to undef/zero. Differential Revision: https://reviews.llvm.org/D58009 llvm-svn: 353645
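A minimal sketch of why a funnel shift with a zero operand collapses to an ordinary shift. The helpers `fshl32` and `fshr32` are hypothetical reference models of the funnel-shift semantics on 32-bit values, not LLVM code, and the exact conditions under which DAGCombiner applies the fold are not reproduced here.

```cpp
#include <cassert>
#include <cstdint>

// Reference model of a 32-bit funnel shift left: concatenate hi:lo,
// shift left by (amt % 32), and keep the high 32 bits.
constexpr uint32_t fshl32(uint32_t hi, uint32_t lo, uint32_t amt) {
  return (amt % 32) == 0 ? hi
                         : (hi << (amt % 32)) | (lo >> (32 - (amt % 32)));
}

// Reference model of a 32-bit funnel shift right: same concatenation,
// shift right by (amt % 32), and keep the low 32 bits.
constexpr uint32_t fshr32(uint32_t hi, uint32_t lo, uint32_t amt) {
  return (amt % 32) == 0 ? lo
                         : (hi << (32 - (amt % 32))) | (lo >> (amt % 32));
}

// With the "other" operand equal to zero, only one plain shift remains.
static_assert(fshl32(0x12345678u, 0u, 8) == (0x12345678u << 8), "fshl -> shl");
static_assert(fshr32(0u, 0x12345678u, 8) == (0x12345678u >> 8), "fshr -> srl");
```

The zero operand contributes no bits to the result, so the OR in the model has nothing to merge and the remaining term is a single `shl` or `srl`.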
* [TargetLowering] refactor setcc folds to fix another miscompile (PR40657)Sanjay Patel2019-02-101-55/+55
| | | | | | | | | | SimplifySetCC still has much room for improvement, but this should fix the remaining problem examples from: https://bugs.llvm.org/show_bug.cgi?id=40657 The initial fix for this problem was rL353615. llvm-svn: 353639
* [TargetLowering] add tests to show effect of setcc sub->shift; NFCSanjay Patel2019-02-091-1/+0
| | | | | | | | | There's effectively no difference for the cases with variables. We just trade a sub for an add on those. But the case with a subtract from constant would require an extra move instruction on x86, so this looks like a reasonable generic combine. llvm-svn: 353619
* [TargetLowering] avoid miscompile in setcc transform (PR40657)Sanjay Patel2019-02-091-1/+3
| | | | llvm-svn: 353615
* Revert "[SelectionDAG] Extract [US]MULO expansion into TL method; NFC"Nikita Popov2019-02-092-112/+141
| | | | | | | | This reverts commit r353611. Triggers an assertion during the libcall expansion on ARM. llvm-svn: 353612
* [SelectionDAG] Extract [US]MULO expansion into TL method; NFCNikita Popov2019-02-092-141/+112
| | | | | | | | | In preparation for supporting vector expansion. Also drop a variant of ExpandLibCall, of which the MULO expansions were the only user. llvm-svn: 353611
* Re-apply r353553 "[GISel][NFC]: Add missing call to record CSE hits in the ↵Francis Visoiu Mistrih2019-02-082-9/+10
| | | | | | | | CSEMIRBuilder" With a fix after r353563 that adds some more opcodes. llvm-svn: 353579
* Revert r353553 "[GISel][NFC]: Add missing call to record CSE hits in the ↵Francis Visoiu Mistrih2019-02-082-10/+9
| | | | | | | | | | | | CSEMIRBuilder" This reverts commit r353553. This breaks CodeGen/AArch64/GlobalISel/legalize-ext-csedebug-output.mir: http://green.lab.llvm.org/green/job/clang-stage1-cmake-RA-incremental/57963/console llvm-svn: 353575
* Implementation of asm-goto support in LLVMCraig Topper2019-02-0815-19/+99
| | | | | | | | | | | | | | | | | | | | | | | | | This patch accompanies the RFC posted here: http://lists.llvm.org/pipermail/llvm-dev/2018-October/127239.html This patch adds a new CallBr IR instruction to support asm-goto inline assembly like gcc as used by the linux kernel. This instruction is both a call instruction and a terminator instruction with multiple successors. Only inline assembly usage is supported today. This also adds a new INLINEASM_BR opcode to SelectionDAG and MachineIR to represent an INLINEASM block that is also considered a terminator instruction. There will likely be more bug fixes and optimizations to follow this, but we felt it had reached a point where we would like to switch to an incremental development model. Patch by Craig Topper, Alexander Ivchenko, Mikhail Dvoretckii Differential Revision: https://reviews.llvm.org/D53765 llvm-svn: 353563
* [DAGCombine] Optimize pow(X, 0.75) to sqrt(X) * sqrt(sqrt(X))Nemanja Ivanovic2019-02-081-4/+14
| | | | | | | | | | | | The sqrt case is faster and we already do this for the case where the exponent is 0.25. This adds the 0.75 case which is also not sensitive to signed zeros. Patch by Whitney Tsang (Whitney) Differential revision: https://reviews.llvm.org/D57434 llvm-svn: 353557
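The identity behind this fold can be checked numerically. The sketch below is an illustration only: `pow075_via_sqrt` and `close_enough` are hypothetical helper names, and the flags under which LLVM actually performs the rewrite are not shown.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper: pow(x, 0.75) rewritten as sqrt(x) * sqrt(sqrt(x)).
inline double pow075_via_sqrt(double x) {
  double s = std::sqrt(x);   // x^(1/2)
  return s * std::sqrt(s);   // x^(1/2) * x^(1/4) = x^(3/4)
}

// Relative comparison, since the two forms may round differently.
inline bool close_enough(double a, double b) {
  return std::fabs(a - b) <= 1e-12 * std::fmax(1.0, std::fabs(b));
}
```

On an IEEE-754 platform, `pow075_via_sqrt(-0.0)` multiplies two negative zeros and yields `+0.0`, the same sign `pow` gives for a positive non-integer exponent, which is consistent with the message's remark about signed zeros.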
* [GISel][NFC]: Add missing call to record CSE hits in the CSEMIRBuilderAditya Nandakumar2019-02-082-9/+10
| | | | | | | | | | https://reviews.llvm.org/D57932 Add some logging + tests to make sure CSEInfo prints debug output. reviewed by: arsenm llvm-svn: 353553
* [TargetLowering] Use ISD::FSHR in expandFixedPointMulSimon Pilgrim2019-02-081-5/+2
| | | | | | Replace OR(SHL,SRL) pattern with ISD::FSHR (legalization expands this later if necessary) - this helps with the scale == 0 'undefined' drop-through case that was discussed on D55720. llvm-svn: 353546
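A sketch of the scale == 0 problem, assuming an unsigned 32-bit fixed-point multiply with `scale` in [0, 31]. `umul_fix32` and `fshr32` are hypothetical models written for this note, not the code in `expandFixedPointMul`.

```cpp
#include <cassert>
#include <cstdint>

// Reference model of a 32-bit funnel shift right (amount taken modulo 32).
constexpr uint32_t fshr32(uint32_t hi, uint32_t lo, uint32_t amt) {
  return (amt % 32) == 0 ? lo
                         : (hi << (32 - (amt % 32))) | (lo >> (amt % 32));
}

// Hypothetical fixed-point multiply: form the 64-bit product, then extract
// the 32 bits starting at bit `scale`. Assumes 0 <= scale <= 31.
inline uint32_t umul_fix32(uint32_t a, uint32_t b, unsigned scale) {
  uint64_t wide = static_cast<uint64_t>(a) * b;
  uint32_t lo = static_cast<uint32_t>(wide);
  uint32_t hi = static_cast<uint32_t>(wide >> 32);
  // The open-coded form (lo >> scale) | (hi << (32 - scale)) would shift
  // `hi` by the full bit width when scale == 0; the funnel shift does not.
  return fshr32(hi, lo, scale);
}
```

With `scale == 0` the funnel shift simply returns the low half of the product, so no special case is needed at the call site.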
* [TargetLowering] Add SimplifyDemandedBits funnel shift support Simon Pilgrim2019-02-082-0/+43
| | | | llvm-svn: 353539
* Revert r353416 "[DAG] Cleanup unused nodes on failed store-to-load forward ↵Nirav Dave2019-02-081-21/+9
| | | | | | | | combine." This cleanup causes out-of-tree crashes. llvm-svn: 353527
* [MIPS GlobalISel] Select any extending load and truncating storePetar Avramovic2019-02-081-7/+0
| | | | | | | | | | | | | | | | | | Make behavior of G_LOAD in widenScalar same as for G_ZEXTLOAD and G_SEXTLOAD. That is perform widenScalarDst to size given by the target and avoid additional checks in common code. Targets can reorder or add additional rules in LegalizeRuleSet for the opcode to achieve desired behavior. Select extending load that does not have specified type of extension into zero extending load. Select truncating store that stores number of bytes indicated by size in MachineMemoperand. Differential Revision: https://reviews.llvm.org/D57454 llvm-svn: 353520
* AMDGPU/GlobalISel: Legalize addrspacecastMatt Arsenault2019-02-081-0/+1
| | | | | | | Use a placeholder constant for now on targets that need the load from the queue ptr. llvm-svn: 353497
* [CodeGen] Handle vector UADDO, SADDO, USUBO, SSUBONikita Popov2019-02-074-2/+167
| | | | | | | | | | | | | | | This is part of https://bugs.llvm.org/show_bug.cgi?id=40442. Vector legalization is implemented for the add/sub overflow opcodes. UMULO/SMULO are also handled as far as legalization is concerned, but they don't support vector expansion yet (so no tests for them). The vector result widening implementation is suboptimal, because it could result in a legalization loop. Differential Revision: https://reviews.llvm.org/D57639 llvm-svn: 353464
* GlobalISel: Try to fix bot failuresMatt Arsenault2019-02-071-5/+5
| | | | | | Don't rely on order of evaluation of function arguments. llvm-svn: 353460
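The hazard named here is general C++: the order in which function arguments are evaluated is unspecified, so two side-effecting calls in one argument list can run in either order on different compilers. The sketch below uses a hypothetical `RegAllocator` stand-in, not the GlobalISel code touched by the commit.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for a builder that hands out increasing numbers.
struct RegAllocator {
  int next = 0;
  int create() { return next++; }
};

inline std::string describe(int first, int second) {
  return std::to_string(first) + "," + std::to_string(second);
}

// Fragile: describe(regs.create(), regs.create()) may produce "0,1" or
// "1,0", because the two create() calls are not ordered relative to each
// other. Robust: sequence the calls with named locals first.
inline std::string describe_in_order(RegAllocator &regs) {
  int first = regs.create();   // guaranteed to run first
  int second = regs.create();  // guaranteed to run second
  return describe(first, second);
}
```

Hoisting the calls into separate statements makes the output identical across compilers, which is the kind of change that stabilizes bots with differing toolchains.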
* [DAGCombiner] (add (umax X, C), -C) --> (usubsat X, C) (PR40111)Simon Pilgrim2019-02-071-0/+12
| | | | | | | | | | Move the (add (umax X, C), -C) --> (usubsat X, C) X86 combine into generic DAGCombiner First of a number of saturated arithmetic folds that can be moved out of X86-specific code for PR40111. Differential Revision: https://reviews.llvm.org/D57754 llvm-svn: 353457
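The equivalence can be confirmed exhaustively at a small bit width. This is a sketch on 8-bit values with hypothetical helper names; it models the arithmetic only, not the DAGCombiner pattern matching.

```cpp
#include <cassert>
#include <cstdint>

// Unsigned saturating subtract: clamps at zero instead of wrapping.
constexpr uint8_t usubsat8(uint8_t x, uint8_t c) {
  return x > c ? static_cast<uint8_t>(x - c) : static_cast<uint8_t>(0);
}

// (add (umax X, C), -C) evaluated with wrapping 8-bit arithmetic.
constexpr uint8_t add_umax_negc(uint8_t x, uint8_t c) {
  return static_cast<uint8_t>((x > c ? x : c) + static_cast<uint8_t>(-c));
}

// Check every (x, c) pair of 8-bit values.
inline bool fold_holds_for_all_u8() {
  for (unsigned x = 0; x < 256; ++x)
    for (unsigned c = 0; c < 256; ++c)
      if (add_umax_negc(static_cast<uint8_t>(x), static_cast<uint8_t>(c)) !=
          usubsat8(static_cast<uint8_t>(x), static_cast<uint8_t>(c)))
        return false;
  return true;
}
```

The `umax` guarantees the left operand is at least `C`, so adding `-C` never wraps below zero: the result is `X - C` when `X > C` and `0` otherwise, which is the definition of `usubsat`.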
* GlobalISel: Implement narrowScalar for shift main typeMatt Arsenault2019-02-072-12/+235
| | | | | | | | | | | | | | | This is pretty much directly ported from SelectionDAG. Doesn't include the shift by non-constant but known bits version, since there isn't a GlobalISel version of computeKnownBits yet. This shows a disadvantage of targets not specifying which type should be used for the shift amount. If type 0 is legalized before type 1, the operations on the shift amount type use the wider type (which are also less likely to legalize). This can be avoided by targets specifying legalization actions on type 1 earlier than for type 0. llvm-svn: 353455
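To illustrate what narrowing a shift means, here is a sketch of a 64-bit left shift by a constant amount carried out on two 32-bit halves. `Halves`, `shl64_narrow`, and `to_u64` are hypothetical names for this note; the actual legalizer builds generic machine instructions rather than C++ arithmetic.

```cpp
#include <cassert>
#include <cstdint>

// A 64-bit value held as two 32-bit halves.
struct Halves {
  uint32_t lo;
  uint32_t hi;
};

inline uint64_t to_u64(Halves v) {
  return (static_cast<uint64_t>(v.hi) << 32) | v.lo;
}

// Shift left by a constant amount in [0, 63] using only 32-bit operations.
inline Halves shl64_narrow(Halves v, unsigned amt) {
  if (amt == 0)
    return v;  // nothing moves
  if (amt < 32)  // bits cross from lo into hi
    return Halves{v.lo << amt, (v.hi << amt) | (v.lo >> (32 - amt))};
  // amt >= 32: lo becomes zero and hi is taken entirely from the old lo.
  return Halves{0u, v.lo << (amt - 32)};
}
```

The three cases exist because a 32-bit shift by 32 or more is not meaningful, so amounts of zero, less than the half width, and at least the half width each need their own expansion.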
* Revert "[DAG] Cleanup of unused node in SimplifySelectCC."Nirav Dave2019-02-071-15/+8
| | | | | | Causes ASAN use-after-poison errors. llvm-svn: 353442
* [DAGCombiner] fold add/sub with bool operand based on target's boolean contentsSanjay Patel2019-02-071-3/+16
| | | | | | | | | | | | | | | | | | | I noticed that we are missing this canonicalization in IR: rL352515 ...and then realized that we don't get this right in SDAG either, so this has to be fixed first regardless of what we choose to do in IR. The existing fold was limited to scalars and using the wrong predicate to guard the transform. We have a boolean contents TLI query that can be used to decide which direction to fold. This may eventually lead back to the problems/question in: https://bugs.llvm.org/show_bug.cgi?id=40486 ...but it makes no difference to that yet. Differential Revision: https://reviews.llvm.org/D57401 llvm-svn: 353433
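The two directions of this fold rest on a simple identity: adding a zero-extended bool equals subtracting a sign-extended bool. The sketch below models that on 32-bit values with hypothetical helper names; which form is preferred on a given target depends on its boolean contents, as the message says, and that selection logic is not modeled here.

```cpp
#include <cassert>
#include <cstdint>

// A bool widened to 32 bits as 0 or 1 (zero extension).
constexpr uint32_t zext_bool(bool b) { return b ? 1u : 0u; }

// A bool widened to 32 bits as 0 or all-ones, i.e. -1 (sign extension).
constexpr uint32_t sext_bool(bool b) { return b ? 0xFFFFFFFFu : 0u; }

constexpr uint32_t add_zext(uint32_t y, bool b) { return y + zext_bool(b); }
constexpr uint32_t sub_sext(uint32_t y, bool b) { return y - sext_bool(b); }

// y + 1 == y - (-1) under wrapping arithmetic, and both forms are y
// when the bool is false.
static_assert(add_zext(41u, true) == sub_sext(41u, true), "true case");
static_assert(add_zext(41u, false) == sub_sext(41u, false), "false case");
```

The symmetric identity `y - zext(b) == y + sext(b)` holds for the same reason.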
* GlobalISel: Implement fewerElementsVector for shiftsMatt Arsenault2019-02-071-30/+130
| | | | | | | | | Introduce a new function which handles instructions with multiple type indices, but have the same number of vector elements. Also legalize v2s16 shifts when applicable. llvm-svn: 353432
* GlobalISel: Try to make legalize rules more useful for vectorsMatt Arsenault2019-02-072-10/+49
| | | | | | | Mostly keep the existing functions on scalars, but add versions which also operate based on the vector element size. llvm-svn: 353430
* [DAG] Cleanup of unused node in SimplifySelectCC.Nirav Dave2019-02-071-8/+15
| | | | llvm-svn: 353428
* [DAG] Cleanup unused node on failed SELECT Combine.Nirav Dave2019-02-071-0/+6
| | | | llvm-svn: 353426
* [DAG] Cleanup unused nodes on failed store-to-load forward combine.Nirav Dave2019-02-071-9/+21
| | | | llvm-svn: 353416
* [BranchFolding] Remove dead code for handling EHPad blocksCraig Topper2019-02-071-23/+0
| | | | | | | | | | | | | | Summary: This code tries to handle the case where IBB is an EHPad, but there's an earlier check that uses PBB->hasEHPadSuccessor(), where PBB is a predecessor of IBB. The hasEHPadSuccessor function would have visited IBB and seen that it was an EHPad and returned false. This would prevent us from reaching this code with IBB as an EHPad. Looks like this code was originally added in rL37427 (ancient) and made dead in rL143001. Reviewers: rnk, void, efriedma Reviewed By: rnk Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D57358 llvm-svn: 353375
* Remove reference to non-existent function. NFC.Sam Clegg2019-02-071-2/+1
| | | | | | | | This comment is old. The code in question was removed in rL203174 Differential Revision: https://reviews.llvm.org/D57856 llvm-svn: 353352
* [DAG] Immediately cleanup unused nodes from extend-based combines.Nirav Dave2019-02-061-2/+7
| | | | llvm-svn: 353338
* Move IR flag handling directly into builder calls for cases translated from ↵Michael Berg2019-02-062-43/+48
| | | | | | | | | | | | | | Instructions in GlobalIsel Reviewers: aditya_nandakumar, volkan Reviewed By: aditya_nandakumar Subscribers: rovka, kristof.beyls, volkan, Petar.Avramovic Differential Revision: https://reviews.llvm.org/D57630 llvm-svn: 353336
* [SelectionDAG] Cleanup some code comments. NFCBjorn Pettersson2019-02-061-4/+4
| | | | | | | | | | Don't repeat the function name in some doxygen comments. (Just a minor cleanup, while testing to push from the git monorepo setup.) llvm-svn: 353317
* [GlobalISel][NFC] Gardening: Factor out code for simple unary intrinsicsJessica Paquette2019-02-061-78/+58
| | | | | | | | | | | | | There was a lot of repeated code wrt unary math intrinsics in translateKnownIntrinsic. This factors out the repeated MIRBuilder code into two functions: translateSimpleUnaryIntrinsic and getSimpleUnaryIntrinsicOpcode. This simplifies adding simple unary intrinsics, since after this, all you have to do is add the mapping to SimpleUnaryIntrinsicOpcodes. Differential Revision: https://reviews.llvm.org/D57774 llvm-svn: 353316
* [InlineAsm][X86] Add backend support for X86 flag output parameters.Nirav Dave2019-02-062-11/+19
| | | | | | | Allow custom handling of inline assembly output parameters and add X86 flag parameter support. llvm-svn: 353307
* [SelectionDAGBuilder] Refactor Inline Asm output check. NFCI.Nirav Dave2019-02-061-13/+26
| | | | llvm-svn: 353305
* [DAGCombine][NFC] GatherAllAliases should take a LSBaseSDNode.Clement Courbet2019-02-061-8/+8
| | | | | | | GatherAllAliases only makes sense for LSBaseSDNode. Enforce it with static typing instead of runtime cast. llvm-svn: 353291
* GlobalISel: Verify G_GEPMatt Arsenault2019-02-051-0/+16
| | | | llvm-svn: 353209
* [DEBUG_INFO][NVPTX] Generate DW_AT_address_class to get the values in debugger.Alexey Bataev2019-02-051-2/+54
| | | | | | | | | | | | | | | | | | | | | | | | | | Summary: According to https://docs.nvidia.com/cuda/archive/10.0/ptx-writers-guide-to-interoperability/index.html#cuda-specific-dwarf, the compiler should emit the DW_AT_address_class attribute for all variables and parameters. This means that the DW_AT_address_class attribute should be used in a non-standard way to support compatibility with the cuda-gdb debugger. Clang is able to generate the information about the variable address class. This information is emitted as the expression sequence `DW_OP_constu <DWARF Address Space> DW_OP_swap DW_OP_xderef`. The patch tries to find all such expressions and transform them into `DW_AT_address_class <DWARF Address Space>` if the target is NVPTX and the debugger is gdb. If the expression is not found, then default values are used. For the local variables <DWARF Address Space> is set to ADDR_local_space(6), for the globals <DWARF Address Space> is set to ADDR_global_space(5). The values are taken from the table in the same section 5.2. CUDA-Specific DWARF Definitions. Reviewers: echristo, probinson Subscribers: jholewinski, aprantl, llvm-commits Differential Revision: https://reviews.llvm.org/D57157 llvm-svn: 353203
* [CGP] Add support for sinking operands to their users, if they are free.Florian Hahn2019-02-051-0/+48
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch improves code generation for some AArch64 ACLE intrinsics. It adds support to CGP to duplicate and sink operands to their user, if they can be folded into a target instruction, like zexts and sub into usubl. It adds a TargetLowering hook shouldSinkOperands, which looks at the operands of instructions to see if sinking is profitable. I decided to add a new target hook, as for the sinking to be profitable, at least on AArch64, we have to look at multiple operands of an instruction, instead of looking at the users of a zext for example. The sinking is done in CGP, because it works around an instruction selection limitation. If instruction selection is not limited to a single basic block, this patch should not be needed any longer. Alternatively this could be done in the LoopSink pass, which tries to undo LICM for instructions in blocks that are not executed frequently. Note that we do not force the operands to sink to have a single user, because we duplicate them before sinking. Therefore this is only desirable if they really can be done for free. Additionally we could consider the impact on live ranges later on. This should fix https://bugs.llvm.org/show_bug.cgi?id=40025. As for performance, we have internal code that uses intrinsics and can be sped up by 10% by this change. Reviewers: SjoerdMeijer, t.p.northover, samparker, efriedma, RKSimon, spatel Reviewed By: samparker Differential Revision: https://reviews.llvm.org/D57377 llvm-svn: 353152
* [DAG] BaseIndexOffset: FrameIndexSDNodes with the same FrameIndex compare equal.Clement Courbet2019-02-051-5/+9
| | | | | | | | | | | | Reviewers: niravd Subscribers: arphaman, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D57692 llvm-svn: 353143
* GlobalISel: Fix verifier crashing on non-register operandsMatt Arsenault2019-02-051-1/+6
| | | | | | Also correct the wording of error on subregisters. llvm-svn: 353128
* GlobalISel: Consolidate load/store legalizationMatt Arsenault2019-02-051-103/+14
| | | | | | | | | | The fewerElementsVectors implementation for load/stores handles the scalar reduction case just as well, so drop the redundant code in narrowScalar. This also introduces support for narrowing irregular size breakdowns for scalars. llvm-svn: 353125
* [DAGCombiner] Discard pointer info when combining extract_vector_elt of a ↵Craig Topper2019-02-051-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | | vector load when the index isn't constant Summary: If the index isn't constant, this transform inserts a multiply and an add on the index to calculate the base pointer for a scalar load. But we still create a memory operand with an offset of 0 and the size of the scalar access. But the access is really to an unknown offset within the original access size. This can cause the machine scheduler to incorrectly calculate dependencies between this load and other accesses. In the case we saw, there was a 32 byte vector store that was split into two 16 byte stores, one with offset 0 and one with offset 16. The size of the memory operand for both was 16. The scheduler correctly detected the alias with the offset 0 store, but not the offset 16 store. This patch discards the pointer info so we don't incorrectly detect aliasing. I wasn't sure if we could keep using the original offset and size without risking some other transform on the load changing the size. I tried to reduce a test case, but there's still a lot of memory operations needed to get the scheduler to do the bad reordering. So it looked pretty fragile to maintain. Reviewers: efriedma Reviewed By: efriedma Subscribers: arphaman, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D57616 llvm-svn: 353124
* GlobalISel: Implement narrowScalar for selectMatt Arsenault2019-02-051-0/+53
| | | | | | | | | | Don't handle vector conditions. I think this can be merged in the future with fewerElementsVectorSelect, although this becomes slightly tricky with a vector condition. llvm-svn: 353122
* GlobalISel: Combine g_extract with g_merge_valuesMatt Arsenault2019-02-041-0/+1
| | | | | | | | | | | | | | Try to use the underlying source registers. This enables legalization in more cases where some irregular operations are widened and others narrowed. This seems to make the test_combines_2 AArch64 test worse, since the MERGE_VALUES has multiple uses. Since this should be required for legalization, a hasOneUse check is probably inappropriate (or maybe should only be used if the merge is legal?). llvm-svn: 353121
* GlobalISel: Enforce operand types for constantsMatt Arsenault2019-02-041-0/+23
| | | | | | | | A number of tests were using imm operands, not cimm. Since CSE relies on the exact ConstantInt* pointer used, and implicit conversions are generally evil, also enforce the bitsize of the types. llvm-svn: 353113
* GlobalISel: Verify g_selectMatt Arsenault2019-02-041-24/+40
| | | | | | | Factor the common vector element consistency check many instructions need out, although this makes the error messages worse. llvm-svn: 353112
* MachineVerifier: Move verification of G_* instructions to functionMatt Arsenault2019-02-041-100/+117
| | | | llvm-svn: 353111
* [WebAssembly] MC: Mark more function aliases as functionsSam Clegg2019-02-041-1/+11
| | | | | | | | | | | Aliases of functions are now marked as function symbols even if they are bitcast to some other non-function type. This is important for WebAssembly where object and function symbols can't alias each other. Fixes PR38866 Differential Revision: https://reviews.llvm.org/D57538 llvm-svn: 353109
* MIR: Validate LLT types when parsingMatt Arsenault2019-02-041-6/+35
| | | | llvm-svn: 353107