path: root/llvm/lib/CodeGen
Commit message (Author, Age, Files, Lines)
* [DAG] Cleanup unused nodes on failed store-to-load forward combine. (Nirav Dave, 2019-02-07, 1 file, -9/+21)
| | | | llvm-svn: 353416
* [BranchFolding] Remove dead code for handling EHPad blocks (Craig Topper, 2019-02-07, 1 file, -23/+0)
| | | | | | | | | | | | | | | | Summary: This code tries to handle the case where IBB is an EHPad, but there's an earlier check that uses PBB->hasEHPadSuccessor(), where PBB is a predecessor of IBB. The hasEHPadSuccessor function would have visited IBB and seen that it was an EHPad and returned false. This would prevent us from reaching this code with IBB as an EHPad. Looks like this code was originally added in rL37427 (ancient) and made dead in rL143001. Reviewers: rnk, void, efriedma Reviewed By: rnk Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D57358 llvm-svn: 353375
* Remove reference to non-existent function. NFC. (Sam Clegg, 2019-02-07, 1 file, -2/+1)
| | | | | | | | This comment is old. The code in question was removed in rL203174 Differential Revision: https://reviews.llvm.org/D57856 llvm-svn: 353352
* [DAG] Immediately cleanup unused nodes from extend-based combines. (Nirav Dave, 2019-02-06, 1 file, -2/+7)
| | | | llvm-svn: 353338
* Move IR flag handling directly into builder calls for cases translated from Instructions in GlobalIsel (Michael Berg, 2019-02-06, 2 files, -43/+48)
| | | | | | | | | | | | | | Reviewers: aditya_nandakumar, volkan Reviewed By: aditya_nandakumar Subscribers: rovka, kristof.beyls, volkan, Petar.Avramovic Differential Revision: https://reviews.llvm.org/D57630 llvm-svn: 353336
* [SelectionDAG] Cleanup some code comments. NFC (Bjorn Pettersson, 2019-02-06, 1 file, -4/+4)
| | | | | | | | | | Don't repeat the function name in some doxygen comments. (Just a minor cleanup, while testing to push from the git monorepo setup.) llvm-svn: 353317
* [GlobalISel][NFC] Gardening: Factor out code for simple unary intrinsics (Jessica Paquette, 2019-02-06, 1 file, -78/+58)
| | | | | | | | | | | | | There was a lot of repeated code wrt unary math intrinsics in translateKnownIntrinsic. This factors out the repeated MIRBuilder code into two functions: translateSimpleUnaryIntrinsic and getSimpleUnaryIntrinsicOpcode. This simplifies adding simple unary intrinsics, since after this, all you have to do is add the mapping to SimpleUnaryIntrinsicOpcodes. Differential Revision: https://reviews.llvm.org/D57774 llvm-svn: 353316
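The table-driven shape described in this commit can be sketched as follows. This is only an illustration under stated assumptions: the enumerators and the `IntrinsicOpcodeEntry` struct are hypothetical stand-ins, not the real LLVM types, and the actual IRTranslator code may be organised differently. Only the names `SimpleUnaryIntrinsicOpcodes` and `getSimpleUnaryIntrinsicOpcode` come from the commit message.

```cpp
// Hypothetical stand-ins for illustration; not the real LLVM enumerators.
enum class IntrinsicID { Ceil, Cos, Exp, Fabs, Memcpy };
enum class GenericOpcode { None, G_FCEIL, G_FCOS, G_FEXP, G_FABS };

// Hypothetical table row type.
struct IntrinsicOpcodeEntry {
  IntrinsicID ID;
  GenericOpcode Opcode;
};

// One row per simple unary intrinsic. Supporting a new one means adding a row.
constexpr IntrinsicOpcodeEntry SimpleUnaryIntrinsicOpcodes[] = {
    {IntrinsicID::Ceil, GenericOpcode::G_FCEIL},
    {IntrinsicID::Cos, GenericOpcode::G_FCOS},
    {IntrinsicID::Exp, GenericOpcode::G_FEXP},
    {IntrinsicID::Fabs, GenericOpcode::G_FABS},
};

// Returns the mapped opcode, or GenericOpcode::None when ID is not in the table.
constexpr GenericOpcode getSimpleUnaryIntrinsicOpcode(IntrinsicID ID) {
  for (const IntrinsicOpcodeEntry &Entry : SimpleUnaryIntrinsicOpcodes)
    if (Entry.ID == ID)
      return Entry.Opcode;
  return GenericOpcode::None;
}

static_assert(getSimpleUnaryIntrinsicOpcode(IntrinsicID::Cos) ==
                  GenericOpcode::G_FCOS,
              "table lookup finds a listed intrinsic");
```

A caller would check for `GenericOpcode::None` first and fall back to the per-intrinsic handling only for intrinsics that are not in the table.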
* [InlineAsm][X86] Add backend support for X86 flag output parameters. (Nirav Dave, 2019-02-06, 2 files, -11/+19)
| | | | | | | Allow custom handling of inline assembly output parameters and add X86 flag parameter support. llvm-svn: 353307
* [SelectionDAGBuilder] Refactor Inline Asm output check. NFCI. (Nirav Dave, 2019-02-06, 1 file, -13/+26)
| | | | llvm-svn: 353305
* [DAGCombine][NFC] GatherAllAliases should take a LSBaseSDNode. (Clement Courbet, 2019-02-06, 1 file, -8/+8)
| | | | | | | GatherAllAliases only makes sense for LSBaseSDNode. Enforce it with static typing instead of runtime cast. llvm-svn: 353291
* GlobalISel: Verify G_GEP (Matt Arsenault, 2019-02-05, 1 file, -0/+16)
| | | | llvm-svn: 353209
* [DEBUG_INFO][NVPTX] Generate DW_AT_address_class to get the values in debugger. (Alexey Bataev, 2019-02-05, 1 file, -2/+54)
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: According to https://docs.nvidia.com/cuda/archive/10.0/ptx-writers-guide-to-interoperability/index.html#cuda-specific-dwarf, the compiler should emit the DW_AT_address_class attribute for all variables and parameters. This means that the DW_AT_address_class attribute should be used in a non-standard way to support compatibility with the cuda-gdb debugger. Clang is able to generate the information about the variable address class. This information is emitted as the expression sequence `DW_OP_constu <DWARF Address Space> DW_OP_swap DW_OP_xderef`. The patch tries to find all such expressions and transform them into `DW_AT_address_class <DWARF Address Space>` if the target is NVPTX and the debugger is gdb. If the expression is not found, then default values are used. For the local variables <DWARF Address Space> is set to ADDR_local_space(6), for the globals <DWARF Address Space> is set to ADDR_global_space(5). The values are taken from the table in the same section 5.2. CUDA-Specific DWARF Definitions. Reviewers: echristo, probinson Subscribers: jholewinski, aprantl, llvm-commits Differential Revision: https://reviews.llvm.org/D57157 llvm-svn: 353203
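The expression rewrite described above can be sketched roughly as below. The `ExprOp` struct and the `takeAddressClass` helper are hypothetical and exist only for this illustration; the real patch works on LLVM's own debug-expression representation. The opcode values are the standard DWARF encodings, and the two defaults are the ones quoted in the commit message.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Standard DWARF opcode encodings.
constexpr std::uint8_t DW_OP_constu = 0x10;
constexpr std::uint8_t DW_OP_swap = 0x16;
constexpr std::uint8_t DW_OP_xderef = 0x18;

// Defaults quoted in the commit message.
constexpr unsigned ADDR_global_space = 5;
constexpr unsigned ADDR_local_space = 6;

// Hypothetical decoded operation: opcode plus one optional operand.
struct ExprOp {
  std::uint8_t Code;
  std::uint64_t Arg;
};

// Looks for `DW_OP_constu <AS> DW_OP_swap DW_OP_xderef`. If found, removes the
// three operations from Expr and returns <AS>; otherwise leaves Expr alone and
// returns the default address class for a local or global variable.
unsigned takeAddressClass(std::vector<ExprOp> &Expr, bool IsLocal) {
  for (std::size_t I = 0; I + 2 < Expr.size(); ++I) {
    if (Expr[I].Code == DW_OP_constu && Expr[I + 1].Code == DW_OP_swap &&
        Expr[I + 2].Code == DW_OP_xderef) {
      unsigned AddrClass = static_cast<unsigned>(Expr[I].Arg);
      Expr.erase(Expr.begin() + I, Expr.begin() + I + 3);
      return AddrClass;
    }
  }
  return IsLocal ? ADDR_local_space : ADDR_global_space;
}
```

The returned value is what would be emitted as `DW_AT_address_class`, while the remaining operations stay in the location expression.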
* [CGP] Add support for sinking operands to their users, if they are free. (Florian Hahn, 2019-02-05, 1 file, -0/+48)
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch improves code generation for some AArch64 ACLE intrinsics. It adds support to CGP to duplicate and sink operands to their user, if they can be folded into a target instruction, like zexts and sub into usubl. It adds a TargetLowering hook shouldSinkOperands, which looks at the operands of instructions to see if sinking is profitable. I decided to add a new target hook, as for the sinking to be profitable, at least on AArch64, we have to look at multiple operands of an instruction, instead of looking at the users of a zext for example. The sinking is done in CGP, because it works around an instruction selection limitation. If instruction selection is not limited to a single basic block, this patch should not be needed any longer. Alternatively this could be done in the LoopSink pass, which tries to undo LICM for instructions in blocks that are not executed frequently. Note that we do not force the operands to sink to have a single user, because we duplicate them before sinking. Therefore this is only desirable if they really can be done for free. Additionally we could consider the impact on live ranges later on. This should fix https://bugs.llvm.org/show_bug.cgi?id=40025. As for performance, we have internal code that uses intrinsics and can be sped up by 10% by this change. Reviewers: SjoerdMeijer, t.p.northover, samparker, efriedma, RKSimon, spatel Reviewed By: samparker Differential Revision: https://reviews.llvm.org/D57377 llvm-svn: 353152
* [DAG] BaseIndexOffset: FrameIndexSDNodes with the same FrameIndex compare equal. (Clement Courbet, 2019-02-05, 1 file, -5/+9)
| | | | | | | | | | | | Reviewers: niravd Subscribers: arphaman, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D57692 llvm-svn: 353143
* GlobalISel: Fix verifier crashing on non-register operands (Matt Arsenault, 2019-02-05, 1 file, -1/+6)
| | | | | | Also correct the wording of error on subregisters. llvm-svn: 353128
* GlobalISel: Consolidate load/store legalization (Matt Arsenault, 2019-02-05, 1 file, -103/+14)
| | | | | | | | | | The fewerElementsVectors implementation for load/stores handles the scalar reduction case just as well, so drop the redundant code in narrowScalar. This also introduces support for narrowing irregular size breakdowns for scalars. llvm-svn: 353125
* [DAGCombiner] Discard pointer info when combining extract_vector_elt of a vector load when the index isn't constant (Craig Topper, 2019-02-05, 1 file, -1/+3)
| | | | | | | | | | | | | | | | | | | | | | | | | Summary: If the index isn't constant, this transform inserts a multiply and an add on the index to calculate the base pointer for a scalar load. But we still create a memory operand with an offset of 0 and the size of the scalar access. But the access is really to an unknown offset within the original access size. This can cause the machine scheduler to incorrectly calculate dependencies between this load and other accesses. In the case we saw, there was a 32 byte vector store that was split into two 16 byte stores, one with offset 0 and one with offset 16. The size of the memory operand for both was 16. The scheduler correctly detected the alias with the offset 0 store, but not the offset 16 store. This patch discards the pointer info so we don't incorrectly detect aliasing. I wasn't sure if we could keep using the original offset and size without risking some other transform on the load changing the size. I tried to reduce a test case, but there's still a lot of memory operations needed to get the scheduler to do the bad reordering. So it looked pretty fragile to maintain. Reviewers: efriedma Reviewed By: efriedma Subscribers: arphaman, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D57616 llvm-svn: 353124
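The aliasing problem in this commit comes down to byte-range overlap, which can be sketched as below. This is a simplified model, not the scheduler's actual alias query: `MemRange` and `mayOverlap` are hypothetical names, and the 4-byte element size and element index 5 are assumptions picked for the example.

```cpp
#include <cstdint>

// Hypothetical byte range relative to a common base pointer.
struct MemRange {
  std::int64_t Offset;
  std::int64_t Size;
};

// Two half-open ranges [Offset, Offset + Size) overlap when each one starts
// before the other ends.
constexpr bool mayOverlap(MemRange A, MemRange B) {
  return A.Offset < B.Offset + B.Size && B.Offset < A.Offset + A.Size;
}

// The 32-byte vector store split into two 16-byte halves.
constexpr MemRange StoreLo{0, 16};
constexpr MemRange StoreHi{16, 16};

// What the memory operand claimed: offset 0, scalar size (assumed 4 bytes).
constexpr MemRange RecordedLoad{0, 4};
// What the load can really touch with a variable index, here element 5.
constexpr MemRange ActualLoad{5 * 4, 4};

static_assert(mayOverlap(RecordedLoad, StoreLo), "alias with the low half is seen");
static_assert(!mayOverlap(RecordedLoad, StoreHi), "alias with the high half is missed");
static_assert(mayOverlap(ActualLoad, StoreHi), "but the real access hits the high half");
```

With the pointer info discarded, there is no offset to compare, so a conservative analysis has to assume the load may alias either store.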
* GlobalISel: Implement narrowScalar for select (Matt Arsenault, 2019-02-05, 1 file, -0/+53)
| | | | | | | | | | Don't handle vector conditions. I think this can be merged in the future with fewerElementsVectorSelect, although this becomes slightly tricky with a vector condition. llvm-svn: 353122
* GlobalISel: Combine g_extract with g_merge_values (Matt Arsenault, 2019-02-04, 1 file, -0/+1)
| | | | | | | | | | | | | | Try to use the underlying source registers. This enables legalization in more cases where some irregular operations are widened and others narrowed. This seems to make the test_combines_2 AArch64 test worse, since the MERGE_VALUES has multiple uses. Since this should be required for legalization, a hasOneUse check is probably inappropriate (or maybe should only be used if the merge is legal?). llvm-svn: 353121
* GlobalISel: Enforce operand types for constants (Matt Arsenault, 2019-02-04, 1 file, -0/+23)
| | | | | | | | A number of tests were using imm operands, not cimm. Since CSE relies on the exact ConstantInt* pointer used, and implicit conversions are generally evil, also enforce the bitsize of the types. llvm-svn: 353113
* GlobalISel: Verify g_select (Matt Arsenault, 2019-02-04, 1 file, -24/+40)
| | | | | | | Factor out the common vector element consistency check that many instructions need, although this makes the error messages worse. llvm-svn: 353112
* MachineVerifier: Move verification of G_* instructions to function (Matt Arsenault, 2019-02-04, 1 file, -100/+117)
| | | | llvm-svn: 353111
* [WebAssembly] MC: Mark more function aliases as functions (Sam Clegg, 2019-02-04, 1 file, -1/+11)
| | | | | | | | | | | Aliases of functions are now marked as function symbols even if they are bitcast to some other non-function type. This is important for WebAssembly where object and function symbols can't alias each other. Fixes PR38866 Differential Revision: https://reviews.llvm.org/D57538 llvm-svn: 353109
* MIR: Validate LLT types when parsing (Matt Arsenault, 2019-02-04, 1 file, -6/+35)
| | | | llvm-svn: 353107
* GlobalISel: Fix not calling observer when legalizing bitcount ops (Matt Arsenault, 2019-02-04, 1 file, -11/+12)
| | | | | | This was hiding bugs from never legalizing the source type. llvm-svn: 353102
* [DEBUGINFO] Reposting r352642: Handle restore instructions in LiveDebugValues (Wolfgang Pieb, 2019-02-04, 3 files, -92/+216)
| | | | | | | | | | | | | | | The LiveDebugValues pass recognizes spills but not restores, which can cause large gaps in location information for some variables, depending on control flow. This patch makes LiveDebugValues recognize restores and generate appropriate DBG_VALUE instructions. This patch was posted previously with r352642 and reverted in r352666 due to buildbot errors. A missing return statement was the cause for the failures. Reviewers: aprantl, NicolaPrica Differential Revision: https://reviews.llvm.org/D57271 llvm-svn: 353089
* GlobalISel: Fix CSE handling of buildConstant (Matt Arsenault, 2019-02-04, 2 files, -40/+51)
| | | | | | | | | | | | | | This fixes two problems with CSE done in buildConstant. First, this would hit an assert when used with a vector result type. Solve this by allowing CSE on the vector elements, but not on the result vector for now. Second, this was also performing the CSE based on the input ConstantInt pointer. The underlying buildConstant could potentially convert the constant depending on the result type, giving a different ConstantInt*. Stop allowing the APInt and ConstantInt forms from automatically casting to the result type to avoid any similar problems in the future. llvm-svn: 353077
* GlobalISel: Fix moreElementsToNextPow2 (Matt Arsenault, 2019-02-04, 2 files, -8/+7)
| | | | | | | | | This was completely broken. The condition was inverted, and changed the element type for vectors of pointers. Fixes bug 40592. llvm-svn: 353069
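As a rough sketch of the intended behaviour, a "more elements to the next power of two" rule should apply only when the element count is not already a power of two, and it should change only the count, never the element type. The `VecTy` struct and helper names below are hypothetical; this is not LLVM's LLT or LegalizerInfo API.

```cpp
#include <cassert>

// Hypothetical vector type description; not LLVM's LLT class.
struct VecTy {
  unsigned NumElts;
  unsigned EltBits;
  bool EltIsPointer;
};

inline bool isPowerOf2(unsigned N) { return N != 0 && (N & (N - 1)) == 0; }

// Smallest power of two that is >= N.
inline unsigned ceilPowerOf2(unsigned N) {
  assert(N != 0 && N <= (1u << 31) && "result must fit in unsigned");
  unsigned P = 1;
  while (P < N)
    P <<= 1;
  return P;
}

// The rule applies only when the element count is NOT already a power of two.
inline bool needsMoreElements(const VecTy &Ty) { return !isPowerOf2(Ty.NumElts); }

// Rounds the element count up and keeps the element type, pointer or not.
inline VecTy widenToNextPow2(const VecTy &Ty) {
  VecTy Result = Ty;
  Result.NumElts = ceilPowerOf2(Ty.NumElts);
  return Result;
}
```

For example, a vector of 3 pointers would become a vector of 4 pointers, while a vector of 4 elements would be left alone.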
* Revert "[GlobalISel] Add IRTranslator support for G_FFLOOR" (Jessica Paquette, 2019-02-04, 1 file, -5/+0)
| | | | | | | | | This reverts commit 8bbd570fd5205a04d88d2e5513a6e4adbd028039. Apparently adding ffloor breaks AMDGPU somehow, so I need to back this out while I look into it. llvm-svn: 353064
* [Intrinsic] Unsigned Fixed Point Multiplication Intrinsic (Leonard Chan, 2019-02-04, 9 files, -37/+90)
| | | | | | | | | | | | | Add an intrinsic that takes 2 unsigned integers with the scale of them provided as the third argument and performs fixed point multiplication on them. This is a part of implementing fixed point arithmetic in clang where some of the more complex operations will be implemented as intrinsics. Differential Revision: https://reviews.llvm.org/D55625 llvm-svn: 353059
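The arithmetic behind an unsigned fixed-point multiply can be sketched as follows. This is a plain C++ model assuming 32-bit operands that share one scale, not the intrinsic's lowering; `umulFix` is a hypothetical name, and the sketch simply truncates a result that does not fit, which may differ from how the intrinsic treats overflow.

```cpp
#include <cassert>
#include <cstdint>

// Multiplies two unsigned fixed-point values that both have Scale fractional
// bits. The product is formed in twice the width, so it carries 2 * Scale
// fractional bits; shifting right by Scale restores the original format.
inline std::uint32_t umulFix(std::uint32_t A, std::uint32_t B, unsigned Scale) {
  assert(Scale <= 32 && "scale cannot exceed the operand width");
  std::uint64_t Wide = static_cast<std::uint64_t>(A) * B;
  return static_cast<std::uint32_t>(Wide >> Scale);
}
```

With 16 fractional bits, 1.5 is `0x18000` and 2.25 is `0x24000`; `umulFix(0x18000, 0x24000, 16)` gives `0x36000`, which is 3.375. A scale of 0 reduces to an ordinary integer multiply.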
* [GlobalISel] Add IRTranslator support for G_FFLOOR (Jessica Paquette, 2019-02-04, 1 file, -0/+5)
| | | | | | | | | | Follow-up to https://reviews.llvm.org/D57484 Adds G_FFLOOR to translateKnownIntrinsic and update arm64-irtranslator.ll. Differential Revision: https://reviews.llvm.org/D57485 llvm-svn: 353058
* [CGP] use IRBuilder to simplify code (Sanjay Patel, 2019-02-04, 1 file, -26/+25)
| | | | | | | | | | | | | | | | This is no-functional-change-intended although there could be intermediate variations caused by a difference in the debug info produced by setting that from the builder's insertion point. I'm updating the IR test file associated with this code just to show that the naming differences from using the builder are visible. The motivation for adding a helper function is that we are likely to extend this code to deal with other overflow ops. llvm-svn: 353056
* GlobalISel: Fix formatting of debug output (Matt Arsenault, 2019-02-04, 1 file, -3/+3)
| | | | | | | There was a missing space before the instruction name, and the newline is redundant since MI::print by default adds one. llvm-svn: 353046
* [DAGCombine] Add ADD(SUB,SUB) combines (Simon Pilgrim, 2019-02-04, 1 file, -0/+12)
| | | | | | | | Noticed while investigating PR40483, and fixes the basic test case from the bug - but not a more general case. We're pretty weak at dealing with ADD/SUB combines compared to the SimplifyAssociativeOrCommutative/SimplifyUsingDistributiveLaws abilities that InstCombine can manage. llvm-svn: 353044
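The commit message does not list the exact patterns, so the following is only an assumption about the kind of identity an ADD(SUB,SUB) combine can exploit: when a term appears once positively and once negatively, it cancels, and the identity also holds under wrap-around arithmetic. The function names are hypothetical.

```cpp
#include <cstdint>

// Unfolded and folded forms of two candidate identities, using 32-bit
// wrap-around (modular) arithmetic as integer ADD and SUB nodes do.

// (A - B) + (C - A)  ==>  C - B
constexpr std::uint32_t addOfSubs1(std::uint32_t A, std::uint32_t B, std::uint32_t C) {
  return (A - B) + (C - A);
}
constexpr std::uint32_t folded1(std::uint32_t B, std::uint32_t C) { return C - B; }

// (A - B) + (B - C)  ==>  A - C
constexpr std::uint32_t addOfSubs2(std::uint32_t A, std::uint32_t B, std::uint32_t C) {
  return (A - B) + (B - C);
}
constexpr std::uint32_t folded2(std::uint32_t A, std::uint32_t C) { return A - C; }

// The identities hold even when intermediate results wrap.
static_assert(addOfSubs1(1u, 0xFFFFFFFFu, 7u) == folded1(0xFFFFFFFFu, 7u), "");
static_assert(addOfSubs2(1u, 0xFFFFFFFFu, 7u) == folded2(1u, 7u), "");
```

Each fold replaces three arithmetic nodes with one, which is why such patterns are worth matching even when they look trivial.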
* [AsmPrinter] Remove hidden flag -print-schedule. (Andrea Di Biagio, 2019-02-04, 4 files, -124/+16)
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch removes hidden codegen flag -print-schedule effectively reverting the logic originally committed as r300311 (https://llvm.org/viewvc/llvm-project?view=revision&revision=300311). Flag -print-schedule was originally introduced by r300311 to address PR32216 (https://bugs.llvm.org/show_bug.cgi?id=32216). That bug was about adding "Better testing of schedule model instruction latencies/throughputs". These days, we can use llvm-mca to test scheduling models. So there is no longer a need for flag -print-schedule in LLVM. The main use case for PR32216 is now addressed by llvm-mca. Flag -print-schedule is mainly used for debugging purposes, and it is only actually used by x86 specific tests. We already have extensive (latency and throughput) tests under "test/tools/llvm-mca" for X86 processor models. That means, most (if not all) existing -print-schedule tests for X86 are redundant. When flag -print-schedule was first added to LLVM, several files had to be modified; a few APIs gained new arguments (see for example method MCAsmStreamer::EmitInstruction), and MCSubtargetInfo/TargetSubtargetInfo gained a couple of getSchedInfoStr() methods. Method getSchedInfoStr() had to originally work for both MCInst and MachineInstr. The original implementation of getSchedInfoStr() introduced a subtle layering violation (reported as PR37160 and then fixed/worked-around by r330615). In retrospect, that new API could have been designed more optimally. We can always query MCSchedModel to get the latency and throughput. More importantly, the "sched-info" string should not have been generated by the subtarget. Note, r317782 fixed an issue where "print-schedule" didn't work very well in the presence of inline assembly. That commit is also reverted by this change. Differential Revision: https://reviews.llvm.org/D57244 llvm-svn: 353043
* [SelectionDAG] Add a BaseIndexOffset::print() method for debugging. (Clement Courbet, 2019-02-04, 1 file, -0/+19)
| | | | llvm-svn: 353028
* [CGP] adjust target constraints for forming uaddo (Sanjay Patel, 2019-02-03, 1 file, -8/+11)
| | | | | | | | | | | | | | | | | | | There are 2 changes visible here: 1. There's no reason to limit this transform based on number of condition registers. That diff allows PPC to produce slightly better (dot-instructions should be generally good) code. Note: someone that cares about PPC codegen might want to look closer at that output because it seems like we could still improve this. 2. We (probably?) should not bother trying to form uaddo (or other overflow ops) when there's no target support for such an op. This goes beyond checking whether the op is expanded because both PPC and AArch64 show better codegen for standard types regardless of whether the op is legal/custom. llvm-svn: 353001
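For readers unfamiliar with the operation being formed, a uaddo ("unsigned add with overflow") yields a sum plus an overflow bit. The source-level pattern it typically replaces is an add followed by an unsigned compare of the sum against one operand. A minimal sketch in plain C++, with a hypothetical `UAddResult` struct and function name:

```cpp
#include <cstdint>

// Hypothetical result pair: the wrapped sum and whether it overflowed.
struct UAddResult {
  std::uint32_t Sum;
  bool Overflow;
};

// Unsigned addition wraps on overflow, and it wrapped exactly when the sum is
// smaller than an operand. An add followed by this compare is the shape that
// can be fused into a single overflow-reporting operation.
constexpr UAddResult uaddo(std::uint32_t A, std::uint32_t B) {
  return UAddResult{A + B, static_cast<std::uint32_t>(A + B) < A};
}

static_assert(uaddo(0xFFFFFFFFu, 1u).Overflow, "wraps to zero and reports it");
static_assert(!uaddo(1u, 2u).Overflow, "small sums do not overflow");
```

On a target with a carry flag or an equivalent instruction, the fused form avoids a separate compare; without such support, forming the op brings no benefit, which matches the second point of the commit message.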
* [CGP] refactor optimizeCmpExpression (NFCI) (Sanjay Patel, 2019-02-03, 1 file, -36/+38)
| | | | | | | | | | | | | | | | This is not truly NFC because we are bailing out without a TLI now. That should not be a real concern though because there should be a TLI in any real-world scenario. That seems better than passing around a pointer and then checking it for null-ness all over the place. The motivation is to fix what appears to be an unintended restriction on the uaddo transform - hasMultipleConditionRegisters() shouldn't be reason to limit the transform. llvm-svn: 352988
* GlobalISel: Implement widenScalar for G_UNMERGE_VALUES (Matt Arsenault, 2019-02-03, 1 file, -39/+83)
| | | | | | | | | For the scalar case only. Also move the similar G_MERGE_VALUES handling to a separate function and cleanup to make them look more similar. llvm-svn: 352979
* GlobalISel: Implement widenScalar for G_EXTRACT vector sources (Matt Arsenault, 2019-02-02, 1 file, -0/+26)
| | | | | | Handle the basic element extract case. llvm-svn: 352978
* GlobalISel: Legalization for inttoptr/ptrtoint (Matt Arsenault, 2019-02-02, 2 files, -4/+46)
| | | | llvm-svn: 352973
* [SDAG] Add SDNode/SDValue getConstantOperandAPInt helper. NFCI. (Simon Pilgrim, 2019-02-02, 1 file, -1/+1)
| | | | | | | | We already have the getConstantOperandVal helper which returns a uint64_t, but along comes the fuzzer and inserts a i128 -1 constant or something and the whole thing asserts....... I've updated a few obvious cases, and tried to make use of the const reference where possible, but there's more to do. A number of existing oss-fuzz tickets should be fixed if we start using APInt and perform value clamping where necessary. llvm-svn: 352961
* [CodeGen] Be as conservative about atomic accesses as for volatile (Philip Reames, 2019-02-01, 3 files, -2/+7)
| | | | | | | | | | | | Background: At the moment, we record the AtomicOrdering of an access in the MMO, but also mark any atomic access as volatile in SelectionDAG. I'm working towards separating that. See https://reviews.llvm.org/D57601 for context. Update all usages of isVolatile in lib/CodeGen to preserve behaviour once atomic MMOs stop being also volatile. This is NFC in its current form, but is essential for correctness once we make that final change. It is useful to keep in mind that AtomicSDNode is not a parent of LoadSDNode, StoreSDNode, or LSBaseSDNode. As a result, any call to isVolatile on one of those static types doesn't need a companion isAtomic check. We should probably adjust that class hierarchy long term, but for now, that separation is useful. I'm deliberately being conservative about handling. I want the change to stop adding volatile to be NFC itself, and then will work through places where we can be less conservative for atomics one by one in separate changes w/tests. Differential Revision: https://reviews.llvm.org/D57596 llvm-svn: 352937
* [COFF, ARM64] Fix localaddress to handle stack realignment and variable size objects (Mandeep Singh Grang, 2019-02-01, 2 files, -7/+1)
| | | | | | | | | | | | | | | | Summary: This fixes using the correct stack registers for SEH when stack realignment is needed or when variable size objects are present. Reviewers: rnk, efriedma, ssijaric, TomTan Reviewed By: rnk, efriedma Subscribers: javed.absar, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D57183 llvm-svn: 352923
* [opaque pointer types] Pass value type to GetElementPtr creation. (James Y Knight, 2019-02-01, 1 file, -4/+4)
| | | | | | | | | This cleans up all GetElementPtr creation in LLVM to explicitly pass a value type rather than deriving it from the pointer's element-type. Differential Revision: https://reviews.llvm.org/D57173 llvm-svn: 352913
* [opaque pointer types] Pass value type to LoadInst creation. (James Y Knight, 2019-02-01, 11 files, -29/+41)
| | | | | | | | | This cleans up all LoadInst creation in LLVM to explicitly pass the value type rather than deriving it from the pointer's element-type. Differential Revision: https://reviews.llvm.org/D57172 llvm-svn: 352911
* [opaque pointer types] Pass function types to CallInst creation. (James Y Knight, 2019-02-01, 5 files, -14/+12)
| | | | | | | | | This cleans up all CallInst creation in LLVM to explicitly pass a function type rather than deriving it from the pointer's element-type. Differential Revision: https://reviews.llvm.org/D57170 llvm-svn: 352909
* [DWARF v5] Fix DWARF emitter and consumer to produce/expect a uleb for a location description's length. (Wolfgang Pieb, 2019-02-01, 1 file, -2/+4)
| | | | | | | | Reviewer: davide, JDevliegere Differential Revision: https://reviews.llvm.org/D57550 llvm-svn: 352889
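For reference, the "uleb" in this commit is the ULEB128 variable-length encoding used throughout DWARF: seven payload bits per byte, least significant group first, with the high bit set on every byte except the last. A minimal stand-alone encoder is sketched below; `encodeULEB128` here is a local helper written for this example, not LLVM's own routine.

```cpp
#include <cstdint>
#include <vector>

// Encodes Value as ULEB128 and returns the bytes in emission order.
std::vector<std::uint8_t> encodeULEB128(std::uint64_t Value) {
  std::vector<std::uint8_t> Out;
  do {
    std::uint8_t Byte = static_cast<std::uint8_t>(Value & 0x7f);
    Value >>= 7;
    if (Value != 0)
      Byte |= 0x80; // More bytes follow.
    Out.push_back(Byte);
  } while (Value != 0);
  return Out;
}
```

Lengths below 128 take a single byte, so short location descriptions stay compact, while the value 624485 encodes as the three bytes `0xE5 0x8E 0x26`.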
* [SDAG] improve variable names; NFC (Sanjay Patel, 2019-02-01, 1 file, -23/+22)
| | | | | | | | The version of FoldConstantArithmetic() that takes arbitrary nodes was confusingly naming those nodes as constants when they might not be; also "Cst" reads like "Cast". llvm-svn: 352884
* [TargetLowering] try harder to determine undef elements of vector binops (Sanjay Patel, 2019-02-01, 1 file, -7/+61)
| | | | | | | | | | | | | | | | | | | | | | | | | | | | This might be the start of tracking all vector element constants generally if we take it to its logical conclusion, but let's stop here and make sure this is correct/beneficial so far. The affected tests require a convoluted path before they get simplified currently because we don't call SimplifyDemandedVectorElts() from binops directly and don't modify the binop operands directly in SimplifyDemandedVectorElts(). That's why the tests all have a trailing shuffle to induce a chain reaction of transforms. So something like this is happening: 1. Improve the knowledge of undefs in the binop via a SimplifyDemandedVectorElts() call that originates from a shuffle. 2. Transfer that undef knowledge back to the shuffle mask user as more undef lanes. 3. Combine the modified shuffle by calling SimplifyDemandedVectorElts() again. 4. Translate the improved shuffle mask as undemanded lanes of build vector constants causing those to become full undef constants. 5. Simplify the binop now that it has a full undef operand. As we can see from the unchanged 'and' and 'or' tests, tracking undefs alone isn't a full solution. We would need to track zero and all-ones constants to improve those opcodes. We'd probably need to track NaN for FP ops too (assuming we don't have fast-math-flags set). Differential Revision: https://reviews.llvm.org/D57066 llvm-svn: 352880