summaryrefslogtreecommitdiffstats
path: root/llvm/test/CodeGen/AArch64/GlobalISel
Commit message (Collapse)AuthorAgeFilesLines
* Revert "[DebugInfo] Remove some users of DBG_VALUEs IsIndirect field"Jeremy Morse2020-02-122-5/+5
| | | | | | | | | | | | | | | | | | | | | | | This reverts commit ed29dbaafa49bb8c9039a35f768244c394411fea. I'm backing out D68945, which as the discussion for D73526 shows, doesn't seem to handle the -O0 path through the codegen backend correctly. I'll reland the patch when a fix is worked out, apologies for all the churn. The two parent commits are part of this revert too. Conflicts: llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp llvm/test/DebugInfo/X86/dbg-addr-dse.ll SelectionDAGBuilder conflict is due to a nearby change in e39e2b4a79c6 that's technically unrelated. dbg-addr-dse.ll conflicted because 41206b61e30c (legitimately) changes the order of two lines. There are further modifications to dbg-value-func-arg.ll: it landed after the patch being reverted, and I've converted indirection to be represented by the isIndirect field rather than DW_OP_deref. (cherry picked from commit 6531a78ac4b5b229bce272706593a0bc873877d7)
* Revert rG6078f2fedcac5797ac39ee5ef3fd7a35ef1202d5 - "[AArch64][GlobalISel]: ↵Simon Pilgrim2020-01-152-42/+0
| | | | | | | | | | | | | Support @llvm.{return,frame}address selection." These intrinsics expand to a variable number of instructions so just like in ISelLowering.cpp we use custom code to deal with them. Committing Tim's original patch. Differential Revision: https://reviews.llvm.org/D65656 ---- Breaks EXPENSIVE_CHECKS builds.
* [AArch64][GlobalISel]: Support @llvm.{return,frame}address selection.Amara Emerson2020-01-142-0/+42
| | | | | | | | | These intrinsics expand to a variable number of instructions so just like in ISelLowering.cpp we use custom code to deal with them. Committing Tim's original patch. Differential Revision: https://reviews.llvm.org/D65656
* [GlobalISel] Change representation of shuffle masks in MachineOperand.Eli Friedman2020-01-131-2/+2
| | | | | | | | | | | | We're planning to remove the shufflemask operand from ShuffleVectorInst (D72467); fix GlobalISel so it doesn't depend on that Constant. The change to prelegalizercombiner-shuffle-vector.mir happens because the input contains a literal "-1" in the mask (so the parser/verifier weren't really handling it properly). We now treat it as equivalent to "undef" in all contexts. Differential Revision: https://reviews.llvm.org/D72663
* Relax opcode checks in test for G_READCYCLECOUNTER to check for only a ↵Douglas Yung2020-01-091-1/+1
| | | | number instead of a specific number.
* [AArch64][GlobalISel] Implement selection of <2 x float> vector splat.Amara Emerson2020-01-092-2/+69
| | | | | | Also requires making G_IMPLICIT_DEF of v2s32 legal. Differential Revision: https://reviews.llvm.org/D72422
* [GlobalISel][AArch64] Import + select LDR*roW and STR*roW patternsJessica Paquette2020-01-092-0/+483
| | | | | | | | | | | | | | | | | | | | | | | | | | | This adds support for selecting a large chunk of the load/store *roW patterns. This is pretty much a straight port of AArch64DAGToDAGISel::SelectAddrModeWRO into GISel. The code is very similar to the XRO code. The main difference is that in the *roW patterns, we want to try and fold in an extend, and *possibly* a shift along with it. A good portion of this patch is refactoring the existing XRO code. - Add selectAddrModeWRO - Factor out the code from selectAddrModeShiftedExtendXReg which is used by both selectAddrModeXRO and selectAddrModeWRO into selectExtendedSHL. This is similar to the function of the same name in AArch64DAGToDAGISel. - Add support for extends to the factored out code in selectExtendedSHL. - Teach getExtendTypeForInst how to handle AND masks that are intended to be used in loads/stores (necessary for this addressing mode.) - Make getExtendTypeForInst not static because moving it made an annoying diff and I wanted to have the WRO/XRO functions close to each other while I was writing the code. Differential Revision: https://reviews.llvm.org/D72426
* [AArch64][GlobalISel] Fold a chain of two G_PTR_ADDs of constant offsets.Amara Emerson2020-01-071-0/+72
| | | | | | | | | | E.g. %addr1 = G_PTR_ADD %base, G_CONSTANT 20 %addr2 = G_PTR_ADD %addr1, G_CONSTANT 8 --> %addr2 = G_PTR_ADD %base, G_CONSTANT 28 Differential Revision: https://reviews.llvm.org/D72351
* GlobalISel: Define G_READCYCLECOUNTERMatt Arsenault2020-01-042-0/+15
|
* Migrate function attribute "no-frame-pointer-elim" to "frame-pointer"="all" ↵Fangrui Song2019-12-243-3/+3
| | | | as cleanups after D56351
* [AArch64] [Windows] Use COFF stubs for calls to extern_weak functionsMartin Storsjö2019-12-232-17/+15
| | | | | | | | | | | | | | | | | | | As the extern_weak target might be missing, resolving to the absolute address zero, we can't use the normal direct PC-relative branch instructions (as that would result in relocations out of range). Improve the classifyGlobalFunctionReference method to set MO_DLLIMPORT/MO_COFFSTUB, and simplify the existing code in AArch64TargetLowering::LowerCall to use the return value from classifyGlobalFunctionReference for these cases. Add code in both AArch64FastISel and GlobalISel/IRTranslator to bail out for function calls to extern weak functions on windows, to let SelectionDAG handle them. This matches what was done for X86 in 6bf108d77a3c. Differential Revision: https://reviews.llvm.org/D71721
* [AArch64] Save FP for leaf functions when disabling frame pointer eliminationFangrui Song2019-12-131-1/+4
| | | | | | | | | | The change allows clang -mno-omit-leaf-frame-pointer to disable frame pointer elimination. This behavior matches X86 and Mips, and also GCC AArch64. Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D71168
* [Legalizer] Making artifact combining order-independentRoman Tereshin2019-12-134-19/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Legalization algorithm is complicated by two facts: 1) While regular instructions should be possible to legalize in an isolated, per-instruction, context-free manner, legalization artifacts can only be eliminated in pairs, which could be deeply, and ultimately arbitrary nested: { [ () ] }, where which paranthesis kind depicts an artifact kind, like extend, unmerge, etc. Such structure can only be fully eliminated by simple local combines if they are attempted in a particular order (inside out), or alternatively by repeated scans each eliminating only one innermost pair, resulting in O(n^2) complexity. 2) Some artifacts might in fact be regular instructions that could (and sometimes should) be legalized by the target-specific rules. Which means failure to eliminate all artifacts on the first iteration is not a failure, they need to be tried as instructions, which may produce more artifacts, including the ones that are in fact regular instructions, resulting in a non-constant number of iterations required to finish the process. I trust the recently introduced termination condition (no new artifacts were created during as-a-regular-instruction-retrial of artifacts not eliminated on the previous iteration) to be efficient in providing termination, but only performing the legalization in full if and only if at each step such chains of artifacts are successfully eliminated in full as well. Which is currently not guaranteed, as the artifact combines are applied only once and in an arbitrary order that has to do with the order of creation or insertion of artifacts into their worklist, which is a no particular order. In this patch I make a small change to the artifact combiner, making it to re-insert into the worklist immediate (modulo a look-through copies) artifact users of each vreg that changes its definition due to an artifact combine. Here the first scan through the artifacts worklist, while not being done in any guaranteed order, only needs to find the innermost pair(s) of artifacts that could be immediately combined out. After that the process follows def-use chains, making them shorter at each step, thus combining everything that can be combined in O(n) time. Reviewers: volkan, aditya_nandakumar, qcolombet, paquette, aemerson, dsanders Reviewed By: aditya_nandakumar, paquette Tags: #llvm Differential Revision: https://reviews.llvm.org/D71448
* [AArch64][GlobalISel] Add support for selection of vector G_SHL with immediates.Amara Emerson2019-12-062-16/+196
| | | | | | Only implemented for the type combinations already supported for G_SHL. Differential Revision: https://reviews.llvm.org/D71153
* [GlobalISel] Fix compiler crash lowering G_LOAD in AArch64.Amara Emerson2019-12-041-0/+22
| | | | | | Patch by Daniel Rodríguez Troitiño. Differential Revision: https://reviews.llvm.org/D70794
* [AArch64] Fix over-eager fusing of NEON SIMD MUL/ADDSanne Wouda2019-12-031-24/+36
| | | | | | | | | | | | | | | | | | | | | | | | | | Summary: The ISel pattern for SIMD MLA is a bit too eager: it replaces the ADD with an MLA even when the MUL cannot be eliminated, e.g. when it has another use. An MLA is usually has a higher latency than an ADD (and there are fewer pipes available that can execute it), so trading an MLA for an ADD is not great. ISel is not taking the number of uses of the MUL result into account, nor any other factors such as the length of the critical path or other resource pressure. The MachineCombiner is able to make these judgments so this patch ports the ISel pattern for MUL/ADD fusing to the MachineCombiner. Similarly for MUL/SUB -> MLS, as well as the indexed variants. The change has no impact on SPEC CPU© intrate nor fprate. Reviewers: dmgreen, SjoerdMeijer, fhahn, Gerolf Subscribers: kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70673
* [GlobalISel] CombinerHelper: Fix a bug in matchCombineCopyVolkan Keles2019-12-021-0/+86
| | | | | | | | | | | | | | | | | Summary: When combining COPY instructions, we were replacing the destination registers with the source register without checking register constraints. This patch adds a simple logic to check if the constraints match before replacing registers. Reviewers: qcolombet, aditya_nandakumar, aemerson, paquette, dsanders, Petar.Avramovic Reviewed By: aditya_nandakumar Subscribers: rovka, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70616
* [AArch64] Fix MIR test instruction to not have invalid operand.Amara Emerson2019-11-191-2/+2
| | | | In anticipation of an improved verifier in D63973.
* [GISel][CombinerHelper] Add support for scalar type for the result of ↵Quentin Colombet2019-11-151-0/+42
| | | | | | | | | | | | | | | | | | shuffle vector LLVM IR of 1-element vectors get lower into scalar in GISel. As a result, shuffle vector may also produce a scalar. This patch teaches the shuffle combiner how to deal with scalars when they are in the destination type of a shuffle vector. For now, we just support the easy case where this can be lowered to a plain copy. For other cases, we leave the shuffle vector as is. This type of IR are seen in O0 pipelines. E.g., as produced with SingleSource/UnitTests/Vector/AArch64/aarch64_neon_intrinsics.c. rdar://problem/57198904
* [globalisel][irtanslator] The IRTranslator should preserve TBAA informationDaniel Sanders2019-11-141-0/+19
|
* When lowering calls and tail calls in AArch64, the register mask andEric Christopher2019-11-062-2/+2
| | | | | | | | | | return value location depends on the calling convention of the callee. `F.getCallingConv()`, however, is the caller CC. Correct it to the callee CC from `CallLoweringInfo`. Fixes PR43449 Patch by Shu-Chun Weng!
* [GISel][ArtifactCombiner] Relax the constraint to combine unmerge with ↵Quentin Colombet2019-11-062-11/+83
| | | | | | | | | | | | | | | concat_vectors The combine G_UNMERGE_VALUES with G_CONCAT_VECTORS used to only be performed when the result type of the G_UNMERGE_VALUES was a vector type. In other words, we were expecting that the G_UNMERGE_VALUES was effectively the exact opposite of the G_CONCAT_VECTORS. Lift that constraint by allowing any G_UNMERGE_VALUES to be combined with any G_CONCAT_VECTORS (as long as the size of the different pieces that we merge/unmerge match). Differential Revision: https://reviews.llvm.org/D69288
* [globalisel] Rename G_GEP to G_PTR_ADDDaniel Sanders2019-11-0523-195/+195
| | | | | | | | | | | | | | | | Summary: G_GEP is rather poorly named. It's a simple pointer+scalar addition and doesn't support any of the complexities of getelementptr. I therefore propose that we rename it. There's a G_PTR_MASK so let's follow that convention and go with G_PTR_ADD Reviewers: volkan, aditya_nandakumar, bogner, rovka, arsenm Subscribers: sdardis, jvesely, wdng, nhaehnle, hiraditya, jrtc27, atanasyan, arphaman, Petar.Avramovic, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69734
* [GISel][CombinerHelper] Combine shuffle_vector scalar to build_vectorQuentin Colombet2019-10-301-0/+63
| | | | | | | | | | Teach the combiner helper how to replace shuffle_vector of scalars into build_vector. I am not particularly happy about having to add this combine, but we currently get those from <1 x iN> from the IR. Bonus: This fixes an assert in the shuffle_vector combines since before this patch, we were expecting vector types.
* [AArch64][GlobalISel] Fix assertion fail in C++ selection for vector zext of ↵Amara Emerson2019-10-281-0/+9
| | | | | | | | <4 x s8> We bailed out of dealing with vectors only after the assertion, move it before. Fixes PR43794
* [GlobalISel][AArch64][AMDGPU][X86] Teach LegalizationArtifactCombiner to ↵Craig Topper2019-10-241-13/+12
| | | | | | | | combine trunc(g_constant). This allows X86 to properly form shift by immediate instructions since we require an 8-bit constant to match the imported SelectionDAG patterns.
* [GISel][CombinerHelper] Add a combine turning shuffle_vector into concat_vectorsQuentin Colombet2019-10-211-0/+353
| | | | | | | | | | Teach the CombinerHelper how to turn shuffle_vectors, that concatenate vectors, into concat_vectors and add this combine to the AArch64 pre-legalizer combiner. Differential Revision: https://reviews.llvm.org/D69149 llvm-svn: 375452
* [gicombiner] Add the run-time rule disable optionDaniel Sanders2019-10-171-0/+35
| | | | | | | | | | | | | | | | | | | | | | | | | Summary: Each generated helper can be configured to generate an option that disables rules in that helper. This can be used to bisect rulesets. The disable bits are stored in a SparseVector as this is very cheap for the common case where nothing is disabled. It gets more expensive the more rules are disabled but you're generally doing that for debug purposes where performance is less of a concern. Depends on D68426 Reviewers: volkan, bogner Reviewed By: volkan Subscribers: hiraditya, Petar.Avramovic, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68438 llvm-svn: 375067
* [GISel][CombinerHelper] Add concat_vectors(build_vector, build_vector) => ↵Quentin Colombet2019-10-171-0/+141
| | | | | | | | | | | | | build_vector Teach the combiner helper how to flatten concat_vectors of build_vectors into a build_vector. Add this combine as part of AArch64 pre-legalizer combiner. Differential Revision: https://reviews.llvm.org/D69071 llvm-svn: 375066
* [DebugInfo] Remove some users of DBG_VALUEs IsIndirect fieldJeremy Morse2019-10-152-5/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch kills off a significant user of the "IsIndirect" field of DBG_VALUE machine insts. Brought up in in PR41675, IsIndirect is techncally redundant as it can be expressed by the DIExpression of a DBG_VALUE inst, and it isn't helpful to have two ways of expressing things. Rather than setting IsIndirect, have DBG_VALUE creators add an extra deref to the insts DIExpression. There should now be no appearences of IsIndirect=True from isel down to LiveDebugVariables / VirtRegRewriter, which is ensured by an assertion in LDVImpl::handleDebugValue. This means we also get to delete the IsIndirect handling in LiveDebugVariables. Tests can be upgraded by for example swapping the following IsIndirect=True DBG_VALUE: DBG_VALUE $somereg, 0, !123, !DIExpression(DW_OP_foo) With one where the indirection is in the DIExpression, by _appending_ a deref: DBG_VALUE $somereg, $noreg, !123, !DIExpression(DW_OP_foo, DW_OP_deref) Which both mean the same thing. Most of the test changes in this patch are updates of that form; also some changes in how the textual assembly printer handles these insts. Differential Revision: https://reviews.llvm.org/D68945 llvm-svn: 374877
* [update_mir_test_checks] Handle MI flags properlyRoman Tereshin2019-10-144-20/+20
| | | | | | | | | | | | | | previously we would generate literal check lines w/ no reg-exps for vregs as MI flags (nsw, ninf, etc.) won't be recognized as a part of MI. Fixing that. Includes updating the MIR tests that suffered from the problem. Reviewed By: bogner Differential Revision: https://reviews.llvm.org/D68905 llvm-svn: 374829
* Reapply r374743 with a fix for the ocaml bindingJoerg Sonnenberger2019-10-141-17/+0
| | | | | | | | | | | | | | | | | | | Add a pass to lower is.constant and objectsize intrinsics This pass lowers is.constant and objectsize intrinsics not simplified by earlier constant folding, i.e. if the object given is not constant or if not using the optimized pass chain. The result is recursively simplified and constant conditionals are pruned, so that dead blocks are removed even for -O0. This allows inline asm blocks with operand constraints to work all the time. The new pass replaces the existing lowering in the codegen-prepare pass and fallbacks in SDAG/GlobalISEL and FastISel. The latter now assert on the intrinsics. Differential Revision: https://reviews.llvm.org/D65280 llvm-svn: 374784
* Revert "Add a pass to lower is.constant and objectsize intrinsics"Dmitri Gribenko2019-10-141-0/+17
| | | | | | | This reverts commit r374743. It broke the build with Ocaml enabled: http://lab.llvm.org:8011/builders/clang-x86_64-debian-fast/builds/19218 llvm-svn: 374768
* Add a pass to lower is.constant and objectsize intrinsicsJoerg Sonnenberger2019-10-131-17/+0
| | | | | | | | | | | | | | | | | This pass lowers is.constant and objectsize intrinsics not simplified by earlier constant folding, i.e. if the object given is not constant or if not using the optimized pass chain. The result is recursively simplified and constant conditionals are pruned, so that dead blocks are removed even for -O0. This allows inline asm blocks with operand constraints to work all the time. The new pass replaces the existing lowering in the codegen-prepare pass and fallbacks in SDAG/GlobalISEL and FastISel. The latter now assert on the intrinsics. Differential Revision: https://reviews.llvm.org/D65280 llvm-svn: 374743
* [GISel][CallLowering] Enable vector support in argument loweringQuentin Colombet2019-10-111-0/+22
| | | | | | | | | | The exciting code is actually already enough to handle the splitting of vector arguments but we were lacking a test case. This commit adds a test case for vector argument lowering involving splitting and enable the related support in call lowering. llvm-svn: 374589
* GlobalISel: Implement widenScalar for G_SITOFP/G_UITOFP sourcesMatt Arsenault2019-10-011-3/+3
| | | | | | Legalize 16-bit G_SITOFP/G_UITOFP for AMDGPU. llvm-svn: 373287
* [AArch64][GlobalISel] Support lowering variadic musttail callsJessica Paquette2019-09-301-0/+223
| | | | | | | | | | | | | | | | | | | | | | | | | | This adds support for lowering variadic musttail calls. To do this, we have to... - Detect a musttail call in a variadic function before attempting to lower the call's formal arguments. This is done in the IRTranslator. - Compute forwarded registers in `lowerFormalArguments`, and add copies for those registers. - Restore the forwarded registers in `lowerTailCall`. Because there doesn't seem to be any nice way to wrap these up into the outgoing argument handler, the restore code in `lowerTailCall` is done separately. Also, irritatingly, you have to make sure that the registers don't overlap with any passed parameters. Otherwise, the scheduler doesn't know what to do with the extra copies and asserts. Add call-translator-variadic-musttail.ll to test this. This is pretty much the same as the X86 musttail-varargs.ll test. We didn't have as nice of a test to base this off of, but the idea is the same. Differential Revision: https://reviews.llvm.org/D68043 llvm-svn: 373226
* [GlobalISel Enable memcpy inlining with optsize.Amara Emerson2019-09-281-0/+92
| | | | | | We should be disabling inline for minsize, not optsize. llvm-svn: 373143
* Add an operand to memory intrinsics to denote the "tail" marker.Amara Emerson2019-09-287-56/+106
| | | | | | | | | | | | | | We need to propagate this information from the IR in order to be able to safely do tail call optimizations on the intrinsics during legalization. Assuming it's safe to do tail call opt without checking for the marker isn't safe because the mem libcall may use allocas from the caller. This adds an extra immediate operand to the end of the intrinsics and fixes the legalizer to handle it. Differential Revision: https://reviews.llvm.org/D68151 llvm-svn: 373140
* [AArch64][GlobalISel] Choose CCAssignFns per-argument for tail call loweringJessica Paquette2019-09-251-16/+61
| | | | | | | | | | | | | | | | | | | | | | | | | | When checking for tail call eligibility, we should use the correct CCAssignFn for each argument, rather than just checking if the caller/callee is varargs or not. This is important for tail call lowering with varargs. If we don't check it, then basically any varargs callee with parameters cannot be tail called on Darwin, for one thing. If the parameters are all guaranteed to be in registers, this should be entirely safe. On top of that, not checking for this could potentially make it so that we have the wrong stack offsets when checking for tail call eligibility. Also refactor some of the stuff for CCAssignFnForCall and pull it out into a helper function. Update call-translator-tail-call.ll to show that we can now correctly tail call on Darwin. Also add two extra tail call checks. The first verifies that we still respect the caller's stack size, and the second verifies that we still don't tail call when a varargs function has a memory argument. Differential Revision: https://reviews.llvm.org/D67939 llvm-svn: 372897
* [AArch64][GlobalISel] Tweak legalization rule for G_BSWAP to handle widening ↵Amara Emerson2019-09-251-0/+44
| | | | | | s16. llvm-svn: 372812
* [GlobalISel][IRTranslator] Fix switch table lowering to use signed LE not ↵Amara Emerson2019-09-241-0/+42
| | | | | | | | | | | | unsigned. We were miscompiling switch value comparisons with the wrong signedness, which shows up when we have things like switch case values with i1 types, which end up being legalized incorrectly. Fixes PR43383 llvm-svn: 372675
* [AArch64][GlobalISel] Implement selection for G_SHL of <2 x i64>Amara Emerson2019-09-211-0/+29
| | | | | | Simple continuation of existing selection support. llvm-svn: 372467
* [AArch64][GlobalISel] Selection support for G_ASHR of <2 x s64>Amara Emerson2019-09-211-0/+30
| | | | | | Just add an extra case to the existing selection logic. llvm-svn: 372466
* [AArch64][GlobalISel] Make <4 x s32> G_ASHR and G_LSHR legal.Amara Emerson2019-09-211-0/+78
| | | | llvm-svn: 372465
* [GlobalISel] Defer setting HasCalls on MachineFrameInfo to selection time.Amara Emerson2019-09-202-1/+20
| | | | | | | | | | | | | | | | | | | We currently always set the HasCalls on MFI during translation and legalization if we're handling a call or legalizing to a libcall. However, if that call is later optimized to a tail call then we don't need the flag. The flag being set to true causes frame lowering to always save and restore FP/LR, which adds unnecessary code. This change does the same thing as SelectionDAG and ports over some code that scans instructions after selection, using TargetInstrInfo to determine if target opcodes are known calls. Code size geomean improvements on CTMark: -O0 : 0.1% -Os : 0.3% Differential Revision: https://reviews.llvm.org/D67868 llvm-svn: 372443
* Reapply r372285 "GlobalISel: Don't materialize immarg arguments to intrinsics"Matt Arsenault2019-09-191-2/+1
| | | | | | | | | This reverts r372314, reapplying r372285 and the commits which depend on it (r372286-r372293, and r372296-r372297) This was missing one switch to getTargetConstant in an untested case. llvm-svn: 372338
* Revert r372285 "GlobalISel: Don't materialize immarg arguments to intrinsics"Hans Wennborg2019-09-191-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This broke the Chromium build, causing it to fail with e.g. fatal error: error in backend: Cannot select: t362: v4i32 = X86ISD::VSHLI t392, Constant:i8<15> See llvm-commits thread of r372285 for details. This also reverts r372286, r372287, r372288, r372289, r372290, r372291, r372292, r372293, r372296, and r372297, which seemed to depend on the main commit. > Encode them directly as an imm argument to G_INTRINSIC*. > > Since now intrinsics can now define what parameters are required to be > immediates, avoid using registers for them. Intrinsics could > potentially want a constant that isn't a legal register type. Also, > since G_CONSTANT is subject to CSE and legalization, transforms could > potentially obscure the value (and create extra work for the > selector). The register bank of a G_CONSTANT is also meaningful, so > this could throw off future folding and legalization logic for AMDGPU. > > This will be much more convenient to work with than needing to call > getConstantVRegVal and checking if it may have failed for every > constant intrinsic parameter. AMDGPU has quite a lot of intrinsics wth > immarg operands, many of which need inspection during lowering. Having > to find the value in a register is going to add a lot of boilerplate > and waste compile time. > > SelectionDAG has always provided TargetConstant for constants which > should not be legalized or materialized in a register. The distinction > between Constant and TargetConstant was somewhat fuzzy, and there was > no automatic way to force usage of TargetConstant for certain > intrinsic parameters. They were both ultimately ConstantSDNode, and it > was inconsistently used. It was quite easy to mis-select an > instruction requiring an immediate. For SelectionDAG, start emitting > TargetConstant for these arguments, and using timm to match them. > > Most of the work here is to cleanup target handling of constants. Some > targets process intrinsics through intermediate custom nodes, which > need to preserve TargetConstant usage to match the intrinsic > expectation. Pattern inputs now need to distinguish whether a constant > is merely compatible with an operand or whether it is mandatory. > > The GlobalISelEmitter needs to treat timm as a special case of a leaf > node, simlar to MachineBasicBlock operands. This should also enable > handling of patterns for some G_* instructions with immediates, like > G_FENCE or G_EXTRACT. > > This does include a workaround for a crash in GlobalISelEmitter when > ARM tries to uses "imm" in an output with a "timm" pattern source. llvm-svn: 372314
* GlobalISel: Don't materialize immarg arguments to intrinsicsMatt Arsenault2019-09-191-2/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Encode them directly as an imm argument to G_INTRINSIC*. Since now intrinsics can now define what parameters are required to be immediates, avoid using registers for them. Intrinsics could potentially want a constant that isn't a legal register type. Also, since G_CONSTANT is subject to CSE and legalization, transforms could potentially obscure the value (and create extra work for the selector). The register bank of a G_CONSTANT is also meaningful, so this could throw off future folding and legalization logic for AMDGPU. This will be much more convenient to work with than needing to call getConstantVRegVal and checking if it may have failed for every constant intrinsic parameter. AMDGPU has quite a lot of intrinsics wth immarg operands, many of which need inspection during lowering. Having to find the value in a register is going to add a lot of boilerplate and waste compile time. SelectionDAG has always provided TargetConstant for constants which should not be legalized or materialized in a register. The distinction between Constant and TargetConstant was somewhat fuzzy, and there was no automatic way to force usage of TargetConstant for certain intrinsic parameters. They were both ultimately ConstantSDNode, and it was inconsistently used. It was quite easy to mis-select an instruction requiring an immediate. For SelectionDAG, start emitting TargetConstant for these arguments, and using timm to match them. Most of the work here is to cleanup target handling of constants. Some targets process intrinsics through intermediate custom nodes, which need to preserve TargetConstant usage to match the intrinsic expectation. Pattern inputs now need to distinguish whether a constant is merely compatible with an operand or whether it is mandatory. The GlobalISelEmitter needs to treat timm as a special case of a leaf node, simlar to MachineBasicBlock operands. This should also enable handling of patterns for some G_* instructions with immediates, like G_FENCE or G_EXTRACT. This does include a workaround for a crash in GlobalISelEmitter when ARM tries to uses "imm" in an output with a "timm" pattern source. llvm-svn: 372285
* [AArch64][GlobalISel] Support lowering musttail callsJessica Paquette2019-09-182-6/+24
| | | | | | | | | | | | | | | | | | | | | Since we now lower most tail calls, it makes sense to support musttail. Instead of always falling back to SelectionDAG, only fall back when a musttail call was not able to be emitted as a tail call. Once we can handle most incoming and outgoing arguments, we can change this to a `report_fatal_error` like in ISelLowering. Remove the assert that we don't have varargs and a musttail, and replace it with a return false. Implementing this requires that we implement `saveVarArgRegisters` from AArch64ISelLowering, which is an entirely different patch. Add GlobalISel lines to vararg-tallcall.ll to make sure that we produce correct code. Right now we only fall back, but eventually this will be relevant. Differential Revision: https://reviews.llvm.org/D67681 llvm-svn: 372273
OpenPOWER on IntegriCloud