| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This reverts commit ed29dbaafa49bb8c9039a35f768244c394411fea.
I'm backing out D68945, which as the discussion for D73526 shows, doesn't
seem to handle the -O0 path through the codegen backend correctly. I'll
reland the patch when a fix is worked out, apologies for all the churn.
The two parent commits are part of this revert too.
Conflicts:
llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
llvm/test/DebugInfo/X86/dbg-addr-dse.ll
SelectionDAGBuilder conflict is due to a nearby change in e39e2b4a79c6
that's technically unrelated. dbg-addr-dse.ll conflicted because
41206b61e30c (legitimately) changes the order of two lines.
There are further modifications to dbg-value-func-arg.ll: it landed after
the patch being reverted, and I've converted indirection to be represented
by the isIndirect field rather than DW_OP_deref.
(cherry picked from commit 6531a78ac4b5b229bce272706593a0bc873877d7)
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Support @llvm.{return,frame}address selection."
These intrinsics expand to a variable number of instructions so just like in
ISelLowering.cpp we use custom code to deal with them.
Committing Tim's original patch.
Differential Revision: https://reviews.llvm.org/D65656
----
Breaks EXPENSIVE_CHECKS builds.
|
|
|
|
|
|
|
|
|
| |
These intrinsics expand to a variable number of instructions so just like in
ISelLowering.cpp we use custom code to deal with them.
Committing Tim's original patch.
Differential Revision: https://reviews.llvm.org/D65656
|
|
|
|
|
|
|
|
|
|
|
|
| |
We're planning to remove the shufflemask operand from ShuffleVectorInst
(D72467); fix GlobalISel so it doesn't depend on that Constant.
The change to prelegalizercombiner-shuffle-vector.mir happens because
the input contains a literal "-1" in the mask (so the parser/verifier
weren't really handling it properly). We now treat it as equivalent to
"undef" in all contexts.
Differential Revision: https://reviews.llvm.org/D72663
|
|
|
|
| |
number instead of a specific number.
|
|
|
|
|
|
| |
Also requires making G_IMPLICIT_DEF of v2s32 legal.
Differential Revision: https://reviews.llvm.org/D72422
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This adds support for selecting a large chunk of the load/store *roW patterns.
This is pretty much a straight port of AArch64DAGToDAGISel::SelectAddrModeWRO
into GISel. The code is very similar to the XRO code. The main difference is
that in the *roW patterns, we want to try and fold in an extend, and *possibly*
a shift along with it. A good portion of this patch is refactoring the existing
XRO code.
- Add selectAddrModeWRO
- Factor out the code from selectAddrModeShiftedExtendXReg which is used by both
selectAddrModeXRO and selectAddrModeWRO into selectExtendedSHL.
This is similar to the function of the same name in AArch64DAGToDAGISel.
- Add support for extends to the factored out code in selectExtendedSHL.
- Teach getExtendTypeForInst how to handle AND masks that are intended to be
used in loads/stores (necessary for this addressing mode.)
- Make getExtendTypeForInst not static because moving it made an annoying diff
and I wanted to have the WRO/XRO functions close to each other while I was
writing the code.
Differential Revision: https://reviews.llvm.org/D72426
|
|
|
|
|
|
|
|
|
|
| |
E.g.
%addr1 = G_PTR_ADD %base, G_CONSTANT 20
%addr2 = G_PTR_ADD %addr1, G_CONSTANT 8
-->
%addr2 = G_PTR_ADD %base, G_CONSTANT 28
Differential Revision: https://reviews.llvm.org/D72351
|
| |
|
|
|
|
| |
as cleanups after D56351
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
As the extern_weak target might be missing, resolving to the absolute
address zero, we can't use the normal direct PC-relative branch
instructions (as that would result in relocations out of range).
Improve the classifyGlobalFunctionReference method to set
MO_DLLIMPORT/MO_COFFSTUB, and simplify the existing code in
AArch64TargetLowering::LowerCall to use the return value from
classifyGlobalFunctionReference for these cases.
Add code in both AArch64FastISel and GlobalISel/IRTranslator to
bail out for function calls to extern weak functions on windows,
to let SelectionDAG handle them.
This matches what was done for X86 in 6bf108d77a3c.
Differential Revision: https://reviews.llvm.org/D71721
|
|
|
|
|
|
|
|
|
|
| |
The change allows clang -mno-omit-leaf-frame-pointer to disable frame
pointer elimination. This behavior matches X86 and Mips, and also GCC
AArch64.
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D71168
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Legalization algorithm is complicated by two facts:
1) While regular instructions should be possible to legalize in
an isolated, per-instruction, context-free manner, legalization
artifacts can only be eliminated in pairs, which could be deeply, and
ultimately arbitrary nested: { [ () ] }, where which paranthesis kind
depicts an artifact kind, like extend, unmerge, etc. Such structure
can only be fully eliminated by simple local combines if they are
attempted in a particular order (inside out), or alternatively by
repeated scans each eliminating only one innermost pair, resulting in
O(n^2) complexity.
2) Some artifacts might in fact be regular instructions that could (and
sometimes should) be legalized by the target-specific rules. Which
means failure to eliminate all artifacts on the first iteration is
not a failure, they need to be tried as instructions, which may
produce more artifacts, including the ones that are in fact regular
instructions, resulting in a non-constant number of iterations
required to finish the process.
I trust the recently introduced termination condition (no new artifacts
were created during as-a-regular-instruction-retrial of artifacts not
eliminated on the previous iteration) to be efficient in providing
termination, but only performing the legalization in full if and only if
at each step such chains of artifacts are successfully eliminated in
full as well.
Which is currently not guaranteed, as the artifact combines are applied
only once and in an arbitrary order that has to do with the order of
creation or insertion of artifacts into their worklist, which is a no
particular order.
In this patch I make a small change to the artifact combiner, making it
to re-insert into the worklist immediate (modulo a look-through copies)
artifact users of each vreg that changes its definition due to an
artifact combine.
Here the first scan through the artifacts worklist, while not
being done in any guaranteed order, only needs to find the innermost
pair(s) of artifacts that could be immediately combined out. After that
the process follows def-use chains, making them shorter at each step, thus
combining everything that can be combined in O(n) time.
Reviewers: volkan, aditya_nandakumar, qcolombet, paquette, aemerson, dsanders
Reviewed By: aditya_nandakumar, paquette
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D71448
|
|
|
|
|
|
| |
Only implemented for the type combinations already supported for G_SHL.
Differential Revision: https://reviews.llvm.org/D71153
|
|
|
|
|
|
| |
Patch by Daniel Rodríguez Troitiño.
Differential Revision: https://reviews.llvm.org/D70794
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
The ISel pattern for SIMD MLA is a bit too eager: it replaces the ADD with an
MLA even when the MUL cannot be eliminated, e.g. when it has another use. An
MLA is usually has a higher latency than an ADD (and there are fewer pipes
available that can execute it), so trading an MLA for an ADD is not great.
ISel is not taking the number of uses of the MUL result into account, nor any
other factors such as the length of the critical path or other resource pressure.
The MachineCombiner is able to make these judgments so this patch ports the ISel
pattern for MUL/ADD fusing to the MachineCombiner.
Similarly for MUL/SUB -> MLS, as well as the indexed variants.
The change has no impact on SPEC CPU© intrate nor fprate.
Reviewers: dmgreen, SjoerdMeijer, fhahn, Gerolf
Subscribers: kristof.beyls, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D70673
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
When combining COPY instructions, we were replacing the destination registers
with the source register without checking register constraints. This patch adds
a simple logic to check if the constraints match before replacing registers.
Reviewers: qcolombet, aditya_nandakumar, aemerson, paquette, dsanders, Petar.Avramovic
Reviewed By: aditya_nandakumar
Subscribers: rovka, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D70616
|
|
|
|
| |
In anticipation of an improved verifier in D63973.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
shuffle vector
LLVM IR of 1-element vectors get lower into scalar in GISel. As a
result, shuffle vector may also produce a scalar.
This patch teaches the shuffle combiner how to deal with scalars when
they are in the destination type of a shuffle vector.
For now, we just support the easy case where this can be lowered to
a plain copy. For other cases, we leave the shuffle vector as is.
This type of IR are seen in O0 pipelines. E.g., as produced with
SingleSource/UnitTests/Vector/AArch64/aarch64_neon_intrinsics.c.
rdar://problem/57198904
|
| |
|
|
|
|
|
|
|
|
|
|
| |
return value location depends on the calling convention of the callee.
`F.getCallingConv()`, however, is the caller CC. Correct it to the
callee CC from `CallLoweringInfo`.
Fixes PR43449
Patch by Shu-Chun Weng!
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
concat_vectors
The combine G_UNMERGE_VALUES with G_CONCAT_VECTORS used to only be performed
when the result type of the G_UNMERGE_VALUES was a vector type.
In other words, we were expecting that the G_UNMERGE_VALUES was effectively
the exact opposite of the G_CONCAT_VECTORS.
Lift that constraint by allowing any G_UNMERGE_VALUES to be combined
with any G_CONCAT_VECTORS (as long as the size of the different pieces
that we merge/unmerge match).
Differential Revision: https://reviews.llvm.org/D69288
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
G_GEP is rather poorly named. It's a simple pointer+scalar addition and
doesn't support any of the complexities of getelementptr. I therefore
propose that we rename it. There's a G_PTR_MASK so let's follow that
convention and go with G_PTR_ADD
Reviewers: volkan, aditya_nandakumar, bogner, rovka, arsenm
Subscribers: sdardis, jvesely, wdng, nhaehnle, hiraditya, jrtc27, atanasyan, arphaman, Petar.Avramovic, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69734
|
|
|
|
|
|
|
|
|
|
| |
Teach the combiner helper how to replace shuffle_vector of scalars
into build_vector.
I am not particularly happy about having to add this combine, but we
currently get those from <1 x iN> from the IR.
Bonus: This fixes an assert in the shuffle_vector combines since before
this patch, we were expecting vector types.
|
|
|
|
|
|
|
|
| |
<4 x s8>
We bailed out of dealing with vectors only after the assertion, move it before.
Fixes PR43794
|
|
|
|
|
|
|
|
| |
combine trunc(g_constant).
This allows X86 to properly form shift by immediate instructions
since we require an 8-bit constant to match the imported
SelectionDAG patterns.
|
|
|
|
|
|
|
|
|
|
| |
Teach the CombinerHelper how to turn shuffle_vectors, that
concatenate vectors, into concat_vectors and add this combine
to the AArch64 pre-legalizer combiner.
Differential Revision: https://reviews.llvm.org/D69149
llvm-svn: 375452
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Each generated helper can be configured to generate an option that disables
rules in that helper. This can be used to bisect rulesets.
The disable bits are stored in a SparseVector as this is very cheap for the
common case where nothing is disabled. It gets more expensive the more rules
are disabled but you're generally doing that for debug purposes where
performance is less of a concern.
Depends on D68426
Reviewers: volkan, bogner
Reviewed By: volkan
Subscribers: hiraditya, Petar.Avramovic, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D68438
llvm-svn: 375067
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
build_vector
Teach the combiner helper how to flatten concat_vectors of build_vectors
into a build_vector.
Add this combine as part of AArch64 pre-legalizer combiner.
Differential Revision: https://reviews.llvm.org/D69071
llvm-svn: 375066
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch kills off a significant user of the "IsIndirect" field of
DBG_VALUE machine insts. Brought up in in PR41675, IsIndirect is
techncally redundant as it can be expressed by the DIExpression of a
DBG_VALUE inst, and it isn't helpful to have two ways of expressing
things.
Rather than setting IsIndirect, have DBG_VALUE creators add an extra deref
to the insts DIExpression. There should now be no appearences of
IsIndirect=True from isel down to LiveDebugVariables / VirtRegRewriter,
which is ensured by an assertion in LDVImpl::handleDebugValue. This means
we also get to delete the IsIndirect handling in LiveDebugVariables. Tests
can be upgraded by for example swapping the following IsIndirect=True
DBG_VALUE:
DBG_VALUE $somereg, 0, !123, !DIExpression(DW_OP_foo)
With one where the indirection is in the DIExpression, by _appending_
a deref:
DBG_VALUE $somereg, $noreg, !123, !DIExpression(DW_OP_foo, DW_OP_deref)
Which both mean the same thing.
Most of the test changes in this patch are updates of that form; also some
changes in how the textual assembly printer handles these insts.
Differential Revision: https://reviews.llvm.org/D68945
llvm-svn: 374877
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
previously we would generate literal check lines w/ no reg-exps for
vregs as MI flags (nsw, ninf, etc.) won't be recognized as a part of MI.
Fixing that. Includes updating the MIR tests that suffered from the
problem.
Reviewed By: bogner
Differential Revision: https://reviews.llvm.org/D68905
llvm-svn: 374829
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add a pass to lower is.constant and objectsize intrinsics
This pass lowers is.constant and objectsize intrinsics not simplified by
earlier constant folding, i.e. if the object given is not constant or if
not using the optimized pass chain. The result is recursively simplified
and constant conditionals are pruned, so that dead blocks are removed
even for -O0. This allows inline asm blocks with operand constraints to
work all the time.
The new pass replaces the existing lowering in the codegen-prepare pass
and fallbacks in SDAG/GlobalISEL and FastISel. The latter now assert
on the intrinsics.
Differential Revision: https://reviews.llvm.org/D65280
llvm-svn: 374784
|
|
|
|
|
|
|
| |
This reverts commit r374743. It broke the build with Ocaml enabled:
http://lab.llvm.org:8011/builders/clang-x86_64-debian-fast/builds/19218
llvm-svn: 374768
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This pass lowers is.constant and objectsize intrinsics not simplified by
earlier constant folding, i.e. if the object given is not constant or if
not using the optimized pass chain. The result is recursively simplified
and constant conditionals are pruned, so that dead blocks are removed
even for -O0. This allows inline asm blocks with operand constraints to
work all the time.
The new pass replaces the existing lowering in the codegen-prepare pass
and fallbacks in SDAG/GlobalISEL and FastISel. The latter now assert
on the intrinsics.
Differential Revision: https://reviews.llvm.org/D65280
llvm-svn: 374743
|
|
|
|
|
|
|
|
|
|
| |
The exciting code is actually already enough to handle the splitting
of vector arguments but we were lacking a test case.
This commit adds a test case for vector argument lowering involving
splitting and enable the related support in call lowering.
llvm-svn: 374589
|
|
|
|
|
|
| |
Legalize 16-bit G_SITOFP/G_UITOFP for AMDGPU.
llvm-svn: 373287
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This adds support for lowering variadic musttail calls. To do this, we have
to...
- Detect a musttail call in a variadic function before attempting to lower the
call's formal arguments. This is done in the IRTranslator.
- Compute forwarded registers in `lowerFormalArguments`, and add copies for
those registers.
- Restore the forwarded registers in `lowerTailCall`.
Because there doesn't seem to be any nice way to wrap these up into the outgoing
argument handler, the restore code in `lowerTailCall` is done separately.
Also, irritatingly, you have to make sure that the registers don't overlap with
any passed parameters. Otherwise, the scheduler doesn't know what to do with the
extra copies and asserts.
Add call-translator-variadic-musttail.ll to test this. This is pretty much the
same as the X86 musttail-varargs.ll test. We didn't have as nice of a test to
base this off of, but the idea is the same.
Differential Revision: https://reviews.llvm.org/D68043
llvm-svn: 373226
|
|
|
|
|
|
| |
We should be disabling inline for minsize, not optsize.
llvm-svn: 373143
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We need to propagate this information from the IR in order to be able to safely
do tail call optimizations on the intrinsics during legalization. Assuming
it's safe to do tail call opt without checking for the marker isn't safe because
the mem libcall may use allocas from the caller.
This adds an extra immediate operand to the end of the intrinsics and fixes the
legalizer to handle it.
Differential Revision: https://reviews.llvm.org/D68151
llvm-svn: 373140
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When checking for tail call eligibility, we should use the correct CCAssignFn
for each argument, rather than just checking if the caller/callee is varargs or
not.
This is important for tail call lowering with varargs. If we don't check it,
then basically any varargs callee with parameters cannot be tail called on
Darwin, for one thing. If the parameters are all guaranteed to be in registers,
this should be entirely safe.
On top of that, not checking for this could potentially make it so that we have
the wrong stack offsets when checking for tail call eligibility.
Also refactor some of the stuff for CCAssignFnForCall and pull it out into a
helper function.
Update call-translator-tail-call.ll to show that we can now correctly tail call
on Darwin. Also add two extra tail call checks. The first verifies that we still
respect the caller's stack size, and the second verifies that we still don't
tail call when a varargs function has a memory argument.
Differential Revision: https://reviews.llvm.org/D67939
llvm-svn: 372897
|
|
|
|
|
|
| |
s16.
llvm-svn: 372812
|
|
|
|
|
|
|
|
|
|
|
|
| |
unsigned.
We were miscompiling switch value comparisons with the wrong signedness, which
shows up when we have things like switch case values with i1 types, which end up
being legalized incorrectly.
Fixes PR43383
llvm-svn: 372675
|
|
|
|
|
|
| |
Simple continuation of existing selection support.
llvm-svn: 372467
|
|
|
|
|
|
| |
Just add an extra case to the existing selection logic.
llvm-svn: 372466
|
|
|
|
| |
llvm-svn: 372465
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We currently always set the HasCalls on MFI during translation and legalization if
we're handling a call or legalizing to a libcall. However, if that call is later
optimized to a tail call then we don't need the flag. The flag being set to true
causes frame lowering to always save and restore FP/LR, which adds unnecessary code.
This change does the same thing as SelectionDAG and ports over some code that scans
instructions after selection, using TargetInstrInfo to determine if target opcodes
are known calls.
Code size geomean improvements on CTMark:
-O0 : 0.1%
-Os : 0.3%
Differential Revision: https://reviews.llvm.org/D67868
llvm-svn: 372443
|
|
|
|
|
|
|
|
|
| |
This reverts r372314, reapplying r372285 and the commits which depend
on it (r372286-r372293, and r372296-r372297)
This was missing one switch to getTargetConstant in an untested case.
llvm-svn: 372338
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This broke the Chromium build, causing it to fail with e.g.
fatal error: error in backend: Cannot select: t362: v4i32 = X86ISD::VSHLI t392, Constant:i8<15>
See llvm-commits thread of r372285 for details.
This also reverts r372286, r372287, r372288, r372289, r372290, r372291,
r372292, r372293, r372296, and r372297, which seemed to depend on the
main commit.
> Encode them directly as an imm argument to G_INTRINSIC*.
>
> Since now intrinsics can now define what parameters are required to be
> immediates, avoid using registers for them. Intrinsics could
> potentially want a constant that isn't a legal register type. Also,
> since G_CONSTANT is subject to CSE and legalization, transforms could
> potentially obscure the value (and create extra work for the
> selector). The register bank of a G_CONSTANT is also meaningful, so
> this could throw off future folding and legalization logic for AMDGPU.
>
> This will be much more convenient to work with than needing to call
> getConstantVRegVal and checking if it may have failed for every
> constant intrinsic parameter. AMDGPU has quite a lot of intrinsics wth
> immarg operands, many of which need inspection during lowering. Having
> to find the value in a register is going to add a lot of boilerplate
> and waste compile time.
>
> SelectionDAG has always provided TargetConstant for constants which
> should not be legalized or materialized in a register. The distinction
> between Constant and TargetConstant was somewhat fuzzy, and there was
> no automatic way to force usage of TargetConstant for certain
> intrinsic parameters. They were both ultimately ConstantSDNode, and it
> was inconsistently used. It was quite easy to mis-select an
> instruction requiring an immediate. For SelectionDAG, start emitting
> TargetConstant for these arguments, and using timm to match them.
>
> Most of the work here is to cleanup target handling of constants. Some
> targets process intrinsics through intermediate custom nodes, which
> need to preserve TargetConstant usage to match the intrinsic
> expectation. Pattern inputs now need to distinguish whether a constant
> is merely compatible with an operand or whether it is mandatory.
>
> The GlobalISelEmitter needs to treat timm as a special case of a leaf
> node, simlar to MachineBasicBlock operands. This should also enable
> handling of patterns for some G_* instructions with immediates, like
> G_FENCE or G_EXTRACT.
>
> This does include a workaround for a crash in GlobalISelEmitter when
> ARM tries to uses "imm" in an output with a "timm" pattern source.
llvm-svn: 372314
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Encode them directly as an imm argument to G_INTRINSIC*.
Since now intrinsics can now define what parameters are required to be
immediates, avoid using registers for them. Intrinsics could
potentially want a constant that isn't a legal register type. Also,
since G_CONSTANT is subject to CSE and legalization, transforms could
potentially obscure the value (and create extra work for the
selector). The register bank of a G_CONSTANT is also meaningful, so
this could throw off future folding and legalization logic for AMDGPU.
This will be much more convenient to work with than needing to call
getConstantVRegVal and checking if it may have failed for every
constant intrinsic parameter. AMDGPU has quite a lot of intrinsics wth
immarg operands, many of which need inspection during lowering. Having
to find the value in a register is going to add a lot of boilerplate
and waste compile time.
SelectionDAG has always provided TargetConstant for constants which
should not be legalized or materialized in a register. The distinction
between Constant and TargetConstant was somewhat fuzzy, and there was
no automatic way to force usage of TargetConstant for certain
intrinsic parameters. They were both ultimately ConstantSDNode, and it
was inconsistently used. It was quite easy to mis-select an
instruction requiring an immediate. For SelectionDAG, start emitting
TargetConstant for these arguments, and using timm to match them.
Most of the work here is to cleanup target handling of constants. Some
targets process intrinsics through intermediate custom nodes, which
need to preserve TargetConstant usage to match the intrinsic
expectation. Pattern inputs now need to distinguish whether a constant
is merely compatible with an operand or whether it is mandatory.
The GlobalISelEmitter needs to treat timm as a special case of a leaf
node, simlar to MachineBasicBlock operands. This should also enable
handling of patterns for some G_* instructions with immediates, like
G_FENCE or G_EXTRACT.
This does include a workaround for a crash in GlobalISelEmitter when
ARM tries to uses "imm" in an output with a "timm" pattern source.
llvm-svn: 372285
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Since we now lower most tail calls, it makes sense to support musttail.
Instead of always falling back to SelectionDAG, only fall back when a musttail
call was not able to be emitted as a tail call. Once we can handle most
incoming and outgoing arguments, we can change this to a `report_fatal_error`
like in ISelLowering.
Remove the assert that we don't have varargs and a musttail, and replace it
with a return false. Implementing this requires that we implement
`saveVarArgRegisters` from AArch64ISelLowering, which is an entirely different
patch.
Add GlobalISel lines to vararg-tallcall.ll to make sure that we produce correct
code. Right now we only fall back, but eventually this will be relevant.
Differential Revision: https://reviews.llvm.org/D67681
llvm-svn: 372273
|