| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
|
|
|
|
|
|
|
|
|
|
| |
Current implementation of BaseMemOpsClusterMutation is a little bit
obscure. This patch directly uses a map from store chain ID to set of
memory instrs to make it simpler, so that future improvements are easier
to read, update and review.
Reviewed By: evandro
Differential Revision: https://reviews.llvm.org/D72070
|
|
|
|
|
|
|
|
|
|
|
| |
STRICT_FSETCC/STRICT_FSETCCS nodes.
This causes the STRICT_FSETCC/STRICT_FSETCCS nodes to lowered
early while lowering SELECT, but the output chain doesn't get
connected. Then we visit the node again when it is its turn
because we haven't replaced the use of the chain result. In the
case of the fp128 libcall lowering, after D72341 this will cause
the libcall to be emitted twice.
|
|
|
|
|
|
| |
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D72436
|
|
|
|
|
|
|
| |
legalization.
The lo and hi computation are independent. Give them the same input
chain and TokenFactor the results together.
|
|
|
|
| |
TargetLowering::expandFP_TO_UINT and X86TargetLowering::FP_TO_INTHelper.
|
|
|
|
|
|
| |
Custom legalization can produce MERGE_VALUES to return multiple
results. We can expand them immediately instead of leaving them
around for DAG combine to clean up.
|
|
|
|
|
|
|
|
|
| |
In x86Disassembler{OneByte,TwoByte,...}Codes,
"/* EmptyTable */" is very common. Omitting it saves lots of space.
Also, there is no need to display a table entry in multiple lines.
It is also common that the whole OpcodeDecision is { MODRM_ONEENTRY, 0}.
Make use of zero-initialization.
|
| |
|
|
|
|
|
|
|
|
|
|
| |
The argument is llvm::null() everywhere except llvm::errs() in
llvm-objdump in -DLLVM_ENABLE_ASSERTIONS=On builds. It is used by no
target but X86 in -DLLVM_ENABLE_ASSERTIONS=On builds.
If we ever have the needs to add verbose log to disassemblers, we can
record log with a member function, instead of passing it around as an
argument.
|
|
|
|
|
|
|
|
| |
This fixes an off-by-one error in the argc value computed by runAsMain, and
switches lli back to using the input bitcode (rather than the string "lli") as
the effective program name.
Thanks to Stefan Graenitz for spotting the bug.
|
|
|
|
|
|
|
|
|
| |
CrashRecoveryContext fails
This patch allows for handling a failure inside a CrashRecoveryContext in the same way as the global exception/signal handler. A failure will have the same side-effect, such as cleanup of temporarty file, printing callstack, calling relevant signal handlers, and finally returning an exception code. This is an optional feature, disabled by default.
This is a support patch for D69825.
Differential Revision: https://reviews.llvm.org/D70568
|
|
|
|
|
|
|
| |
llvm-objdump -d on clang is decreased from 7.8s to 7.4s.
The improvement is likely due to the elimination of logger setup and
dbgprintf(), which has a large overhead.
|
|
|
|
|
|
|
|
|
|
|
| |
vector to a couple. NFCI
Some of the simplest handlers just call TLI and if that fails,
they fall back to unrolling. For those just inline the TLI call
and share the unrolling call with the default case of Expand.
For ExpandFSUB and ExpandBITREVERSE so that its obvious they
don't return results sometimes and want to defer to LegalizeDAG.
|
|
|
|
|
|
|
| |
and Promote* methods.
All the Expand* and Promote* function assume they are being
called with result 0 anyway. Just hardcode result 0 into them.
|
|
|
|
| |
llvm-objdump -d on clang is decreased from 8.2s to 7.8s.
|
|
|
|
| |
during PreprocessISelDAG to remove some duplicate isel patterns.
|
|
|
|
| |
This reflects the recent changes done.
|
|
|
|
|
|
| |
Patch by Nicolas Capens. Thanks Nicolas!
https://reviews.llvm.org/D65015
|
|
|
|
|
|
|
|
| |
patch"
This reverts commit 4c48ea68e491cb42f1b5d43ffba89f6a7f0dadc4.
The powerpc buildbots had an internal compiler error after this patch.
This requires some inspection.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
The analysis for const-ness of local variables required a view generally useful
matchers that are extracted into its own patch.
They are `decompositionDecl` and `forEachArgumentWithParamType`, that works
for calls through function pointers as well.
Reviewers: aaron.ballman
Reviewed By: aaron.ballman
Subscribers: cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D72505
|
|
|
|
|
|
|
|
|
|
| |
properly value-typed.
Summary: These were temporary methods used to simplify the transition.
Reviewed By: antiagainst
Differential Revision: https://reviews.llvm.org/D72548
|
|
|
|
|
|
| |
The primary motivation of this change is to bring the code more closely in sync behavior wise with the assembler's version of nop emission. I'd like to eventually factor them into one, but that's hard to do when one has features the other doesn't.
The longest encodeable nop on x86 is 15 bytes, but many processors - for instance all intel chips - can't decode the 15 byte form efficiently. On those processors, it's better to use either a 10 byte or 11 byte sequence depending.
|
| |
|
| |
|
|
|
|
| |
Shadow variable names meant we were referencing the Buffer input argument, not the GlobalModuleIndex member that its std::move()'d it.
|
|
|
|
| |
Use castAs<> instead of getAs<> since the pointers are dereferenced immediately and castAs will perform the null assertion for us.
|
|
|
|
| |
Use cast<> instead of dyn_cast<> since we know that the pointer should be valid (and is dereferenced immediately).
|
|
|
|
| |
Use castAs<> instead of getAs<> since the pointer is dereferenced immediately within mangleCallingConvention and castAs will perform the null assertion for us.
|
|
|
|
| |
Use castAs<> instead of getAs<> since the pointer is dereferenced immediately below and castAs will perform the null assertion for us.
|
|
|
|
| |
Use castAs<> instead of getAs<> since the pointer is dereferenced immediately below and castAs will perform the null assertion for us.
|
| |
|
|
|
|
| |
Those do nothing but make the type no longer trivial to the compiler.
|
|
|
|
| |
Fixes static-analyzer warnings.
|
|
|
|
| |
The generic saturated math opcodes are no longer widened inside X86TargetLowering
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
No longer generate a diagnostic when a small trivially copyable type is
used without a reference. Before the test looked for a POD type and had no
size restriction. Since the range-based for loop is only available in
C++11 and POD types are trivially copyable in C++11 it's not required to
test for a POD type.
Since copying a large object will be expensive its size has been
restricted. 64 bytes is a common size of a cache line and if the object is
aligned the copy will be cheap. No performance impact testing has been
done.
Differential Revision: https://reviews.llvm.org/D72212
|
| |
|
|
|
|
|
|
|
|
| |
Add initial support for lowering v4f64 shuffles to SHUFPD(VPERM2F128(V1, V2), VPERM2F128(V1, V2)), eventually this could be used for v8f32 (and maybe v8f64/v16f32) but I'm being conservative for the initial implementation as only v4f64 can always succeed.
This currently is only called from lowerShuffleAsLanePermuteAndShuffle so only gets used for unary shuffles, and we limit this to cases where we use upper elements as otherwise concating 2 xmm shuffles is probably the better case.
Helps with poor shuffles mentioned in D66004.
|
|
|
|
| |
Additional test cases for D72524.
|
| |
|
|
|
|
| |
For D72420.
|
|
|
|
| |
For D72519.
|
|
|
|
|
|
|
|
|
| |
Fix https://bugs.llvm.org/show_bug.cgi?id=44419 by preserving the
nuw on sub of geps. We only do this if the offset has a multiplication
as the final operation, as we can't be sure the operations is nuw
in the other cases without more thorough analysis.
Differential Revision: https://reviews.llvm.org/D72048
|
|
|
|
| |
now that we don't mutate strict fp nodes. NFC
|
|
|
|
|
|
|
|
|
|
|
|
| |
Reviewers: dblaikie, aprantl, davide, JDevlieghere
Reviewed By: aprantl
Subscribers: jmorse, aprantl, merge_guards_bot, mgorny, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D72321
|
|
|
|
|
|
| |
For X87<->SSE conversions, the SSE type is always smaller than
the X87 type. So we can always use the smallest type for the
memory type.
|
|
|
|
|
|
|
|
|
|
|
| |
strict_fp_round into stack operations.
We use the stack for X87 fp_round and for moving from SSE f32/f64 to
X87 f64/f80. Or from X87 f64/f80 to SSE f32/f64.
Note for the SSE<->X87 conversions the conversion always happens in the
X87 domain. The load/store ops in the X87 instructions are able
to signal exceptions.
|
| |
|
|
|
|
| |
simplify some code. NFCI
|
|
|
|
|
| |
With plugins and examples enabled, this XPASSes. Mark it as unsupported until
the owner investigates what's going on.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
- SI Whole Quad Mode phase is replacing WQM pseudo instructions with v_mov instructions.
While this is necessary for the special handling of moving results out of WWM live ranges,
it is not necessary for WQM live ranges. The result is a v_mov from a register to itself after every
WQM operation. This change uses a COPY psuedo in these cases, which allows the register
allocator to coalesce the moves away.
Reviewers: tpr, dstuttard, foad, nhaehnle
Reviewed By: nhaehnle
Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D71386
|