| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
| |
PowerPC hits an assertion due to somewhat the same reason as https://reviews.llvm.org/D70975.
Though there are already some hack, it still failed with some case, when the operand 0 is NOT
a const fp, it is another fma that with const fp. And that const fp is negated which result in multi-uses.
A better fix is to check the uses of the negated const fp. If there are already use of its negated
value, we will have benefit as no extra Node is added.
Differential revision: https://reviews.llvm.org/D75501
(cherry picked from commit 3906ae387f0775dfe4426e4336748269fafbd190)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently, the comparison argument used for ATOMIC_CMP_XCHG is legalised
with GetPromotedInteger, which leaves the upper bits of the value
undefind. Since this is used for comparing in an LR/SC loop with a
full-width comparison, we must sign extend it on RISC-V.
This is related to https://reviews.llvm.org/D58829, which solved the
issue for ATOMIC_CMP_SWAP_WITH_SUCCESS, but not the simpler
ATOMIC_CMP_SWAP.
This patch is a modified form of
616289ed29225c0ddfe5699c7fdf42a0fcbe0ab4 by Jessica Clarke. It localises
the changes to LegalizeIntegerTypes and avoids adding a new virtual
method to TargetLowering to avoid changing the ABI of libLLVM.so.
|
|
|
|
|
|
|
|
| |
RDF is designed to be target agnostic. Therefore it would be useful to have it available for other targets, such as X86.
Based on a previous patch by Krzysztof Parzyszek
Differential Revision: https://reviews.llvm.org/D75932
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: Likely fixes https://bugs.llvm.org/show_bug.cgi?id=45858.
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D80047
(cherry picked from commit fc937806efd71eb3803b35d0920bb7d76bc3040b)
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
BreakPHIEdge would be set based on whether the instruction needs to
insert a new critical edge to allow sinking into a block where the uses
are PHI nodes. But for instructions with multiple defs it would be reset
on the second def, allowing the instruciton to sink where it should not.
Fixes PR44981
Differential Revision: https://reviews.llvm.org/D78087
(cherry picked from commit 44c4ba34d001dcf538d7396007b5611d6f697f86)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In DAGCombiner::visitLOAD() we perform some checks before breaking up an indexed
load. However, we don't do the same checking in ForwardStoreValueToDirectLoad()
which can lead to failures later during combining
(see: https://bugs.llvm.org/show_bug.cgi?id=45301).
This patch just adds the same checks to this function as well.
Fixes: https://bugs.llvm.org/show_bug.cgi?id=45301
Differential revision: https://reviews.llvm.org/D76778
(cherry picked from commit 482141134729237072cb94248381dab96ce34374)
|
|
|
|
|
|
|
|
|
|
| |
There was already a test case for landingpads to handle this case, but I
had forgotten to consider PHI instructions preceding the EH_LABEL in the
landingpad.
PR45261
(cherry picked from commit e5bf5037d869c74bc2faf81fa1f58dfd827e8356)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
According to Joseph Myers, a libm maintainer
> They were only ever an ABI (selected by use of -ffinite-math-only or
> options implying it, which resulted in the headers using "asm" to redirect
> calls to some libm functions), not an API. The change means that ABI has
> turned into compat symbols (only available for existing binaries, not for
> anything newly linked, not included in static libm at all, not included in
> shared libm for future glibc ports such as RV32), so, yes, in any case
> where tools generate direct calls to those functions (rather than just
> following the "asm" annotations on function declarations in the headers),
> they need to stop doing so.
As a consequence, we should no longer assume these symbols are available on the
target system.
Still keep the TargetLibraryInfo for constant folding.
Differential Revision: https://reviews.llvm.org/D74712
(cherry picked from commit 6d15c4deab51498b70825fb6cefbbfe8f3d9bdcf)
For https://bugs.llvm.org/show_bug.cgi?id=45034
|
|
|
|
|
|
|
|
|
|
| |
This reverts https://reviews.llvm.org/D58468
(rL354676, 44037d7a6377ec8e5542cced73583283334b516b),
and all and any follow-ups to that code block.
https://bugs.llvm.org/show_bug.cgi?id=43446
(cherry picked from commit d20907d1de89bf63b589fadd8c096d4895e47fba)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
When using strict fp, it is required to update the
chain when performing integer type promotion of a
operand to a integer to floating point conversion.
Reviewers: craig.topper, john.brawn
Reviewed By: craig.topper
Subscribers: kristof.beyls, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D74597
(cherry picked from commit 8bc790f9e6a6fc6d8fe8f41a7120269366fa0957)
|
|
|
|
|
|
| |
By Clement Courbet!
Backported from rG15488ff24b4a
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This reverts commit ed29dbaafa49bb8c9039a35f768244c394411fea.
I'm backing out D68945, which as the discussion for D73526 shows, doesn't
seem to handle the -O0 path through the codegen backend correctly. I'll
reland the patch when a fix is worked out, apologies for all the churn.
The two parent commits are part of this revert too.
Conflicts:
llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
llvm/test/DebugInfo/X86/dbg-addr-dse.ll
SelectionDAGBuilder conflict is due to a nearby change in e39e2b4a79c6
that's technically unrelated. dbg-addr-dse.ll conflicted because
41206b61e30c (legitimately) changes the order of two lines.
There are further modifications to dbg-value-func-arg.ll: it landed after
the patch being reverted, and I've converted indirection to be represented
by the isIndirect field rather than DW_OP_deref.
(cherry picked from commit 6531a78ac4b5b229bce272706593a0bc873877d7)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This reverts commit 3137fe4d23eeb8df08c03e9111465325eeafe08e.
I'm backing out D68945, which this patch is a follow up for. It'll be
re-landed when D68945 is fixed.
The changes to dbg-value-func-arg.ll occur because our handling of certain
kinds of location now mixes up indirection that happens at different points
in a DIExpression. While this is a regression, it's a return to the prior
behaviour while a better patch is sought.
(cherry picked from commit ece761427f63de96ee52bbd6be1c61b07967a917)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch reverts part of r362750 / D62650, which stopped
LiveDebugVariables from trimming leading variable location ranges down
to only covering those instructions that are in scope. I've observed some
circumstances where the number of DBG_VALUEs in a function can be
amplified in an un-necessary way, to cover more instructions that are
out of scope, leading to very slow compile times. Trimming the range
of instructions that the variables cover solves the slow compile times.
The specific problem that r362750 tries to fix is addressed by the
assignment to RStart that I've added. Any variable location that begins
at the first instruction of a block will now be considered to begin at the
start of the block. While these sound the same, the have different
SlotIndexes, and the register allocator may shoehorn additional
instructions in between the two. The test added in the past
(wrong_debug_loc_after_regalloc.ll) still works with this modification.
live-debug-variables.ll has a range trimmed to not cover the prologue of
the function, while dbg-addr-dse.ll has a DBG_VALUE sink past one
instruction with no DebugLoc, which is expected behaviour.
Differential Revision: https://reviews.llvm.org/D73691
(cherry picked from commit 41206b61e30c3d84188cb17b91c2c0c800982dd1)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
ARM Type Promotion pass does not clear
the container that defines if one variable
was visited or not, missing optimization
opportunities by luck when two llvm:Values
from different functions are allocated at
the same memory address.
Also fixes a comment and uses existing
method to pop and obtain last element
of the worklist.
Reviewers: samparker
Reviewed By: samparker
Subscribers: kristof.beyls, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D73970
(cherry picked from commit 8ba2b6281075c65c1a47abed57810e1201942533)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Under MVE, we do not have any lowering for fminimum, which a
vector_reduce_fmin without NoNan will be expanded into. As with the
other recent patches, force this to expand in the pre-isel pass. Note
that Neon lowering would be OK because the scalar fminimum uses the
vector VMIN instruction, but is probably better to just rely on the
scalar operations, which is what is done here.
Also fixes what appears to be the reversal of INF vs -INF in the
vector_reduce_fmin widening code.
(cherry picked from commit 362d00e0510ee75750499e2993a782428e377215)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
For -fpatchable-function-entry=N,0 -mbranch-protection=bti, after
9a24488cb67a90f889529987275c5e411ce01dda, we place the NOP sled after
the initial BTI.
```
.Lfunc_begin0:
bti c
nop
nop
.section __patchable_function_entries,"awo",@progbits,f,unique,0
.p2align 3
.xword .Lfunc_begin0
```
This patch adds a label after the initial BTI and changes the __patchable_function_entries entry to reference the label:
```
.Lfunc_begin0:
bti c
.Lpatch0:
nop
nop
.section __patchable_function_entries,"awo",@progbits,f,unique,0
.p2align 3
.xword .Lpatch0
```
This placement is compatible with the resolution in
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92424 .
A local linkage function whose address is not taken does not need a BTI.
Placing the patch label after BTI has the advantage that code does not
need to differentiate whether the function has an initial BTI.
Reviewers: mrutland, nickdesaulniers, nsz, ostannard
Subscribers: kristof.beyls, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D73680
(cherry picked from commit 06b8e32d4fd3f634f793e3bc0bc4fdb885e7a3ac)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
... as well as:
Revert "[DWARF] Defer creating declaration DIEs until we prepare call site info"
This reverts commit fa4701e1979553c2df61698ac1ac212627630442.
This reverts commit 79daafc90308787b52a5d3a7586e82acd5e374b3.
There have been reports of this assert getting hit:
CalleeDIE && "Could not find DIE for call site entry origin
(cherry picked from commit 802bec896171997a7b73dde3857712e0eedeabc1)
|
|
|
|
|
|
|
|
|
|
|
| |
Symbols created for merged external global variables have default
visibility. This can break programs when compiling with -Oz
-fvisibility=hidden as symbols that should be hidden will be exported at
link time.
Differential Revision: https://reviews.llvm.org/D73235
(cherry picked from commit a2fb2c0ddca14c133f24d08af4a78b6a3d612ec6)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
MachineFunction::getPSVManager()""
Reland 7a8b0b1595e7dc878b48cf9bbaa652087a6895db, with a fix that checks
`!E.value().empty()` to avoid inserting a zero to SlotRemap.
Debugged by rnk@ in https://bugs.chromium.org/p/chromium/issues/detail?id=1045650#c33
Reviewed By: rnk
Differential Revision: https://reviews.llvm.org/D73510
(cherry picked from commit 68051c122440b556e88a946bce12bae58fcfccb4)
(cherry picked from commit c7c5da6df30141c563e1f5b8ddeabeecdd29e55e)
|
|
|
|
|
|
|
|
| |
Reviewed By: nickdesaulniers
Differential Revision: https://reviews.llvm.org/D73301
(cherry picked from commit 50a3ff30e1587235d1830fec9694c1239302ab9f)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
-fpatchable-function-entry=N,M where M>0
Similar to the function attribute `prefix` (prefix data),
"patchable-function-prefix" inserts data (M NOPs) before the function
entry label.
-fpatchable-function-entry=2,1 (1 NOP before entry, 1 NOP after entry)
will look like:
```
.type foo,@function
.Ltmp0: # @foo
nop
foo:
.Lfunc_begin0:
# optional `bti c` (AArch64 Branch Target Identification) or
# `endbr64` (Intel Indirect Branch Tracking)
nop
.section __patchable_function_entries,"awo",@progbits,get,unique,0
.p2align 3
.quad .Ltmp0
```
-fpatchable-function-entry=N,0 + -mbranch-protection=bti/-fcf-protection=branch has two reasonable
placements (https://gcc.gnu.org/ml/gcc-patches/2020-01/msg01185.html):
```
(a) (b)
func: func:
.Ltmp0: bti c
bti c .Ltmp0:
nop nop
```
(a) needs no additional code. If the consensus is to go for (b), we will
need more code in AArch64BranchTargets.cpp / X86IndirectBranchTracking.cpp .
Differential Revision: https://reviews.llvm.org/D73070
(cherry picked from commit 22467e259507f5ead2a87d989251b4c951a587e4)
|
|
|
|
|
|
|
|
| |
"patchable-function-entry"="0"
Add improve tests
(cherry picked from commit d232c215669cb57f5eb4ead40a4a336220dbc429)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
before addPreEmitPass()
This intention is to move patchable-function before aarch64-branch-targets
(configured in AArch64PassConfig::addPreEmitPass) so that we emit BTI before NOPs
(see https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92424).
This also allows addPreEmitPass() passes to know the precise instruction sizes if they want.
Tried x86-64 Debug/Release builds of ccls with -fxray-instrument -fxray-instruction-threshold=1.
No output difference with this commit and the previous commit.
(cherry picked from commit 9a24488cb67a90f889529987275c5e411ce01dda)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Without the BFI update, some hot blocks are incorrectly treated as cold code.
This fixes a FDO perf regression in the TSVC benchmark from D71288.
Reviewers: davidxl
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D73146
(cherry picked from commit ddbc728828c70728473b47c9f7427aa9514f3d17)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
MachineMemOperand
StackColoring::remapInstructions() remaps MachineOperand frame index (e.g. %stack.1 -> %stack.0)
but does not remap FixedStackPseudoSourceValue frame index (e.g. store 4 into %stack.1.ap2.i.i)
referenced by MachineMemoryOperand.
This can cause an assertion failure when LiveDebugValues references a dead stack object.
It is difficult to craft a test case. -g, va_copy and stack-coloring are required.
I can only reproduce it on ppc32.
(cherry picked from commit eaab1bf21e1d6803fd217fe6052537fc33b06837)
(cherry picked from commit 854f7be20a0cb1a95671a16d6cc8200107ee25f4)
(cherry picked from commit 7a8b0b1595e7dc878b48cf9bbaa652087a6895db)
|
|
|
|
| |
Fixes "pointer is null" clang static analyzer warning.
|
|
|
|
|
|
|
|
| |
ScheduleDAGMI. NFC
All the callers of this function will be ScheduleDAGMI from the
MachineScheduler. This allows us to use the extra info available in
ScheduleDAGMI without resorting to awkward casts.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
- `dead-mi-elimination` assumes MIR in the SSA form and cannot be
arranged after phi elimination or DeSSA. It's enhanced to handle the
dead register definition by skipping use check on it. Once a register
def is `dead`, all its uses, if any, should be `undef`.
- Re-arrange the DIE in RA phase for AMDGPU by placing it directly after
`detect-dead-lanes`.
- Many relevant tests are refined due to different register assignment.
Reviewers: rampitec, qcolombet, sunfish
Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D72709
|
|
|
|
|
|
| |
- Prefer `getVectorIdxTy()` as the index operand type for
`EXTRACT_SUBVECTOR` as targets expect different types by overloading
`getVectorIdxTy()`.
|
|
|
|
|
|
| |
This code is untested in tree because the "APFloat::semanticsPrecision(sem) >= SrcVT.getSizeInBits() - 1" check is false for most combinations for int and fp types except maybe i32 and f64. For that you would need i32 to be an illegal type, but f64 to be legal and have custom handling for legalizing the split sint_to_fp. The precision check itself was added in 2010 to fix a double rounding issue in the algorithm that would occur if the sint_to_fp was not able to do the conversion without rounding.
Differential Revision: https://reviews.llvm.org/D72728
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Mem op clustering adds a weak edge in the DAG between two loads or
stores that should be clustered, but the direction of this edge is
pretty arbitrary (it depends on the sort order of MemOpInfo, which
represents the operands of a load or store). This often means that two
loads or stores will get reordered even if they would naturally have
been scheduled together anyway, which leads to test case churn and goes
against the scheduler's "do no harm" philosophy.
The fix makes sure that the direction of the edge always matches the
original code order of the instructions.
Reviewers: atrick, MatzeB, arsenm, rampitec, t.p.northover
Subscribers: jvesely, wdng, nhaehnle, kristof.beyls, hiraditya, javed.absar, arphaman, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D72706
|
|
|
|
|
|
|
|
|
|
| |
SUMMARY:
In this patch we put the global variable in a Csect which's SectionKind is "ReadOnlyWithRel" into Data Section.
Reviewers: hubert.reinterpretcast,jasonliu,Xiangling_L
Subscribers: wuzish, nemanjai, hiraditya
Differential Revision: https://reviews.llvm.org/D72461
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Code in getRoot made the assumption that every node in PendingLoads
must always itself have a dependency on the current DAG root node.
After the changes in 04a8696, it turns out that this assumption no
longer holds true, causing wrong codegen in some cases (e.g. stores
after constrained FP intrinsics might get deleted).
To fix this, we now need to make sure that the TokenFactor created
by getRoot always includes the previous root, if there is no implicit
dependency already present.
The original getControlRoot code already has exactly this check,
so this patch simply reuses that code now for getRoot as well.
This fixes the regression.
NFC if no constrained FP intrinsic is present.
|
| |
|
| |
|
|
|
|
|
|
| |
and generic ISD::SHL handling.
As mentioned by @nikic on rGef5debac4302, we can merge the guaranteed bottom zero bits from the shifted value, and then, if a min shift amount is known, zero out the bottom bits as well.
|
|
|
|
|
|
|
|
| |
and generic ISD::SRL handling.
As mentioned by @nikic on rGef5debac4302 (although that was just about SHL), we can merge the guaranteed top zero bits from the shifted value, and then, if a min shift amount is known, zero out the top bits as well.
SHL tests / handling will be added in a follow up patch.
|
|
|
|
|
|
|
|
|
|
|
|
| |
We're planning to remove the shufflemask operand from ShuffleVectorInst
(D72467); fix GlobalISel so it doesn't depend on that Constant.
The change to prelegalizercombiner-shuffle-vector.mir happens because
the input contains a literal "-1" in the mask (so the parser/verifier
weren't really handling it properly). We now treat it as equivalent to
"undef" in all contexts.
Differential Revision: https://reviews.llvm.org/D72663
|
|
|
|
|
|
|
| |
return type for C++ member functions."
This reverts commit c958639098a8702b831952b1a1a677ae19190a55, which
causes a crash. See https://reviews.llvm.org/D70524 for details.
|
|
|
|
|
|
|
|
| |
STRICT_SINT_TO_FP/STRICT_UINT_TO_FP into a libcall.
Needed to support i128->fp128 on 32-bit X86.
Add full set of strict sint_to_fp/uint_to_fp conversion tests for fp128.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
be15dfa88fb1 broke GlobalISel's usage of getSetCCInverse() which currently
appears to be limited to our out-of-tree backend. GlobalISel doesn't use
EVT's and isn't able to derive them from the information it has as it
doesn't distinguish between integer and floating point types (that
distinction is made by operations rather than values). Bring back the
bool version of getSetCCInverse() in a way that doesn't break the intent
of be15dfa88fb1 but also allows GlobalISel to continue using it.
Reviewers: spatel, bogner, arichardson
Reviewed By: arichardson
Subscribers: rovka, hiraditya, Petar.Avramovic, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D72309
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch makes it so that cases where multiple instructions that differ only
in their FrameIndex MachineOperand values no longer collide. For instance:
%1:_(p0) = G_FRAME_INDEX %stack.0
%2:_(p0) = G_FRAME_INDEX %stack.1
Prior to this patch these instructions would collide together.
Differential Revision: https://reviews.llvm.org/D71583
|
|
|
|
|
|
| |
for ISD::SHL support
Allows us to handle non-uniform SHL shifts to determine the minimum number of sign bits remaining (based off the maximum shift amount value)
|
|
|
|
|
|
|
|
|
| |
STRICT_SINT_TO_FP/STRICT_UINT_TO_FP
Some target like arm/riscv with soft-float will have compiling crash when using -fno-unsafe-math-optimization option.
This patch will add the missing strict FP support to SoftenFloatRes_XINT_TO_FP.
Differential Revision: https://reviews.llvm.org/D72277
|
|
|
|
|
|
| |
ISD::SRA support
Allows us to handle more non-uniform SRA sign bits cases
|
| |
|
|
|
|
|
|
| |
shift opcodes
getValidShiftAmountConstant handles out of bounds shift amounts for us, allowing us to remove the local handling.
|
|
|
|
| |
getValidShiftAmountConstant/getValidMinimumShiftAmountConstant()
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We need to ensure that fpexcept.strict nodes are not optimized away even if
the result is unused. To do that, we need to chain them into the block's
terminator nodes, like already done for PendingExcepts.
This patch adds two new lists of pending chains, PendingConstrainedFP and
PendingConstrainedFPStrict to hold constrained FP intrinsic nodes without
and with fpexcept.strict markers. This allows not only to solve the above
problem, but also to relax chains a bit further by no longer flushing all
FP nodes before a store or other memory access. (They are still flushed
before nodes with other side effects.)
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D72341
|