| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
|
|
|
|
|
| |
This reverts commit 9a0b5e14075a1f42a72eedb66fd4fde7985d37ac.
This seems to break buildbots.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Sample profile loader of AutoFDO tries to replay previous inlining using context sensitive profile. The replay only repeats inlining if the call site block is hot. As a result it punts inlining of small functions, some of which can be beneficial for size, and will still be inlined by CSGCC inliner later. The oscillation between sample profile loader's inlining and regular CGSSC inlining cause unnecessary loss of context-sensitive profile. It doesn't have much impact for inline decision itself, but it negatively affects post-inline profile quality as CGSCC inliner have to scale counts which is not as accurate as the original context sensitive profile, and bad post-inline profile can misguide code layout.
This change added regular Inline Cost calculation for sample profile loader, so we can inline small functions upfront under switch -sample-profile-inline-size. In addition -sample-profile-cold-inline-threshold is added so we can tune the separate size threshold - currently the default is chosen to be the same as regular inliner's cold call-site threshold.
Reviewers: wmi, davidxl
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D70750
|
|
|
|
|
|
|
|
|
|
|
| |
GCC says:
.../llvm/lib/DebugInfo/GSYM/FunctionInfo.cpp:195:12:
error: ‘InfoType’ is not a class, namespace, or enumeration
case InfoType::EndOfList:
^
Presumably, GCC thinks InfoType is a variable here. Work around it by
using the name IT as is done above.
|
|
|
|
|
| |
This reverts commit db5739658467e20a52f20e769d3580412e13ff87.
At least 1 of these supposedly NFC commits wasn't - sanitizer bot is angry.
|
|
|
|
|
| |
This reverts commit 7250ef3613cc6b81145b9543bafb86d7f9466cde.
At least 1 of these supposedly NFC commits wasn't - sanitizer bot is angry.
|
|
|
|
|
| |
This reverts commit 8bf8ef7116bd0daec570b35480ca969b74e66c6e.
At least 1 of these supposedly NFC commits wasn't - sanitizer bot is angry.
|
|
|
|
|
|
|
|
| |
This reverts commit a7d90af1be48234ce583e00fb16e33633d44ae38.
This revision came back as the root-cause for crashes in internal
ARM-IOS apps.
Reproducer in https://bugs.llvm.org/show_bug.cgi?id=44231.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is a follow-up to D70607 where we made any
extract element on SLM more costly than default. But that is
pessimistic for extract from element 0 because that corresponds
to x86 movd/movq instructions. These generally have >1 cycle
latency, but they are probably implemented as single uop
instructions.
Note that no vectorization tests are affected by this change.
Also, no targets besides SLM are affected because those are
falling through to the default cost of 1 anyway. But this will
become visible/important if we add more specializations via cost
tables.
Differential Revision: https://reviews.llvm.org/D71023
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Split off of D67120.
Add the profile guided size optimization instrumentation / queries in the code
gen or target passes. This doesn't enable the size optimizations in those passes
yet as they are currently disabled in shouldOptimizeForSize (for non-IR pass
queries).
Reviewers: davidxl
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D71072
|
| |
|
|
|
|
| |
CreateIntCast returns the input if its type matches, so need to duplicate that check.
|
| |
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Current tail duplication integrated in bb layout is designed to increase the fallthrough from a BB's predecessor to its successor, but we have observed cases that duplication doesn't increase fallthrough, or it brings too much size overhead.
To overcome these two issues in function canTailDuplicateUnplacedPreds I add two checks:
make sure there is at least one duplication in current work set.
the number of duplication should not exceed the number of successors.
The modification in hasBetterLayoutPredecessor fixes a bug that potential predecessor must be at the bottom of a chain.
Differential Revision: https://reviews.llvm.org/D64376
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
any data into sections
SUMMARY:
if the size of Csect is zero, the Csect do not need write any data into sections
for example, the TOC Csect has zero size, it do not need invoke a
Asm.writeSectionData(W.OS, Csect.MCCsect, Layout);
Reviewers: daltenty
Subscribers: rupprecht, seiyai,hiraditya
Differential Revision: https://reviews.llvm.org/D71120
|
|
|
|
|
|
|
|
|
|
|
|
| |
SUMMARY:
There is warning when compile the file XCOFFObjectWriter.cpp
/srv/llvm-buildbot-srcatch/llvm-build-dir/openmp-gcc-x86_64-linux-debian/llvm.src/llvm/lib/MC/XCOFFObjectWriter.cpp:414:17: warning: 'char* strncpy(char*, const char*, size_t)' specified bound 8 equals destination size [-Wstringop-truncation]
The patch fixed the warning.
Reviewer: daltenty
Differential Revision: https://reviews.llvm.org/D71119
|
|
|
|
| |
This fixes a test failure in test/CodeGen/ARM/fp-intrinsics.ll.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
The immediate forms of the MVE VQSHL instruction have MC names like
`MVE_VSLIimms8` and `MVE_VSLIimmu32`. Those names are confusing,
because VSLI is a completely different shift instruction with no
semantic relation to VQSHL. But it just happens to be defined
immediately before VQSHL in `ARMInstrMVE.td`, so this looks like a
copy-paste error. Renamed the ids to match the instruction name.
Reviewers: ostannard, dmgreen, MarkMurrayARM, miyuki
Reviewed By: miyuki
Subscribers: kristof.beyls, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D71114
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
When trying to calculate the offsets for the jump table entries
we fail to take into account the block alignment, which could be
greater than 4 bytes. This led to cases where the jump table
offset was too big to fit in a byte.
Reviewers: t.p.northover, sdesmalen, ostannard
Reviewed By: ostannard
Subscribers: ostannard, kristof.beyls, hiraditya, llvm-commits
Committed on behalf of David Sherwood (david-arm)
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D70533
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
InnerLoopVectorizer's code called during VPlan execution still relies on
original IR's def-use relations to decide which vector code to generate,
limiting VPlan transformations ability to modify def-use relations and still
have ILV generate the vector code.
This commit moves GEP operand queries controlling how GEPs are widened to a
dedicated recipe and extracts GEP widening code to its own ILV method taking
those recorded decisions as arguments. This reduces ingredient def-use usage by
ILV as a step towards full VPlan-based def-use relations.
Differential revision: https://reviews.llvm.org/D69067
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Adds the following intrinsics:
* whilege, whilegt, whilehi, whilehs
Reviewers: sdesmalen, rovka, dancgr, efriedma, rengolin, huntergr
Reviewed By: sdesmalen
Subscribers: tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D70909
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
One of CodeGenPrepare's optimizations is to duplicate address calculations
into basic blocks, so that as much information as possible can be folded
into memory addressing operands. This is great -- but the dbg.value
variable location intrinsics are not updated in the same way. This can lead
to dbg.values referring to address computations in other blocks that will
never be encoded into the DAG, while duplicate address computations are
performed locally that could be used by the dbg.value. Some of these (such
as non-constant-offset GEPs) can't be salvaged past.
Fix this by, whenever we duplicate an address computation into a block,
looking for dbg.value users of the original memory address in the same
block, and redirecting those to the local computation.
Differential Revision: https://reviews.llvm.org/D58403
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Adds intrinsics for the following:
* cmphs, cmphi
* cmpge, cmpgt
* cmpeq, cmpne
* cmplt, cmple
* cmplo, cmpls
Includes a minor change to `TLI.getMemValueType` that fixes a crash due to the
scalable flag being dropped.
Reviewers: sdesmalen, efriedma, rengolin, rovka, dancgr, huntergr
Reviewed By: efriedma
Subscribers: tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D70889
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch implements the following changes:
1) SelectionDAGBuilder::visitConstrainedFPIntrinsic currently treats
each constrained intrinsic like a global barrier (e.g. a function call)
and fully serializes all pending chains. This is actually not required;
it is allowed for constrained intrinsics to be reordered w.r.t one
another or (nonvolatile) memory accesses. The MI-level scheduler already
allows for that flexibility, so it makes sense to allow it at the DAG
level as well.
This patch therefore changes the way chains for constrained intrisincs
are created, and handles them basically like load operations are handled.
This has the effect that constrained intrinsics are no longer serialized
against one another or (nonvolatile) loads. They are still serialized
against stores, but that seems hard to change with the current DAG chain
setup, and it also doesn't seem to be a big problem preventing DAG
2) The OPC_CheckFoldableChainNode check requires that each of the
intermediate nodes in a multi-node pattern match only has a single use.
This check tends to fail if those intermediate nodes are strict operations
as those have a chain output that typically indeed has another use.
However, we don't really need to consider chains here at all, since they
will all be rewritten anyway by UpdateChains later. Other parts of the
matcher therefore already ignore chains, but this hasOneUse check doesn't.
This patch replaces hasOneUse by a custom test that verifies there is no
more than one use of any non-chain output value.
In theory, this change could affect code unrelated to strict FP nodes,
but at least on SystemZ I could not find any single instance of that
happening
3) The SystemZ back-end currently does not allow matching multiply-and-
extend operations (32x32 -> 64bit or 64x64 -> 128bit FP multiply) for
strict FP operations. This was not possible in the past due to the
problems described under 1) and 2) above.
With those issues fixed, it is now possible to fully support those
instructions in strict mode as well, and this patch does so.
Differential Revision: https://reviews.llvm.org/D70913
|
|
|
|
|
|
|
| |
That refactoring moves NonRelocatableStringpool into common CodeGen folder.
So that NonRelocatableStringpool could be used not only inside dsymutil.
Differential Revision: https://reviews.llvm.org/D71068
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In general ValueHandleBase::ValueIsRAUWd shouldn't be called when not
all uses of the value were actually replaced, though, currently
formLCSSAForInstructions calls it when it inserts LCSSA-phis.
Calls of ValueHandleBase::ValueIsRAUWd were added to LCSSA specifically
to update/invalidate SCEV. In the best case these calls duplicate some
of the work already done by SE->forgetValue, though in case when SCEV of
the value is SCEVUnknown, SCEV replaces the underlying value of
SCEVUnknown with the new value (i.e. acts like LCSSA-phi actually fully
replaces the value it is created for), which leads to SCEV being
corrupted because LCSSA-phi rarely dominates all uses of its inputs.
Fixes bug https://bugs.llvm.org/show_bug.cgi?id=44058.
Reviewers: fhahn, efriedma, reames, sanjoy.google
Reviewed By: fhahn
Subscribers: hiraditya, javed.absar, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D70593
|
| |
|
|
|
|
|
|
|
|
| |
This is for the case where -gmlt -gsplit-dwarf -fsplit-dwarf-inlining
are used together in some but not all units during LTO (or, in the
reduced case, even without LTO) - ensuring that no split dwarf is used
(because split-dwarf-inlining puts the same data in the .o file, so
there's no need to duplicate it into the .dwo file)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
ConstantFoldInsertElementInstruction.
Summary:
Should not constant fold insertelement instruction for scalable vector type.
Reviewers: huntergr, sdesmalen, spatel, levedev.ri, apazos, efriedma, willlovett
Reviewed By: efriedma, spatel
Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D70985
|
|
|
|
|
|
|
|
|
|
| |
explicitly return the chain instead of calling getValue on the single SDValue.
We shouldn't assume that the returned result can be used to get
the other result.
This is prep-work for strict FP where we will also need to pass
the chain result along in more cases.
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D68757
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
GsymReader classes.
Summary:
Lookup functions are designed to not fully decode a FunctionInfo, LineTable or InlineInfo, they decode only what is needed into a LookupResult object. This allows lookups to avoid costly memory allocations and avoid parsing large amounts of information one a suitable match is found.
LookupResult objects contain the address that was looked up, the concrete function address range, the name of the concrete function, and a list of source locations. One for each inline function, and one for the concrete function. This allows one address to turn into multiple frames and improves the signal you get when symbolicating addresses in GSYM files.
Reviewers: labath, aprantl
Subscribers: mgorny, hiraditya, llvm-commits, lldb-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D70993
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Add an option to allow the attribute propagation on the index to be
disabled, to allow a workaround for issues (such as that fixed by
D70977).
Also move the setting of the WithAttributePropagation flag on the index
into propagateAttributes(), and remove some old stale code that predated
this flag and cleared the maybe read/write only bits when we need to
disable the propagation (previously only when importing disabled, now
also when the new option disables it).
Reviewers: evgeny777, steven_wu
Subscribers: mehdi_amini, inglorion, hiraditya, dexonsmith, arphaman, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D70984
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Context *
During register coalescing, we use rematerialization when coalescing is not
possible. That means we may rematerialize a super register when only a smaller
register is actually used.
E.g.,
0B v1 = ldimm 0xFF
1B v2 = COPY v1.low8bits
2B = v2
=>
0B v1 = ldimm 0xFF
1B v2 = ldimm 0xFF
2B = v2.low8bits
Where xB are the slot indexes.
Here v2 grew from a 8-bit register to a 16-bit register.
When that happens and subregister liveness is enabled, we create subranges for
the newly created value.
E.g., before remat, the live range of v2 looked like:
main range: [1r, 2r)
(Reads v2 is defined at index 1 slot register and used before the slot register
of index 2)
After remat, it should look like:
main range: [1r, 2r)
low 8 bits: [1r, 2r)
high 8 bits: [1r, 1d) <-- dead def
I.e., the unsused lanes of v2 should be marked as dead definition.
* The Problem *
Prior to this patch, the live-ranges from the previous exampel, would have the
full live-range for all subranges:
main range: [1r, 2r)
low 8 bits: [1r, 2r)
high 8 bits: [1r, 2r) <-- too long
* The Fix *
Technically, the code that this patch changes is not wrong:
When we create the subranges for the newly rematerialized value, we create only
one subrange for the whole bit mask.
In other words, at this point v2 live-range looks like this:
main range: [1r, 2r)
low & high: [1r, 2r)
Then, it gets wrong when we call LiveInterval::refineSubRanges on low 8 bits:
main range: [1r, 2r)
low 8 bits: [1r, 2r)
high 8 bits: [1r, 2r) <-- too long
Ideally, we would like LiveInterval::refineSubRanges to be able to do the right
thing and mark the dead lanes as such. However, this is not possible, because by
the time we update / refine the live ranges, the IR hasn't been updated yet,
therefore we actually don't have enough information to do the right thing.
Another option to fix the problem would have been to call
LiveIntervals::shrinkToUses after the IR is updated. This is not desirable as
this may have a noticeable impact on compile time.
Instead, what this patch does is when we create the subranges for the
rematerialized value, we explicitly create one subrange for the lanes that were
used before rematerialization and one for the lanes that were not used. The used
one inherits the live range of the main range and the unused one is just created
empty. The existing rematerialization code then detects that the unused one are
not live and it correctly sets dead def intervals for them.
https://llvm.org/PR41372
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
AutoFDO's sample profile loader processes function in arbitrary source code order, so if I change the order of two functions in source code, the inline decision can change. This also prevented the use of context-sensitive profile to do specialization while inlining. This commit enforces SCC top-down order for sample profile loader. With this change, we can now do specialization, as illustrated by the added test case:
Say if we have A->B->C and D->B->C call path, we want to inline C into B when root inliner is B, but not when root inliner is A or D, this is not possible without enforcing top-down order. E.g. Once C is inlined into B, A and D can only choose to inline (B->C) as a whole or nothing, but what we want is only inline B into A and D, not its recursive callee C. If we process functions in top-down order, this is no longer a problem, which is what this commit is doing.
This change is guarded with a new switch "-sample-profile-top-down-load" for tuning, and it depends on D70653. Eventually, top-down can be the default order for sample profile loader.
Reviewers: wmi, davidxl
Subscribers: hiraditya, llvm-commits, tejohnson
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D70655
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
outlined function
Summary:
When sample profile loader decides not to inline a previously inlined call-site, we adjust the profile of outlined function simply by scaling up its profile counts by call-site count. This means the context-sensitive profile of that inlined instance will be thrown away. This commit try to keep context-sensitive profile for such cases:
- Instead of scaling outlined function's profile, we now properly merge the FunctionSamples of inlined instance into outlined function, including all recursively inlined profile.
- Instead of adjusting the profile for negative inline decision at the end of the sample profile loader pass, we do the profile merge right after processing each function. This change paired with top-down ordering of annotation/inline-replay (a separate diff) will make sure we recursively merge profile back before the profile is used for annotation and inline replay.
A new switch -sample-profile-merge-inlinee is added to enable the new profile merge for tuning. It should be the default behavior eventually.
Reviewers: wmi, davidxl
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D70653
|
|
|
|
|
|
| |
The loclists_table_base was being overwritten for each CU even though
only one loclists contribution is made so everything but the last CU
would have a label that was never defined and fail to assemble.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: Previously we only handled the case where the csect hadn't been set up yet, so we'd hit an assert later on.
Reviewers: jasonliu, DiggerLin, stevewan
Reviewed By: jasonliu
Subscribers: hubert.reinterpretcast, wuzish, nemanjai, hiraditya, kbarton, jsji, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D71032
|
|
|
|
|
|
|
| |
The commit causes a failure:
http://lab.llvm.org:8011/builders/llvm-clang-x86_64-expensive-checks-win/builds/20911
This reverts commit 1847fd9d85506ecee692230cb2500e3774ec628e.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Emit a value debug intrinsic (with OP_deref) when an alloca address is
passed to a function call after going through a bitcast.
This generates an FP or SP-relative location for the local variable in
the following case:
int x;
use((void *)&x;
Reviewers: aprantl, vsk, pcc
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D70752
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Previously, it was not possible to skip running the localizer pass
conditionally. This patch adds an input function to the pass which
decides if the pass should run on the given MachineFunction or not.
No test case as there is no upstream target needs this functionality.
Reviewers: qcolombet
Reviewed By: qcolombet
Subscribers: rovka, hiraditya, Petar.Avramovic, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D71038
|
|
|
|
|
|
| |
single feature flag covers the two places they were used.
Differential Revision: https://reviews.llvm.org/D71048
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This reverts commit c3b06d0c393e533eab712922911d14e5a079fa5d.
Reason for revert: Caused miscompiles when inserting assume for undef.
Also adds a test to prevent similar breakage in future.
Fixes PR44154.
Reviewers: rnk, jdoerfert, efriedma, xbolva00
Reviewed By: rnk
Subscribers: thakis, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D70933
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
(to `sub A, zext B -> add A, sext B`)
Summary:
D68408 proposes to greatly improve our negation sinking abilities.
But in current canonicalization, we produce `sub A, zext(B)`,
which we will consider non-canonical and try to sink that negation,
undoing the existing canonicalization.
So unless we explicitly stop producing previous canonicalization,
we will have two conflicting folds, and will end up endlessly looping.
This inverts canonicalization, and adds back the obvious fold
that we'd miss:
* `sub [nsw] Op0, sext/zext (bool Y) -> add [nsw] Op0, zext/sext (bool Y)`
https://rise4fun.com/Alive/xx4
* `sext(bool) + C -> bool ? C - 1 : C`
https://rise4fun.com/Alive/fBl
It is obvious that `@ossfuzz_9880()` / `@lshr_out_of_range()`/`@ashr_out_of_range()`
(oss-fuzz 4871) are no longer folded as much, though those aren't really worrying.
Reviewers: spatel, efriedma, t.p.northover, hfinkel
Reviewed By: spatel
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D71064
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
When MUL is the first operand to SUB, we can't use MLS because the accumulator
should be negated. Emit a NEG of the accumulator and an MLA instead, similar to
what we do for FMUL / FSUB fusing.
Reviewers: dmgreen, SjoerdMeijer, fhahn, Gerolf, mstorsjo, asbirlea
Reviewed By: asbirlea
Subscribers: kristof.beyls, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D71067
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The patch makes sure that the LastThrowing pointer does not point to any instruction deleted by call to DeleteDeadInstruction.
While iterating through the instructions the pass maintains a pointer to the lastThrowing Instruction. A call to deleteDeadInstruction deletes a dead store and other instructions feeding the original dead instruction which also become dead. The instruction pointed by the lastThrowing pointer could also be deleted by the call to DeleteDeadInstruction and thus it becomes a dangling pointer. Because of this, we see an error in the next iteration.
In the patch, we maintain a list of throwing instructions encountered previously and use the last non deleted throwing instruction from the container.
Patch by Ankit <quic_aankit@quicinc.com>
Reviewers: fhahn, bcahoon, efriedma
Reviewed By: fhahn
Differential Revision: https://reviews.llvm.org/D65326
|
|
|
|
| |
Select doesn't change values, so truncate of extended operand cancels out.
|
|
|
|
|
|
|
|
| |
This makes no difference currently because we don't apply FMF
to FP casts, but that may change.
This could also be a place to add a fold for select with fptrunc,
so it will make that patch easier/smaller.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch addresses a performance problem reported in PR43855, and
present in the reapplication in in 001574938e5. It turns out that
MachineSink will (often) move instructions to the first block that
post-dominates the current block, and then try to sink further. This
means if we have a lot of conditionals, we can needlessly create large
numbers of DBG_VALUEs, one in each block the sunk instruction passes
through.
To fix this, rather than immediately sinking DBG_VALUEs, record them in
a pass structure. When sinking is complete and instructions won't be
sunk any further, new DBG_VALUEs are added, avoiding lots of
intermediate DBG_VALUE $noregs being created.
Differential revision: https://reviews.llvm.org/D70676
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fix part of PR43855, resolving a problem that comes from the reapplication
in 001574938e5. If we have two DBG_VALUE insts in a block that specify
the location of the same variable, for example:
%0 = someinst
DBG_VALUE %0, !123, !DIExpression()
%1 = anotherinst
DBG_VALUE %1, !123, !DIExpression()
if %0 were to sink, the corresponding DBG_VALUE would sink too, past the
next DBG_VALUE, effectively re-ordering assignments. To fix this, I've
added a SeenDbgVars set recording what variable locations have been seen in
a block already (working bottom up), and now flag DBG_VALUEs that would
pass a later DBG_VALUE for the same variable.
NB, this only works for repeated DBG_VALUEs in the same basic block, the
general case involving control flow is much harder, which I've written
up in PR44117.
Differential revision: https://reviews.llvm.org/D70672
|