| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
| |
It was accidentally committed.
llvm-svn: 282855
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Previously, when allocating unspillable live ranges, we would never
attempt to split. We would always bail out and try last ditch graph
recoloring.
This patch changes this by attempting to split all live intervals before
performing recoloring.
This fixes LLVM bug PR14879.
I can't add test cases for any backends other than AVR because none of
them have small enough register classes to trigger the bug.
Reviewers: qcolombet
Subscribers: MatzeB
Differential Revision: https://reviews.llvm.org/D25070
llvm-svn: 282852
|
|
|
|
| |
llvm-svn: 282761
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Instead of producing a mapping for all the operands, we only generate a
mapping for the definition. Indeed, the other operands are not
constrained by the instruction and thus, we should leave the choice to
the actual definition to do the right thing.
In pratice this is almost NFC, but with one advantage. We will have only
one instance of OperandsMapping for each copy and phi that map to one
register bank instead of one different instance for each different
number of operands for each copy and phi.
llvm-svn: 282756
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The VS debugger doesn't appear to understand the 0x68 or 0x69 type
indices, which were probably intended for use on a platform where a C
'int' is 8 bits. So, use the character types instead. Clang was already
using the character types because '[u]int8_t' is usually defined in
terms of 'char'.
See the Rust issue for screenshots of what VS does:
https://github.com/rust-lang/rust/issues/36646
Fixes PR30552
llvm-svn: 282739
|
|
|
|
|
|
| |
Should not be a functional but an aesthetic change.
llvm-svn: 282669
|
|
|
|
|
|
|
|
|
|
|
| |
This is a step toward statically allocate InstructionMapping. Like the
previous few commits, the goal is to move toward a TableGen'ed like
structure with no dynamic allocation at all.
This should already improve compile time by getting rid of a bunch of
memmove of SmallVectors.
llvm-svn: 282643
|
|
|
|
|
|
|
| |
This is a preparatory commit for more TableGen-like structure.
NFC
llvm-svn: 282642
|
|
|
|
|
|
|
|
| |
LiveDebugVariables doesn't propagate DBG_VALUEs accross basic block
boundaries any more; this functionality was split into LiveDebugValues.
We can thus drop the now dead references to LexicalScopes from LiveDebugVariables.
llvm-svn: 282638
|
|
|
|
|
|
|
|
|
|
| |
Normally, if conversion would add implicit uses for redefined registers,
e.g. R0<def> = add_if ..., R0<imp-use>. However, if only subregisters of
R0 are known to be live but not R0 itself, such implicit uses will not be
added, causing prior definitions of such subregisters and R0 itself to
become dead.
llvm-svn: 282626
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This addresses PR26055 LiveDebugValues is very slow.
Contrary to the old LiveDebugVariables pass LiveDebugValues currently
doesn't look at the lexical scopes before inserting a DBG_VALUE
intrinsic. This means that we often propagate DBG_VALUEs much further
down than necessary. This is especially noticeable in large C++
functions with many inlined method calls that all use the same
"this"-pointer.
For example, in the following code it makes no sense to propagate the
inlined variable a from the first inlined call to f() into any of the
subsequent basic blocks, because the variable will always be out of
scope:
void sink(int a);
void __attribute((always_inline)) f(int a) { sink(a); }
void foo(int i) {
f(i);
if (i)
f(i);
f(i);
}
This patch reuses the LexicalScopes infrastructure we have for
LiveDebugVariables to take this into account.
The effect on compile time and memory consumption is quite noticeable:
I tested a benchmark that is a large C++ source with an enormous
amount of inlined "this"-pointers that would previously eat >24GiB
(most of them for DBG_VALUE intrinsics) and whose compile time was
dominated by LiveDebugValues. With this patch applied the memory
consumption is 1GiB and 1.7% of the time is spent in LiveDebugValues.
https://reviews.llvm.org/D24994
Thanks to Daniel Berlin and Keith Walker for reviewing!
llvm-svn: 282611
|
|
|
|
| |
llvm-svn: 282608
|
|
|
|
|
|
|
|
| |
UseAA is enabled."
This reverts commit r282600 due to test failues with MCJIT
llvm-svn: 282604
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
enabled.
Simplify Consecutive Merge Store Candidate Search
Now that address aliasing is much less conservative, push through
simplified store merging search which only checks for parallel stores
through the chain subgraph. This is cleaner as the separation of
non-interfering loads/stores from the store-merging logic.
Whem merging stores, search up the chain through a single load, and
finds all possible stores by looking down from through a load and a
TokenFactor to all stores visited. This improves the quality of the
output SelectionDAG and generally the output CodeGen (with some
exceptions).
Additional Minor Changes:
1. Finishes removing unused AliasLoad code
2. Unifies the the chain aggregation in the merged stores across
code paths
3. Re-add the Store node to the worklist after calling
SimplifyDemandedBits.
4. Increase GatherAllAliasesMaxDepth from 6 to 18. That number is
arbitrary, but seemed sufficient to not cause regressions in
tests.
This finishes the change Matt Arsenault started in r246307 and
jyknight's original patch.
Many tests required some changes as memory operations are now
reorderable. Some tests relying on the order were changed to use
volatile memory operations
Noteworthy tests:
CodeGen/AArch64/argument-blocks.ll -
It's not entirely clear what the test_varargs_stackalign test is
supposed to be asserting, but the new code looks right.
CodeGen/AArch64/arm64-memset-inline.lli -
CodeGen/AArch64/arm64-stur.ll -
CodeGen/ARM/memset-inline.ll -
The backend now generates *worse* code due to store merging
succeeding, as we do do a 16-byte constant-zero store efficiently.
CodeGen/AArch64/merge-store.ll -
Improved, but there still seems to be an extraneous vector insert
from an element to itself?
CodeGen/PowerPC/ppc64-align-long-double.ll -
Worse code emitted in this case, due to the improved store->load
forwarding.
CodeGen/X86/dag-merge-fast-accesses.ll -
CodeGen/X86/MergeConsecutiveStores.ll -
CodeGen/X86/stores-merging.ll -
CodeGen/Mips/load-store-left-right.ll -
Restored correct merging of non-aligned stores
CodeGen/AMDGPU/promote-alloca-stored-pointer-value.ll -
Improved. Correctly merges buffer_store_dword calls
CodeGen/AMDGPU/si-triv-disjoint-mem-access.ll -
Improved. Sidesteps loading a stored value and merges two stores
CodeGen/X86/pr18023.ll -
This test has been removed, as it was asserting incorrect
behavior. Non-volatile stores *CAN* be moved past volatile loads,
and now are.
CodeGen/X86/vector-idiv.ll -
CodeGen/X86/vector-lzcnt-128.ll -
It's basically impossible to tell what these tests are actually
testing. But, looks like the code got better due to the memory
operations being recognized as non-aliasing.
CodeGen/X86/win32-eh.ll -
Both loads of the securitycookie are now merged.
CodeGen/AMDGPU/vgpr-spill-emergency-stack-slot-compute.ll -
This test appears to work but no longer exhibits the spill
behavior.
Reviewers: arsenm, hfinkel, tstellarAMD, nhaehnle, jyknight
Subscribers: wdng, nhaehnle, nemanjai, arsenm, weimingz, niravd, RKSimon, aemerson, qcolombet, resistor, tstellarAMD, t.p.northover, spatel
Differential Revision: https://reviews.llvm.org/D14834
llvm-svn: 282600
|
|
|
|
|
|
|
|
|
|
|
|
| |
This check currently doesn't seem to do anything useful on any in-tree target:
On non-x86, it always evaluates to false, so we never hit the code path that
creates the shuffle with zero.
On x86, it just forwards to isShuffleMaskLegal(), which is a reasonable thing to
query in general, but doesn't make sense if only restricted to zero blends.
Differential Revision: https://reviews.llvm.org/D24625
llvm-svn: 282567
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
The current implementation of isConstantPhysReg() checks for defs of
physical registers to determine if they are constant. Some
architectures (e.g. AArch64 XZR/WZR) have registers that are constant
and may be used as destinations to indicate the generated value is
discarded, preventing isConstantPhysReg() from returning true. This
change adds a TargetRegisterInfo hook that overrides the no defs check
for cases such as this.
Reviewers: MatzeB, qcolombet, t.p.northover, jmolloy
Subscribers: junbuml, aemerson, mcrosier, rengolin
Differential Revision: https://reviews.llvm.org/D24570
llvm-svn: 282543
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Variables are sometimes missing their debug location information in
blocks in which the variables should be available. This would occur
when one or more predecessor blocks had not yet been visited by the
routine which propagated the information from predecessor blocks.
This is addressed by only considering predecessor blocks which have
already been visited.
The solution to this problem was suggested by Daniel Berlin on the
LLVM developer mailing list.
Differential Revision: https://reviews.llvm.org/D24927
llvm-svn: 282506
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Many high-performance processors have a dedicated branch predictor for
indirect branches, commonly used with jump tables. As sophisticated as such
branch predictors are, they tend to have well defined limits beyond which
their effectiveness is hampered or even nullified. One such limit is the
number of possible destinations for a given indirect branches that such
branch predictors can handle.
This patch considers a limit that a target may set to the number of
destination addresses in a jump table.
Patch by: Evandro Menezes <e.menezes@samsung.com>, Aditya Kumar
<aditya.k7@samsung.com>, Sebastian Pop <s.pop@samsung.com>.
Differential revision: https://reviews.llvm.org/D21940
llvm-svn: 282412
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If a constant is unamed_addr and is only used within one function, we can save
on the code size and runtime cost of an indirection by changing the global's storage
to inside the constant pool. For example, instead of:
ldr r0, .CPI0
bl printf
bx lr
.CPI0: &format_string
format_string: .asciz "hello, world!\n"
We can emit:
adr r0, .CPI0
bl printf
bx lr
.CPI0: .asciz "hello, world!\n"
This can cause significant code size savings when many small strings are used in one
function (4 bytes per string).
This recommit contains fixes for a nasty bug related to fast-isel fallback - because
fast-isel doesn't know about this optimization, if it runs and emits references to
a string that we inline (because fast-isel fell back to SDAG) we will end up
with an inlined string and also an out-of-line string, and we won't emit the
out-of-line string, causing backend failures.
It also contains fixes for emitting .text relocations which made the sanitizer
bots unhappy.
llvm-svn: 282387
|
|
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D23984
llvm-svn: 282381
|
|
|
|
|
|
|
| |
This makes it obvious that items in those maps behave like statically
created objects.
llvm-svn: 282327
|
|
|
|
|
|
|
| |
Like partial mappings, as we move toward TableGen'ed information, the
number should reach zero eventually.
llvm-svn: 282325
|
|
|
|
|
|
|
|
| |
This is a step toward statically allocate ValueMapping. Like the
previous few commits, the goal is to move toward a TableGen'ed like
structure with no dynamic allocation at all.
llvm-svn: 282324
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously we were using the address of the unique instance of a partial
mapping in the related map to access this instance. However, when the
map grows, the whole set of instances may be moved elsewhere and the
previous addresses are not valid anymore.
Instead, keep the address of the unique heap allocated instance of a
partial mapping.
Note: I did not see any actual bugs for that problem as the number of
partial mappings dynamically allocated is small (<= 4).
llvm-svn: 282323
|
|
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D23089
llvm-svn: 282302
|
|
|
|
|
|
| |
NFC
llvm-svn: 282276
|
|
|
|
|
|
|
|
| |
As the development of GlobalISel move forward, this statistic should
strictly decrease until it reaches zero. At this point, it would mean
GlobalISel can replace SDISel (at least on the tested inputs :P).
llvm-svn: 282275
|
|
|
|
|
|
|
|
| |
Collect statistics about the number of partial mappings dynamically
allocated and accessed. Ultimately, when the whole TableGen
infrastructure is set, those numbers should be zero.
llvm-svn: 282274
|
|
|
|
|
|
|
| |
It is less confusing to have the same names in the debug print as the
enum members.
llvm-svn: 282273
|
|
|
|
|
|
| |
NFC
llvm-svn: 282267
|
|
|
|
|
|
| |
NFC
llvm-svn: 282266
|
|
|
|
|
|
| |
NFC
llvm-svn: 282221
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In the verify method of the ValueMapping class we used to check that the
mapping exactly matches the bits of the input value. This is problematic
for statically allocated mappings because we would need a different
mapping for each different size of the value that maps on one
instruction. For instance, with such scheme, we would need a different
mapping for a value of size 1, 5, 23 whereas they all end up on a 32-bit
wide instruction.
Therefore, change the verifier to check that the meaningful bits are
covered by the mapping instead of matching them.
llvm-svn: 282214
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is another step toward TableGen'ed like structures. The BreakDown of
the mapping of the value will be statically computed by TableGen, thus
we only have to point to the right entry in the table instead of
dynamically allocate the mapping for each instruction.
We still support the dynamic allocation through a factory of
PartialMapping to ease the bring-up of the targets while the TableGen
backend is not available.
llvm-svn: 282213
|
|
|
|
| |
llvm-svn: 282201
|
|
|
|
|
|
|
|
|
| |
Currently all nodes get added to the NextSU list when they are released,
so any candidate must be in that list, making the heuristic ineffective.
Remove it for now, we can add it back later in a working fashion if
necessary.
llvm-svn: 282200
|
|
|
|
|
|
|
|
|
|
|
|
| |
According to MSDN (see the PR), functions which don't touch any callee-saved
registers (including %rsp) don't need any unwind info.
This patch makes LLVM not emit unwind info for such functions, to save
binary size.
Differential Revision: https://reviews.llvm.org/D24748
llvm-svn: 282185
|
|
|
|
|
|
|
|
|
|
|
|
| |
Correctly use alignment size from loaded size not output value size.
Reviewers: jyknight, tstellarAMD, arsenm
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D23356
llvm-svn: 282177
|
|
|
|
| |
llvm-svn: 282153
|
|
|
|
|
|
|
|
|
|
|
|
| |
This commit is basically the first step toward what will
RegisterBankInfo look when it gets TableGen'ed.
It introduces a XXXGenRegisterBankInfo.def file that is what TableGen
will issue at some point. Moreover, the RegBanks field in
RegisterBankInfo changed to reflect the static (compile time) aspect of
the information.
llvm-svn: 282131
|
|
|
|
|
|
|
|
| |
When initializing an instance of OperandsMapper, instead of using
SmallVector::resize followed by std::fill, use the function that
directly does that in SmallVector.
llvm-svn: 282130
|
|
|
|
| |
llvm-svn: 282098
|
|
|
|
|
|
|
|
|
| |
ISel does not handle them correctly yet i.e we crash trying to emit tail call
code.
radar://28407842
llvm-svn: 282088
|
|
|
|
|
|
|
|
| |
We still don't really have an equivalent of "AssertXExt" in DAG, so we don't
exploit the guarantees on the receiving side yet, but this should produce
conservatively correct code on iOS ABIs.
llvm-svn: 282069
|
|
|
|
|
|
|
|
| |
The only implementation that exists immediately looks it up anyway, and the
information is needed to handle various parameter attributes (stored on the
function itself).
llvm-svn: 282068
|
|
|
|
|
|
|
|
|
|
|
|
| |
TargetMachine::getNameWithPrefix and inline the result into the singular
caller." and "Remove more guts of TargetMachine::getNameWithPrefix and
migrate one check to the TLOF mach-o version." temporarily until I can
get the whole call migrated out of the TargetMachine as we could hit
places where TLOF isn't valid.
This reverts commits r281981 and r281983.
llvm-svn: 282028
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This check caused us to skip adding layout information for calls to
alloca in sspreq/sspstrong mode. We check properly for sspstrong later
on (and add the correct layout info when doing so), so removing this
shouldn't hurt.
No test is included, since testing this using lit seems to require
checking for exact offsets in asm, which is something that the lit tests
for this avoid. If someone cares deeply, I'm happy to write a unittest
or something to cover this, but that feels like overkill.
Patch by Daniel Micay.
Differential Revision: https://reviews.llvm.org/D22714
llvm-svn: 282022
|
|
|
|
|
|
|
|
|
|
|
| |
Previously, such section would be marked as SHT_PROGBITS which
makes it impossible to use an initialized C variable declaration
to emit an (allocated) ELF note. The new behavior is also consistent
with ELF assembly parser.
Differential Revision: https://reviews.llvm.org/D24692
llvm-svn: 282010
|
|
|
|
| |
llvm-svn: 281991
|
|
|
|
|
|
|
|
|
|
| |
CodeView has an S_COMPILE3 record to identify the compiler and source language of the compiland. This record comes first in the debug$S section for the compiland. The debuggers rely on this record to know the source language of the code.
There was a little test fallout from introducing a new record into the symbols subsection.
Differential Revision: https://reviews.llvm.org/D24317
llvm-svn: 281990
|