| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
|
|
|
| |
Loop's getBlocks returns an ArrayRef.
llvm-svn: 341821
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
This already worked if only one register piece was used,
but didn't if a type was split into multiple, unequal
sized pieces.
Fixes not splitting 3i16/v3f16 into two registers for
AMDGPU.
This will also allow fixing the ABI for 16-bit vectors
in a future commit so that it's the same for all subtargets.
llvm-svn: 341801
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
condition operand
This is the DAG equivalent of D51433.
If we know we're not using all vector lanes, use that knowledge to potentially simplify a vselect condition.
The reduction/horizontal tests show that we are eliminating AVX1 operations on the upper half of 256-bit
vectors because we don't need those anyway.
I'm not sure what the pr34592 test is showing. That's run with -O0; is SimplifyDemandedVectorElts supposed
to be running there?
Differential Revision: https://reviews.llvm.org/D51696
llvm-svn: 341762
|
| |
|
|
| |
llvm-svn: 341740
|
| |
|
|
|
|
|
|
|
|
|
|
| |
This patch removes addBlockByrefAddress(), it is dead code as far as
clang is concerned: Every byref block capture is emitted with a
complex expression that is equivalent to what this function does.
rdar://problem/31629055
Differential Revision: https://reviews.llvm.org/D51763
llvm-svn: 341737
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
MSVC and LLD sort sections ASCII-betically, so we need to use section
names that sort between .CRT$XCA (the start) and .CRT$XCU (the default
priority).
In the general case, use .CRT$XCT12345 as the section name, and let the
linker sort the zero-padded digits.
Users with low priorities typically want to initialize as early as
possible, so use .CRT$XCA00199 for prioties less than 200. This number
is arbitrary.
Implements PR38552.
Reviewers: majnemer, mstorsjo
Subscribers: hiraditya, llvm-commits
Differential Revision: https://reviews.llvm.org/D51820
llvm-svn: 341727
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Extend LDV so that stack slot offsets for spilled sub-registers
are added to the emitted debug locations. This is accomplished
by querying InstrInfo::getStackSlotRange().
With this change, LDV will add a DW_OP_plus_uconst operation to
the expression if a sub-register is spilled. Later on, PEI will
add an offset operation for the stack slot, meaning that we will
get expressions of the forms:
* {DW_OP_constu #fp-offset, DW_OP_minus,
DW_OP_plus_uconst #subreg-offset}
* {DW_OP_plus_const #fp-offset,
DW_OP_minus, DW_OP_plus_uconst #subreg-offset}
The two offset operations should ideally be merged.
Reviewers: rnk, aprantl, stoklund
Reviewed By: aprantl
Subscribers: dblaikie, bjope, nemanjai, JDevlieghere, llvm-commits
Tags: #debug-info
Differential Revision: https://reviews.llvm.org/D51612
llvm-svn: 341659
|
| |
|
|
|
|
|
|
| |
Add support for bitcasts from float type to an integer type of the same element bitwidth.
There maybe cases where we need to support different widths (e.g. as SSE __m128i is treated as v2i64) - but I haven't seen cases of this in the wild yet.
llvm-svn: 341652
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
The MCID::Flag enumeration now has more than 32 items, this means that
the hasPropertyBundle argument 'Mask' can overflow.
This patch changes the argument to be 64 bits instead.
Patch by Mikael Nilsson.
Differential Revision: https://reviews.llvm.org/D51596
llvm-svn: 341536
|
| |
|
|
|
|
|
|
| |
In DwarfDebug::collectEntityInfo(), if the label entity is processed in
DbgLabels list, it means the label is not optimized out. There is no
need to generate debug info for it with null position.
llvm-svn: 341513
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This was proposed as an IR transform in D49306, but it was not clearly justifiable as a canonicalization.
Here, we only do the transform when the target tells us that sqrt can be lowered with inline code.
This is the basic case. Some potential enhancements are in the TODO comments:
1. Generalize the transform for other exponents (allow more than 2 sqrt calcs if that's really cheaper).
2. If we have less fast-math-flags, generate code to avoid -0.0 and/or INF.
3. Allow the transform when optimizing/minimizing size (might require a target hook to get that right).
Note that by default, x86 converts single-precision sqrt calcs into sqrt reciprocal estimate with
refinement. That codegen is controlled by CPU attributes and can be manually overridden. We have plenty
of test coverage for that already, so I didn't bother to include extra testing for that here. AArch uses
its full-precision ops in all cases (not sure if that's the intended behavior or not, but that should
also be covered by existing tests).
Differential Revision: https://reviews.llvm.org/D51630
llvm-svn: 341481
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Normalize common kinds of DWARF sub-expressions to make debug info
encoding a bit more compact:
DW_OP_constu [X < 32] -> DW_OP_litX
DW_OP_constu [all ones] -> DW_OP_lit0, DW_OP_not (64-bit only)
Differential revision: https://reviews.llvm.org/D51640
llvm-svn: 341457
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
This removes the FrameAccess struct that was added to the interface
in D51537, since the PseudoValue from the MachineMemoryOperand
can be safely casted to a FixedStackPseudoSourceValue.
Reviewers: MatzeB, thegameg, javed.absar
Reviewed By: thegameg
Differential Revision: https://reviews.llvm.org/D51617
llvm-svn: 341454
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In lib/CodeGen/LiveDebugVariables.cpp, it uses std::prev(MBBI) to
get DebugValue's SlotIndex. However, the previous instruction may be
also a debug instruction. It could not use a debug instruction to query
SlotIndex in mi2iMap.
Scan all debug instructions and use the first debug instruction to query
SlotIndex for following debug instructions. Only handle DBG_VALUE in
handleDebugValue().
Differential Revision: https://reviews.llvm.org/D50621
llvm-svn: 341446
|
| |
|
|
| |
llvm-svn: 341392
|
| |
|
|
|
|
|
|
| |
Fix remaining cases not committed in https://reviews.llvm.org/D49574
Differential Revision: https://reviews.llvm.org/D50659
llvm-svn: 341380
|
| |
|
|
| |
llvm-svn: 341317
|
| |
|
|
|
|
|
|
|
| |
A condition in isSpillInstruction() updates a small vector rather
than the 'FI' by-ref parameter, which was used in a subsequent
call to 'isSpillSlotObjectIndex()'. This patch fixes the condition
to check the FIs in the vector instead.
llvm-svn: 341305
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
For instructions that spill/fill to and from multiple frame-indices
in a single instruction, hasStoreToStackSlot and hasLoadFromStackSlot
should return an array of accesses, rather than just the first encounter
of such an access.
This better describes FI accesses for AArch64 (paired) LDP/STP
instructions.
Reviewers: t.p.northover, gberry, thegameg, rengolin, javed.absar, MatzeB
Reviewed By: MatzeB
Differential Revision: https://reviews.llvm.org/D51537
llvm-svn: 341301
|
| |
|
|
|
|
|
|
|
| |
This reverts commit 8f548ff2a1819e1bc051e8218584f1a3d2cf178a.
buildbot failure in LLVM on clang-ppc64be-linux
http://lab.llvm.org:8011/builders/clang-ppc64le-linux/builds/19765
llvm-svn: 341290
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In lib/CodeGen/LiveDebugVariables.cpp, it uses std::prev(MBBI) to
get DebugValue's SlotIndex. However, the previous instruction may be
also a debug instruction. It could not use a debug instruction to query
SlotIndex in mi2iMap.
Scan all debug instructions and use the first debug instruction to query
SlotIndex for following debug instructions. Only handle DBG_VALUE in
handleDebugValue().
Differential Revision: https://reviews.llvm.org/D50621
llvm-svn: 341289
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
A follow-up for D49266 / rL337166 + D49497 / rL338044.
This is still the same pattern to check for the [lack of]
signed truncation, but in this case the constants and the predicate
are negated.
https://rise4fun.com/Alive/BDV
https://rise4fun.com/Alive/n7Z
Reviewers: spatel, craig.topper, RKSimon, javed.absar, efriedma, dmgreen
Reviewed By: spatel
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D51532
llvm-svn: 341287
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Currently, the SafeStack analysis disallows out-of-bounds writes but not
out-of-bounds reads for mem intrinsics like llvm.memcpy. This could
cause leaks of pointers to the safe stack by leaking spilled registers/
frame pointers. Check for allocas used as source or destination pointers
to mem intrinsics.
Reviewers: eugenis
Reviewed By: eugenis
Subscribers: pcc, llvm-commits, kcc
Differential Revision: https://reviews.llvm.org/D51334
llvm-svn: 341116
|
| |
|
|
| |
llvm-svn: 341103
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
With r295105, some 'noreturn' blocks (those that don't return and have no
successors) may be merged.
If such blocks' predecessors have different outgoing offset or register, don't
report an error in CFIInstrInserter verify().
Thanks to Vlad Tsyrklevich for reporting the issue.
Differential Revision: https://reviews.llvm.org/D51161
llvm-svn: 341087
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This is patch 1 of the new DivergenceAnalysis (https://reviews.llvm.org/D50433).
The purpose of this patch is to free up the name DivergenceAnalysis for the new generic
implementation. The generic implementation class will be shared by specialized
divergence analysis classes.
Patch by: Simon Moll
Reviewed By: nhaehnle
Subscribers: jvesely, jholewinski, arsenm, nhaehnle, mgorny, jfb, llvm-commits
Differential Revision: https://reviews.llvm.org/D50434
Change-Id: Ie8146b11be2c50d5312f30e11c7a3036a15b48cb
llvm-svn: 341071
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This is a continuation of https://reviews.llvm.org/D49727
Below the original text, current changes in the comments:
Currently, in line with GCC, when specifying reserved registers like sp or pc on an inline asm() clobber list, we don't always preserve the original value across the statement. And in general, overwriting reserved registers can have surprising results.
For example:
extern int bar(int[]);
int foo(int i) {
int a[i]; // VLA
asm volatile(
"mov r7, #1"
:
:
: "r7"
);
return 1 + bar(a);
}
Compiled for thumb, this gives:
$ clang --target=arm-arm-none-eabi -march=armv7a -c test.c -o - -S -O1 -mthumb
...
foo:
.fnstart
@ %bb.0: @ %entry
.save {r4, r5, r6, r7, lr}
push {r4, r5, r6, r7, lr}
.setfp r7, sp, #12
add r7, sp, #12
.pad #4
sub sp, #4
movs r1, #7
add.w r0, r1, r0, lsl #2
bic r0, r0, #7
sub.w r0, sp, r0
mov sp, r0
@APP
mov.w r7, #1
@NO_APP
bl bar
adds r0, #1
sub.w r4, r7, #12
mov sp, r4
pop {r4, r5, r6, r7, pc}
...
r7 is used as the frame pointer for thumb targets, and this function needs to restore the SP from the FP because of the variable-length stack allocation a. r7 is clobbered by the inline assembly (and r7 is included in the clobber list), but LLVM does not preserve the value of the frame pointer across the assembly block.
This type of behavior is similar to GCC's and has been discussed on the bugtracker: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=11807 . No consensus seemed to have been reached on the way forward. Clang behavior has briefly been discussed on the CFE mailing (starting here: http://lists.llvm.org/pipermail/cfe-dev/2018-July/058392.html). I've opted for following Eli Friedman's advice to print warnings when there are reserved registers on the clobber list so as not to diverge from GCC behavior for now.
The patch uses MachineRegisterInfo's target-specific knowledge of reserved registers, just before we convert the inline asm string in the AsmPrinter.
If we find a reserved register, we print a warning:
repro.c:6:7: warning: inline asm clobber list contains reserved registers: R7 [-Winline-asm]
"mov r7, #1"
^
Reviewers: efriedma, olista01, javed.absar
Reviewed By: efriedma
Subscribers: eraman, kristof.beyls, llvm-commits
Differential Revision: https://reviews.llvm.org/D51165
llvm-svn: 341062
|
| |
|
|
|
|
|
|
|
| |
In computeRegisterLiveness, the max instructions to search
was counting dbg_value instructions, which could potentially
cause an observable codegen change from the presence of debug
info.
llvm-svn: 341028
|
| |
|
|
|
|
|
|
| |
If there is an unused def, this would previously
report that the register was live. Check for uses
first so that it is reported as dead if never used.
llvm-svn: 341027
|
| |
|
|
|
|
|
|
| |
If the end of the block is reached during the scan, check
the live ins of the successors. This was already done in the
other direction if the block entry was reached.
llvm-svn: 341026
|
| |
|
|
|
|
|
|
|
| |
Check that Machine CSE correctly handles during the transformation, the
debug location information for local variables.
Differential Revision: https://reviews.llvm.org/D50887
llvm-svn: 341025
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
If an ABI-like value is used in a different block,
the type split used is not necessarily the same as
the call's ABI. The value is used through an intermediate
copy virtual registers from the other block. This
resulted in copies with inconsistent sizes later.
Fixes regressions since r338197 when AMDGPU started
splitting vector types for calls.
llvm-svn: 341018
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Global variables that are external and zero initialized are
supposed to be merged with global variables in the bss section
rather than the data section.
Reviewers: efriedma, rengolin, t.p.northover, javed.absar, asl, john.brawn, pcc
Reviewed By: efriedma
Subscribers: dmgreen, llvm-commits
Differential Revision: https://reviews.llvm.org/D51379
llvm-svn: 341008
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
This should more accurately reflect what the AsmPrinter will actually
do.
This is NFC, as far as I can tell; all the places that might be affected
already have an extra check to avoid using the result of
getPreferredAlignment in this situation.
Differential Revision: https://reviews.llvm.org/D51377
llvm-svn: 340999
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
On AMDGPU we have 70 register classes, so iterating over all 70
each time and exiting is costly on the CPU, this flips the loop
around so that it loops over the 70 register classes first,
and exits without doing the inner loop if needed.
On my test just starting radv this takes
RegUsageInfoCollector::runOnMachineFunction
from 6.0% of total time to 2.7% of total time,
and reduces the startup from 2.24s to 2.19s
Patch by David Airlie!
Differential Revision: https://reviews.llvm.org/D48582
llvm-svn: 340993
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
imported from a dll
Variables declared with the dllimport attribute are accessed via a
stub variable named __imp_<var>. In MinGW configurations, variables that
aren't declared with a dllimport attribute might still end up imported
from another DLL with runtime pseudo relocs.
For x86_64, this avoids the risk that the target is out of range
for a 32 bit PC relative reference, in case the target DLL is loaded
further than 4 GB from the reference. It also avoids having to make the
text section writable at runtime when doing the runtime fixups, which
makes it worthwhile to do for i386 as well.
Add stub variables for all dso local data references where a definition
of the variable isn't visible within the module, since the DLL data
autoimporting might make them imported even though they are marked as
dso local within LLVM.
Don't do this for variables that actually are defined within the same
module, since we then know for sure that it actually is dso local.
Don't do this for references to functions, since there's no need for
runtime pseudo relocations for autoimporting them; if a function from
a different DLL is called without the appropriate dllimport attribute,
the call just gets routed via a thunk instead.
GCC does something similar since 4.9 (when compiling with -mcmodel=medium
or large; from that version, medium is the default code model for x86_64
mingw), but only for x86_64.
Differential Revision: https://reviews.llvm.org/D51288
llvm-svn: 340942
|
| |
|
|
|
|
|
|
| |
Adds more divrem folds to try and get in sync with InstructionSimplify
Differential Revision: https://reviews.llvm.org/D50636
llvm-svn: 340919
|
| |
|
|
|
|
|
| |
It broke PPC64 BB:
http://lab.llvm.org:8011/builders/clang-ppc64be-linux/builds/23252
llvm-svn: 340906
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
I am experimenting with a single split dwarf (.dwo sections in .o files).
I want to make linker to ignore .dwo sections in .o, for that I am trying to add
SHF_EXCLUDE flag ("E") for them in my asm sample.
I found that currently, it is impossible to add any flag for debug sections using llvm-mc.
That happens because we have a set of predefined unique sections created early with default flags:
https://github.com/llvm-mirror/llvm/blob/master/lib/MC/MCObjectFileInfo.cpp#L391
This patch allows a user to add any flags he wants.
I had to edit TargetLoweringObjectFileImpl.cpp to set MetaData type for debug sections.
Their kind was Data by default (so they were allocatable) and so after changes introduced by
this patch the SHF_ALLOC flag was applied for them, what does not make sense for debug sections.
One of OrcJITTests tests failed because of that.
Differential revision: https://reviews.llvm.org/D51361
llvm-svn: 340904
|
| |
|
|
|
|
|
|
|
|
|
| |
https://reviews.llvm.org/D51384
Added code in LegalizerHelper to widen UADDO/USUBO along with unit
tests.
Reviewed by volkan.
llvm-svn: 340892
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
-x86-experimental-vector-widening-legalization
Summary: This is split out from D41062 to cover the code in LegalVectorTypes.cpp
Reviewers: RKSimon, spatel, efriedma
Reviewed By: efriedma
Subscribers: sdardis, jvesely, nhaehnle, jrtc27, atanasyan, llvm-commits
Differential Revision: https://reviews.llvm.org/D51337
llvm-svn: 340891
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
https://reviews.llvm.org/D51197
Currently, IRTranslator (and GISel) seems to be arbitrarily picking
which overflow intrinsics get mapped into opcodes which either have a
carry as an input or not.
For intrinsics such as Intrinsic::uadd_with_overflow, translate it to an
opcode (G_UADDO) which doesn't have any carry inputs (similar to LLVM
IR).
This patch adds 4 missing opcodes for completeness - G_UADDO, G_USUBO,
G_SSUBE and G_SADDE.
llvm-svn: 340865
|
| |
|
|
|
|
|
| |
Avoid hyperlinear cost of inlining MERGE_VALUE node by constructing
temporary vector and doing a single replacement.
llvm-svn: 340853
|
| |
|
|
|
|
|
| |
When making multiple updates to the same SDNode, recompute node
divergence only once after all changes have been made.
llvm-svn: 340852
|
| |
|
|
|
|
|
| |
Check correct SDNode when deciding if we should update the divergence
property.
llvm-svn: 340851
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
resulting load is legal for the target.
Summary:
I'm not sure if this patch is correct or if it needs more qualifying somehow. Bitcast shouldn't change the size of the load so it should be ok? We already do something similar for stores. We'll change the type of a volatile store if the resulting store is Legal or Custom. I'm not sure we should be allowing Custom there...
I was playing around with converting X86 atomic loads/stores(except seq_cst) into regular volatile loads and stores during lowering. This would allow some special RMW isel patterns in X86InstrCompiler.td to be removed. But there's some floating point patterns in there that didn't work because we don't fold (f64 (bitconvert (i64 volatile load))) or (f32 (bitconvert (i32 volatile load))).
Reviewers: efriedma, atanasyan, arsenm
Reviewed By: efriedma
Subscribers: jvesely, arsenm, sdardis, kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, arichardson, jrtc27, atanasyan, jfb, llvm-commits
Differential Revision: https://reviews.llvm.org/D50491
llvm-svn: 340797
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
candidates (NFC)"
This causes crashes due to the interleaved dbg.value intrinsics being
left at the end of basic blocks, causing the actual terminators (br,
etc) to be not where they should be (not at the end of the block),
leading to later crashes.
Further discussion on the original commit thread.
This reverts commit r340368.
llvm-svn: 340794
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The code that generates the loop definition operand for phis
in the epilog and kernel is incorrect in some cases.
In the kernel, when a phi refers to another phi, the code that
updates PhiOp2 needs to include the stage difference between
the two phis.
In the epilog, the check for using the loop definition instead
of the phi definition uses the StageDiffAdj value (the difference
between the phi stage and the loop definition stage), but the
adjustment is not needed to determine if the current stage
contains an iteration with the loop definition.
Differential Revision: https://reviews.llvm.org/D51167
llvm-svn: 340782
|
| |
|
|
|
|
| |
Follow up to r340655 to fix vector types which are split.
llvm-svn: 340766
|
| |
|
|
|
|
|
|
|
| |
If the liveness of a physical register was invalid, this
was attempting to iterate the subregisters of all register
uses of the instruction, which would assert when it
encountered an implicit virtual register operand.
llvm-svn: 340763
|