| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
|
|
|
|
|
|
|
| |
CFI emits jump slots for indirect functions as a byte array
constant, and declares function-typed aliases to these constants.
This change fixes AsmPrinter to emit these aliases as function
symbols and not data symbols.
llvm-svn: 254674
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
multiple registers
Currently in LLVM's cost model, a vectorized arithmetic instruction will have
high cost if its type is split into multiple registers. However, this
punishment is too heavy and unnecessary. The overhead of the split should not
be on arithmetic instructions but instructions that implement the split. Note
that during vectorization we have calculated the register pressure, and we
only choose proper interleaving factor (and also vectorization factor) so
that we don't use more registers than the maximum number.
Here is a very simple example: if a vadd has the cost 1, and if we double VF
so that we need two registers to perform it, then its cost will become 4 with
the current implementation, which will prevent us to use larger VF.
Differential revision: http://reviews.llvm.org/D15159
llvm-svn: 254671
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This change adds support for an optional weight when merging profile data with the llvm-profdata tool.
Weights are specified by adding an option ':<weight>' suffix to the input file names.
Adding support for arbitrary weighting of input profile data allows for relative importance to be placed on the
input data from multiple training runs.
Both sampled and instrumented profiles are supported.
Reviewers: dnovillo, bogner, davidxl
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D14547
llvm-svn: 254669
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Code generation often exposes redundant physical register copies through
virtual registers such as:
%vreg = COPY %PHYSREG
...
%PHYSREG = COPY %vreg
There are cases where no intervening clobber of %PHYSREG occurs, and the
later copy could therefore be removed. In some cases this further allows
us to remove the initial copy.
This patch contains a motivating example which comes from the x86 build
of Chrome, specifically cc::ResourceProvider::UnlockForRead uses
libstdc++'s implementation of hash_map. That example has two tests live
at the same time, and after machine sinking LLVM has confused itself
enough and things spilling EFLAGS is a great idea even though it's
never restored and the comparison results are both live.
Before this patch we have:
DEC32m %RIP, 1, %noreg, <ga:@L>, %noreg, %EFLAGS<imp-def>
%vreg1<def> = COPY %EFLAGS; GR64:%vreg1
%EFLAGS<def> = COPY %vreg1; GR64:%vreg1
JNE_1 <BB#1>, %EFLAGS<imp-use>
Both copies are useless. This patch tries to eliminate the later copy in
a generic manner.
dec is especially confusing to LLVM when compared with sub.
I wrote this patch to treat all physical registers generically, but only
remove redundant copies of non-allocatable physical registers because
the allocatable ones caused issues (e.g. when calling conventions weren't
properly modeled) and should be handled later by the register allocator
anyways.
The following tests used to failed when the patch also replaced allocatable
registers:
CodeGen/X86/StackColoring.ll
CodeGen/X86/avx512-calling-conv.ll
CodeGen/X86/copy-propagation.ll
CodeGen/X86/inline-asm-fpstack.ll
CodeGen/X86/musttail-varargs.ll
CodeGen/X86/pop-stack-cleanup.ll
CodeGen/X86/preserve_mostcc64.ll
CodeGen/X86/tailcallstack64.ll
CodeGen/X86/this-return-64.ll
This happens because COPY has other special meaning for e.g. dependency
breakage and x87 FP stack.
Note that all other backends' tests pass.
Reviewers: qcolombet
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D15157
llvm-svn: 254665
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
When a block has no terminator instructions, getFirstTerminator() returns
end(), which can't be used in dominance checks. Check dominance for phi
operands separately.
Also, remove some bits from WebAssemblyRegStackify.cpp that were causing
trouble on the same testcase; they were left behind from an earlier
experiment.
Differential Revision: http://reviews.llvm.org/D15210
llvm-svn: 254662
|
| |
|
|
|
|
|
|
| |
The compiler can take advantage of the allocation/deallocation
function's properties. We knew how to do this for Itanium but had no
support for MSVC-style functions.
llvm-svn: 254656
|
| |
|
|
|
|
| |
instruction encodings.
llvm-svn: 254652
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
These ADJCALLSTACK markers don't generate code, but they keep dynamic
alloca code that calls chkstk out of the prologue.
This slightly pessimizes inalloca calls by preventing some register copy
coalescing, but I can live with that.
Reviewers: qcolombet
Subscribers: hans, llvm-commits
Differential Revision: http://reviews.llvm.org/D15200
llvm-svn: 254645
|
| |
|
|
| |
llvm-svn: 254640
|
| |
|
|
| |
llvm-svn: 254636
|
| |
|
|
| |
llvm-svn: 254631
|
| |
|
|
|
|
| |
Differential Revision: http://reviews.llvm.org/D14996
llvm-svn: 254629
|
| |
|
|
|
|
| |
The indicies are one-based, not zero-based, per the spec.
llvm-svn: 254626
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Fix import from module with appending var, which cannot be imported. The
first fix is to remove an overly-aggressive error check.
The second fix is to deal with restructuring introduced to the module
linker yesterday in r254418 (actually, this fix was included already
in r254559, just added some additional cleanup).
Test by Mehdi Amini.
Reviewers: joker.eph, rafael
Subscribers: joker.eph, llvm-commits
Differential Revision: http://reviews.llvm.org/D15156
llvm-svn: 254624
|
| |
|
|
|
|
|
|
|
| |
In the case of a conditional branch without a preceding cmp we used to emit
a "and; cmp; b.eq/b.ne" sequence, use tbz/tbnz instead.
Differential Revision: http://reviews.llvm.org/D15122
llvm-svn: 254621
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
Currently "<type> ptr <reg name>" treated as <reg name> in MS inline asm, ignoring the "<type> ptr" completely and possibly ignoring the intention of the user.
Fixed llvm to produce an error when encountering "<type> ptr <reg name>" operands.
For example: andpd xmm1,xmmword ptr xmm1 --> andpd xmm1, xmm1
though andpd has 2 possible matching formats - andpd xmm, xmm/m128
Patch by: ziv.izhar@intel.com
Differential Revision: http://reviews.llvm.org/D14607
llvm-svn: 254607
|
| |
|
|
|
|
| |
Differential Revision: http://reviews.llvm.org/D15141
llvm-svn: 254598
|
| |
|
|
|
|
|
|
| |
According to x86 spec, fcomip and fucomip should be supported for Intel syntax.
Differential Revision: http://reviews.llvm.org/D15104
llvm-svn: 254595
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Summary: This is done only when targeting HSA.
Reviewers: arsenm
Subscribers: arsenm, llvm-commits
Differential Revision: http://reviews.llvm.org/D13807
llvm-svn: 254587
|
| |
|
|
|
|
|
|
|
|
| |
This works mostly fine but breaks some stage 1 builders when compiling
compiler-rt on i386. Revert for further investigation as I can't see an
obvious cause/fix.
This reverts commit r254577.
llvm-svn: 254586
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The new algorithm remembers the uses encountered while walking backwards
until a matching def is found. Contrary to the previous version this:
- Works without LiveIntervals being available
- Allows to increase the precision to subregisters/lanemasks
(not used for now)
The changes in the AMDGPU tests are necessary because the R600 scheduler
is not stable with respect to the order of nodes in the ready queues.
Differential Revision: http://reviews.llvm.org/D9068
llvm-svn: 254577
|
| |
|
|
| |
llvm-svn: 254572
|
| |
|
|
|
|
| |
Differential Revision: http://reviews.llvm.org/D15167
llvm-svn: 254570
|
| |
|
|
|
|
|
|
|
|
|
|
| |
This replaces DoNotLinkFromSource with ValuesToLink. It also moves the
computation of ValuesToLink earlier.
It is a bit simpler and an important step in slitting the linker into an
ir mover and a linker proper.
The test change is because we now avoid creating dead declarations.
llvm-svn: 254559
|
| |
|
|
|
|
|
|
|
| |
Having to import an alias as declaration is not thinlto specific.
The test difference are because when we already have a decl and we are
not importing it, we just leave the decl alone.
llvm-svn: 254556
|
| |
|
|
| |
llvm-svn: 254555
|
| |
|
|
|
|
| |
Remove unnecessary metadata from a test case.
llvm-svn: 254544
|
| |
|
|
|
|
|
| |
This was an omission when handling COFF style comdats with local keys.
Should fix the sanitizer-windows bot.
llvm-svn: 254543
|
| |
|
|
| |
llvm-svn: 254542
|
| |
|
|
|
|
| |
They are independent.
llvm-svn: 254541
|
| |
|
|
|
|
| |
Differential Revision: http://reviews.llvm.org/D14508
llvm-svn: 254540
|
| |
|
|
|
|
|
| |
We were failing to copy the fact that the GV is weak and in the case of
an alias, producing invalid IR.
llvm-svn: 254538
|
| |
|
|
|
|
|
|
|
| |
AggressiveAntiDepBreaker was renaming registers specified by the user
for inline assembly. While this will work for compiler-specified
registers, it won't work for user-specified registers, and at the time
this runs, I don't currently see a way to distinguish them.
llvm-svn: 254532
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
discard
Summary: This changes overflow handling during instrumentation profile merge. Rathar than throwing away records that would result in counter overflow, merged counts are instead clamped to the maximum representable value. A warning about counter overflow is still surfaced to the user as before.
Reviewers: dnovillo, davidxl, silvas
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D14893
llvm-svn: 254525
|
| |
|
|
|
|
|
|
| |
The ARM ARM is clear that 128-bit loads are only guaranteed to have been atomic
if there has been a corresponding successful stxp. It's less clear for AArch32, so
I'm leaving that alone for now.
llvm-svn: 254524
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Summary: Only global or readonly segment variables should appear in object files.
Reviewers: arsenm
Subscribers: arsenm, llvm-commits
Differential Revision: http://reviews.llvm.org/D15111
llvm-svn: 254519
|
| |
|
|
|
|
|
|
|
|
| |
== C2) -> (A | (C1 ^ C2)) == C2 when C1 ^ C2 is a power of 2.
Differential Revision: http://reviews.llvm.org/D14223
Patch by Amaury SECHET!
llvm-svn: 254518
|
| |
|
|
| |
llvm-svn: 254514
|
| |
|
|
|
|
| |
Adds support for the new Cortex-A35 ARMv8-A core.
llvm-svn: 254503
|
| |
|
|
|
|
| |
not being expanded. Test case included.
llvm-svn: 254501
|
| |
|
|
|
|
|
|
| |
REPLV.PH, REPLV.QB and MTHLIP instructions
Differential Revision: http://reviews.llvm.org/D14527
llvm-svn: 254496
|
| |
|
|
|
|
|
|
|
|
| |
On FMA targets, we can avoid having to load a constant to negate a float/double multiply by instead using a FNMSUB (-(X*Y)-0)
Fix for PR24366
Differential Revision: http://reviews.llvm.org/D14909
llvm-svn: 254495
|
| |
|
|
|
|
|
|
|
| |
I checked and updated the cost of AVX-512 conversion operations. Added cost of conversion operations in DQ mode.
Conversion of illegal types that requires vector split is not calculated right now (like for other X86 targets).
Differential Revision: http://reviews.llvm.org/D15074
llvm-svn: 254494
|
| |
|
|
|
|
|
|
| |
add builtin_ia32_vcomisd and builtin_ia32_vcomisd
Differential Revision: http://reviews.llvm.org/D14331
llvm-svn: 254493
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is very rudimentary support for debug_cu_index, but it is enough to
allow llvm-dwarfdump to find the offsets for contributions and
correctly dump debug_info.
It will need to actually find the real signature of the unit and build
the real hash table with the right number of buckets, as per the DWP
specification.
It will also need to be expanded to cover the tu_index as well.
llvm-svn: 254489
|
| |
|
|
|
|
|
|
| |
This is a superset of the fix done in r254448.
This fixes PR25607.
llvm-svn: 254478
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
We mustn't introduce a shift of exactly 64-bits for any inputs, since that's an
UNDEF value (and worse, it's not what you want with the natural Arch64
implementation).
The generated code is pretty horrific, but I couldn't come up with an obviously
better alternative (if the amount is constant EXTR could help). Turns out
128-bit shifts are just nasty.
rdar://22491037
llvm-svn: 254475
|
| |
|
|
| |
llvm-svn: 254469
|
| |
|
|
| |
llvm-svn: 254468
|
| |
|
|
| |
llvm-svn: 254459
|