| Commit message (Collapse) | Author | Age | Files | Lines | 
| ... |  | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
|  | 
The second operand of an "ri" instruction may be an immediate, but it may
also be a globalvariable, so we should make any assumptions.
This fixes PR31271.
Differential Revision: https://reviews.llvm.org/D27481
llvm-svn: 288964
 | 
| | 
| 
| 
| 
| 
|  | 
Should be checking if HAVE_CRASHREPORTERCLIENT_H is defined not relying on it having a value.
llvm-svn: 288963
 | 
| | 
| 
| 
| 
| 
|  | 
This patch adds support for round-tripping DWARF debug abbreviations through the obj<->yaml tools.
llvm-svn: 288955
 | 
| | 
| 
| 
| 
| 
|  | 
SMAX/SMIN/UMAX/UMIN opcodes
llvm-svn: 288926
 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
|  | 
This is now performed more generally by the target shuffle combine code.
Already covered by tests that were originally added in D7666/rL229480 to support combineVectorZext (or VectorZextCombine as it was known then....).
Differential Revision: https://reviews.llvm.org/D27510
llvm-svn: 288918
 | 
| | 
| 
| 
|  | 
llvm-svn: 288916
 | 
| | 
| 
| 
| 
| 
|  | 
EXTRACT_VECTOR_ELT does support demanded elts if the element index is known and in range.
llvm-svn: 288913
 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
|  | 
This patch attempts to scalarize the operand expressions of predicated
instructions if they were conditionally executed in the original loop. After
scalarization, the expressions will be sunk inside the blocks created for the
predicated instructions. The transformation essentially performs
un-if-conversion on the operands.
The cost model has been updated to determine if scalarization is profitable. It
compares the cost of a vectorized instruction, assuming it will be
if-converted, to the cost of the scalarized instruction, assuming that the
instructions corresponding to each vector lane will be sunk inside a predicated
block, possibly avoiding execution. If it's more profitable to scalarize the
entire expression tree feeding the predicated instruction, the expression will
be scalarized; otherwise, it will be vectorized. We only consider the cost of
the entire expression to accurately estimate the cost of the required
insertelement and extractelement instructions.
Differential Revision: https://reviews.llvm.org/D26083
llvm-svn: 288909
 | 
| | 
| 
| 
|  | 
llvm-svn: 288907
 | 
| | 
| 
| 
|  | 
llvm-svn: 288905
 | 
| | 
| 
| 
| 
| 
|  | 
Also avoid allocating ~3x as much memory as needed.
llvm-svn: 288904
 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
|  | 
of the dominating load.
In the case of a fully redundant load LI dominated by an equivalent load V, GVN
should always preserve the original debug location of V. Otherwise, we risk to
introduce an incorrect stepping.
If V has debug info, then clearly it should not be modified. If V has a null
debugloc, then it is still potentially incorrect to propagate LI's debugloc
because LI may not post-dominate V.
Differential Revision: https://reviews.llvm.org/D27468
llvm-svn: 288903
 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
|  | 
integer domain
We are being inconsistent with these instructions (and all their variants.....) with a random mix of them using the default float domain.
Differential Revision: https://reviews.llvm.org/D27419
llvm-svn: 288902
 | 
| | 
| 
| 
| 
| 
|  | 
by David in D27462. NFC
llvm-svn: 288901
 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
|  | 
Preparation work for implementing N32 support.
Patch By: Daniel Sanders
Reviewers: vkalintiris, atanasyan
Differential Revision: https://reviews.llvm.org/D27460
llvm-svn: 288900
 | 
| | 
| 
| 
| 
| 
| 
| 
|  | 
The non-constant pool version of DecodeVPERMIL2PMask was not offsetting correctly for the second input. I've updated the code to match the implementation in the constant-pool version.
Annoyingly this bug was hidden for so long as it's tricky to combine to useful variable shuffle masks that don't become constant-pool entries.
llvm-svn: 288898
 | 
| | 
| 
| 
| 
| 
|  | 
Fixes PR 31256
llvm-svn: 288897
 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
|  | 
instructions inlined from functions with debug info.
When a function F is inlined, InlineFunction extends the debug location of every
instruction inlined from F by adding an InlinedAt.
However, if an instruction has a 'null' debug location, InlineFunction would
propagate the callsite debug location to it. This behavior existed since
revision 210459.
Revision 210459 was originally committed specifically to workaround the lack of
debug information for instructions inlined from intrinsic functions (which are
usually declared with attributes `__always_inline__, __nodebug__`).
The problem with revision 210459 is that it doesn't make any sort of distinction
between instructions inlined from a 'nodebug' function and instructions which
are inlined from a function built with debug info. This issue may lead to
incorrect stepping in the debugger.
This patch works under the assumption that a nodebug function does not have a
DISubprogram. When a function F is inlined into another function G,
InlineFunction checks if F has debug info associated with it.
For nodebug functions, the InlineFunction logic is unchanged (i.e. it would
still propagate the callsite debugloc to the inlined instructions). Otherwise,
InlineFunction no longer propagates the callsite debug location.
Differential Revision: https://reviews.llvm.org/D27462
llvm-svn: 288895
 | 
| | 
| 
| 
| 
| 
|  | 
I believe this is the cause of the failure, but have not been able to confirm.  Note that this is a speculative fix; I'm still waiting for a full build to finish as I synced and ended up doing a clean build which takes 20+ minutes on my machine.
llvm-svn: 288886
 | 
| | 
| 
| 
|  | 
llvm-svn: 288884
 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
|  | 
Patch By: Wei Ding
Summary: This patch fixes the fdiv precision issues.
Reviewers: b-sumner, cfang, wdng, arsenm
Subscribers: kzhuravl, nhaehnle, yaxunl, tony-tye
Differential Revision: https://reviews.llvm.org/D26424
llvm-svn: 288879
 | 
| | 
| 
| 
| 
| 
| 
| 
|  | 
In the addressing mode, signed 9-bit imm is [-256, 255], not [-512, 511].
Differential Revision: https://reviews.llvm.org/D27480
llvm-svn: 288876
 | 
| | 
| 
| 
| 
| 
|  | 
The existing unittests actually cover this now that we verify things.
llvm-svn: 288875
 | 
| | 
| 
| 
|  | 
llvm-svn: 288874
 | 
| | 
| 
| 
| 
| 
|  | 
Remove the unused return type, use early return, use assignment operator.
llvm-svn: 288873
 | 
| | 
| 
| 
| 
| 
|  | 
It doesn't matter why something is overdefined if it is...
llvm-svn: 288871
 | 
| | 
| 
| 
|  | 
llvm-svn: 288867
 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
|  | 
Reviewers: arsenm, nhaehnle
Subscribers: kzhuravl, wdng, yaxunl, tony-tye, llvm-commits
Differential Revision: https://reviews.llvm.org/D26725
llvm-svn: 288865
 | 
| | 
| 
| 
|  | 
llvm-svn: 288861
 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
|  | 
Requesting metadata for a global is a relatively expensive operation as it
involves a map lookup, but it's one that we need to do relatively frequently in
this pass to collect the list of type metadata nodes associated with a global.
This change improves the performance of type metadata queries by prebuilding
data structures that keep the global together with its list of type metadata,
and changing the pass to use that data structure wherever we were previously
passing global references around.
This change also eliminates some O(N^2) behavior by collecting the list of
globals associated with each type identifier during the first pass over the
list of globals rather than visiting each global to compute that list every
time we add a new type identifier.
Reduces pass runtime on a module containing Chrome's vtables from over 60s
to 0.9s.
Differential Revision: https://reviews.llvm.org/D27484
llvm-svn: 288859
 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
|  | 
On some platforms (like MSP430) the second element of the result
structure for SMULO/UMULO may have a shorter type than the one
returned by SetCC. We need to truncate it to the right type, or
else some incorrect code may be generated later on.
This fixes issue https://github.com/rust-lang/rust/issues/37829
Patch by Vadzim Dambrouski!
Differential Revision: https://reviews.llvm.org/D27154
llvm-svn: 288857
 | 
| | 
| 
| 
| 
| 
|  | 
Other VOP instructions call the output vdst
llvm-svn: 288856
 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
|  | 
As Eli noted in the post-commit thread for r288833, the use of
swapOperands() may not be allowed in InstSimplify, so I'm 
removing those calls here pending further review. 
The swap mutates the icmp, and there doesn't appear to be precedent
for instruction mutation in InstSimplify.
I didn't actually have any tests for those cases, so I'm adding
a few here. 
llvm-svn: 288855
 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
|  | 
Reviewers: arsenm
Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, llvm-commits, tony-tye
Differential Revision: https://reviews.llvm.org/D27416
llvm-svn: 288852
 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
|  | 
BDCE has two phases:
1. It asks SimplifyDemandedBits if all the bits of an instruction are dead, and if so,
replaces all its uses with the constant zero.
2. Then, it asks SimplifyDemandedBits again if the instruction is really dead
(no side effects etc..) and if so, eliminates it.
Now, in 1) if all the bits of an instruction are dead, we may end up replacing a dbg use:
  %call = tail call i32 (...) @g() #4, !dbg !15
  tail call void @llvm.dbg.value(metadata i32 %call, i64 0, metadata !8, metadata !16), !dbg !17
->
  %call = tail call i32 (...) @g() #4, !dbg !15
  tail call void @llvm.dbg.value(metadata i32 0, i64 0, metadata !8, metadata !16), !dbg !17
but not eliminating the call because it may have arbitrary side effects.
In other words, we lose some debug informations.
This patch fixes the problem making sure that BDCE does nothing with the instruction if
it has side effects and no non-dbg uses.
Differential Revision:  https://reviews.llvm.org/D27471
llvm-svn: 288851
 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
|  | 
Summary:
If we write an immediate to a VGPR and then copy the VGPR to an
SGPR, we can replace the copy with a S_MOV_B32 sgpr, imm, rather than
moving the copy to the SALU.
Reviewers: arsenm
Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, llvm-commits, tony-tye
Differential Revision: https://reviews.llvm.org/D27272
llvm-svn: 288849
 | 
| | 
| 
| 
| 
| 
| 
|  | 
We were rounding size in bits down rather than up, leading to 0-sized slots for
i1 (assert!) and bugs for other types not byte-aligned.
llvm-svn: 288848
 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
|  | 
Summary:
Prefer expansions such as: pmullw,pmulhw,unpacklwd,unpackhwd over pmulld.
On Silvermont [source: Optimization Reference Manual]:
PMULLD has a throughput of 1/11 [instruction/cycles].
PMULHUW/PMULHW/PMULLW have a throughput of 1/2 [instruction/cycles].
Fixes pr31202.
Analysis of this issue was done by Fahana Aleen.
Reviewers: wmi, delena, mkuper
Subscribers: RKSimon, llvm-commits
Differential Revision: https://reviews.llvm.org/D27203
llvm-svn: 288844
 | 
| | 
| 
| 
| 
| 
| 
| 
|  | 
Handle the case where a sign extension has ended up being split into separate stages (typically to get around vector legal ops) and a zext + sext_in_reg gets inserted.
Differential Revision: https://reviews.llvm.org/D27461
llvm-svn: 288842
 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
|  | 
All of these (and a few more) are already handled by InstCombine,
but we shouldn't have to wait until then to simplify these because
they're cheap to deal with here in InstSimplify.
This is the 'and' sibling of the earlier 'or' patch:
https://reviews.llvm.org/rL288833
llvm-svn: 288841
 | 
| | 
| 
| 
|  | 
llvm-svn: 288840
 | 
| | 
| 
| 
| 
| 
|  | 
we don't actually demand that element
llvm-svn: 288839
 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
|  | 
There were two problems:
  + AArch64 was reusing random data from its binary op tables, which is
    complete nonsense for G_SEQUENCE.
  + Even when AArch64 gave up and said it couldn't handle G_SEQUENCE,
    the generic code asserted.
llvm-svn: 288836
 | 
| | 
| 
| 
|  | 
llvm-svn: 288835
 | 
| | 
| 
| 
| 
| 
| 
| 
|  | 
It'll almost immediately fail because it always tries to half/double the size
until it finds a legal one. Unfortunately, this triggers an assertion
preventing the DAG fallback from being possible.
llvm-svn: 288834
 | 
| | 
| 
| 
| 
| 
| 
| 
|  | 
All of these (and a few more) are already handled by InstCombine,
but we shouldn't have to wait until then to simplify these because
they're cheap to deal with here in InstSimplify.
llvm-svn: 288833
 | 
| | 
| 
| 
|  | 
llvm-svn: 288814
 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
|  | 
Summary:
This is NFC but prevents assertions when PartialMappingIdx is tablegen-erated.
The assumptions were:
1) FirstGPR is 0
2) FirstGPR is the first of the First* enumerators.
GPR32 is changed to 1 to demonstrate that assumption #1 is fixed. #2 will
be covered by a subsequent patch that tablegen-erates information and swaps
the order of GPR and FPR as a side effect.
Depends on D27336
Reviewers: ab, t.p.northover, qcolombet
Subscribers: aemerson, rengolin, vkalintiris, dberris, rovka, llvm-commits
Differential Revision: https://reviews.llvm.org/D27337
llvm-svn: 288812
 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
|  | 
in ValMappings. NFC
Reviewers: ab, t.p.northover, qcolombet
Subscribers: aemerson, rengolin, vkalintiris, dberris, llvm-commits, rovka
Differential Revision: https://reviews.llvm.org/D27336
llvm-svn: 288810
 | 
| | 
| 
| 
|  | 
llvm-svn: 288809
 |