| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
|
|
|
|
| |
Many of these uses can get by with forward declarations. Hopefully this
speeds up compilation after adding a single intrinsic.
llvm-svn: 312759
|
| |
|
|
|
|
|
|
|
|
| |
r312318 - Debug info for variables whose type is shrinked to bool
r312325, r312424, r312489 - Test case for r312318
Revision 312318 introduced a null dereference bug.
Details in https://bugs.llvm.org/show_bug.cgi?id=34490
llvm-svn: 312758
|
| |
|
|
|
|
| |
NFC
llvm-svn: 312754
|
| |
|
|
|
|
|
|
| |
It's meaningless and takes up extra space in the line table.
Differential Revision: https://reviews.llvm.org/D37364
llvm-svn: 312751
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
Right now Symbols must be either undefined or defined in a specific
section. Some symbols have section indexes like SHN_ABS however. This
change adds support for outputting symbols that have such section
indexes.
Patch by Jake Ehrlich
Differential Revision: https://reviews.llvm.org/D37391
llvm-svn: 312745
|
| |
|
|
|
|
|
|
|
|
| |
It is possible for two modules to have the same name if they are
archive members with the same name, or if we are doing LTO (in which
case all modules will have the name "lto.tmp").
Differential Revision: https://reviews.llvm.org/D37589
llvm-svn: 312744
|
| |
|
|
| |
llvm-svn: 312740
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
For now CUDA-9 is not included in the list of CUDA versions clang
searches for, so the path to CUDA-9 must be explicitly passed
via --cuda-path=.
On LLVM side NVPTX added sm_70 GPU type which bumps required
PTX version to 6.0, but otherwise is equivalent to sm_62 at the moment.
Differential Revision: https://reviews.llvm.org/D37576
llvm-svn: 312734
|
| |
|
|
| |
llvm-svn: 312732
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fixes some combine issues for AMDGPU where we weren't
getting the many extract_vector_elt combines expected
in a future patch.
This should really be checking isOperationLegalOrCustom on
the extract. That improves a number of x86 lit tests, but
a few get stuck in an infinite loop from one place
where a similar looking extract is created. I have a
different workaround in the backend for that which
keeps many of those improvements, but also adds a few
regressions.
llvm-svn: 312730
|
| |
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D36862
llvm-svn: 312729
|
| |
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D37397
llvm-svn: 312725
|
| |
|
|
|
|
|
|
| |
These don't add any value as they're just compositions of existing
patterns. However, they can confuse the cost logic in ISel, leading to
duplicated vcvt instructions like in PR33199.
llvm-svn: 312724
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
(VF{8|16|32} stride 3).
This patch expands the support of lowerInterleavedload to {8|16|32}x8i stride 3.
LLVM creates suboptimal shuffle code-gen for AVX2. In overall, this patch is a specific fix for the pattern (Strid=3 VF={8|16|32}) and we plan to include the store (deinterleved side).
The patch goal is to optimize the following sequence:
a0 b0 c0 a1 b1 c1 a2 b2
c2 a3 b3 c3 a4 b4 c4 a5
b5 c5 a6 b6 c6 a7 b7 c7
into
a0 a1 a2 a3 a4 a5 a6 a7
b0 b1 b2 b3 b4 b5 b6 b7
c0 c1 c2 c3 c4 c5 c6 c7
Reviewers
1. zvi
2. igor
3. guyblank
4. dorit
5. Ayal
llvm-svn: 312722
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
This change converts the `MipsAsmBackend` constructor to the "standard"
form. It makes possible to use `RegisterMCAsmBackend` for the backends
registrations. Now we pass `Triple` instance to the `MipsAsmBackend`
ctor and deduce all required options like endianness and bitness from
the triple. We still need to implement explicit ABI checking for
providing correct options to backends.
Differential revision: https://reviews.llvm.org/D37519
llvm-svn: 312720
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
For large basic blocks with lots of combinable instructions, the
MachineTraceMetrics computations in MachineCombiner can dominate the compile
time, as computing the trace information is quadratic in the number of
instructions in a BB and it's relevant successors/predecessors.
In most cases, knowing the instruction depth should be enough to make
combination decisions. As we already iterate over all instructions in a basic
block, the instruction depth can be computed incrementally. This reduces the
cost of machine-combine drastically in cases where lots of instructions
are combined. The major drawback is that AFAIK, computing the critical path
length cannot be done incrementally. Therefore we only compute
instruction depths incrementally, for basic blocks with more
instructions than inc_threshold. The -machine-combiner-inc-threshold
option can be used to set the threshold and allows for easier
experimenting and checking if using incremental updates for all basic
blocks has any impact on the performance.
Reviewers: sanjoy, Gerolf, MatzeB, efriedma, fhahn
Reviewed By: fhahn
Subscribers: kiranchandramohan, javed.absar, efriedma, llvm-commits
Differential Revision: https://reviews.llvm.org/D36619
llvm-svn: 312719
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This function is used in D36619 to update the instruction depths
incrementally.
Reviewers: efriedma, Gerolf, MatzeB, fhahn
Reviewed By: fhahn
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D36696
llvm-svn: 312714
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
The ARM, BPF, MSP430, Sparc and Mips backends all use a similar code sequence
for lowering SelectCC. As pointed out by @reames in D29937, this code isn't
particularly clear and in most of these backends doesn't actually match the
comments. This patch makes the code sequence clearer for the Sparc backend
through better variable naming and more accurate comments (e.g. we are
inserting triangle control flow, _not_ diamond). There is no functional
change.
Differential Revision: https://reviews.llvm.org/D37194
llvm-svn: 312713
|
| |
|
|
|
|
|
|
|
|
| |
removing them"
This temporarily reverts commit 463fa38 (r311401).
See https://bugs.llvm.org/show_bug.cgi?id=34502
llvm-svn: 312708
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Add patterns for
fptoui <16 x float> to <16 x i8>
fptoui <16 x float> to <16 x i16>
Reviewers: igorb, delena, craig.topper
Reviewed By: craig.topper
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D37505
llvm-svn: 312704
|
| |
|
|
|
|
|
|
|
|
| |
element types so we can remove some patterns from isel.
Intrinsic handling is still creating these nodes with 32-bit elements as well. But at least this gets rid of 8 and 16.
Ideally, someday we'll convert the intrinsics to generic vector shuffles and remove the intrinsics.
llvm-svn: 312702
|
| |
|
|
|
|
|
| |
Keeping non-i16 extloads makes it easier to match some new
gfx9 load instructions.
llvm-svn: 312699
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The current code that handles personality functions when creating a
module summary does not correctly handle the case where a function's
personality function operand refers to the function indirectly
(e.g. via a bitcast). This patch handles such cases by treating
personality function references like any other reference, i.e. by
adding them to the function's reference list. This has the minor side
benefit of allowing personality functions to participate in early
dead stripping.
We do this by calling findRefEdges on the function itself. This way
we also end up handling other function operands (specifically prefix
data and prologue data) for free.
Differential Revision: https://reviews.llvm.org/D37553
llvm-svn: 312698
|
| |
|
|
|
|
|
|
| |
X86ISD::MOVSD.
I don't think we ever generate these. If we did, I would expect we would also be able to generate v16f32 and v8f64, but we don't have those patterns.
llvm-svn: 312694
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
Globals that are promoted to an ARM constant pool may alias with another
existing constant pool entry. We need to keep a reference to all globals
that were promoted to each constant pool value so that we can emit a
distinct label for each promoted global. These labels are necessary so
that debug info can refer to the promoted global without an undefined
reference during linking.
Patch by Stephen Crane!
llvm-svn: 312692
|
| |
|
|
|
|
|
|
| |
llvm::Error when creating an irsymtab.
This fixes bitcode emission for modules containing invalid weak externals.
llvm-svn: 312686
|
| |
|
|
| |
llvm-svn: 312685
|
| |
|
|
|
|
|
| |
I empirically verified that open files can in fact be renamed on
Windows with sys::fs::rename, so remove the incorrect code and comment.
llvm-svn: 312683
|
| |
|
|
|
|
| |
other minor fixes (NFC).
llvm-svn: 312679
|
| |
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D37325
llvm-svn: 312676
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
object files
This change only treats imported and exports functions and globals
as symbol table entries the object has a "linking" section (i.e. it is
relocatable object file).
In this case all globals must be of type I32 and initialized with
i32.const. This was previously being assumed but not checked for and
was causing a failure on big endian machines due to using the wrong
value of then union.
See: https://bugs.llvm.org/show_bug.cgi?id=34487
Differential Revision: https://reviews.llvm.org/D37497
llvm-svn: 312674
|
| |
|
|
|
|
|
|
| |
function.
No functional change intended.
llvm-svn: 312673
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Tail merging can convert an undef use into a normal one when creating a
common tail. Doing so can make the register live out from a block which
previously contained the undef use. To keep the liveness up-to-date,
insert IMPLICIT_DEFs in such blocks when necessary.
To enable this patch the computeLiveIns() function which used to
compute live-ins for a block and set them immediately is split into new
functions:
- computeLiveIns() just computes the live-ins in a LivePhysRegs set.
- addLiveIns() applies the live-ins to a block live-in list.
- computeAndAddLiveIns() is a convenience function combining the other
two functions and behaving like computeLiveIns() before this patch.
Based on a patch by Krzysztof Parzyszek <kparzysz@codeaurora.org>
Differential Revision: https://reviews.llvm.org/D37034
llvm-svn: 312668
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Consider this type of a loop:
for (...) {
...
if (...) continue;
...
}
Normally, the "continue" would branch to the loop control code that
checks whether the loop should continue iterating and which contains
the (often) unique loop latch branch. In certain cases jump threading
can "thread" the inner branch directly to the loop header, creating
a second loop latch. Loop canonicalization would then transform this
loop into a loop nest. The problem with this is that in such a loop
nest neither loop is countable even if the original loop was. This
may inhibit subsequent loop optimizations and be detrimental to
performance.
Differential Revision: https://reviews.llvm.org/D36404
llvm-svn: 312664
|
| |
|
|
|
|
| |
This moves more of our subvector insert/extract tricks to X86InstrVecCompiler.td and refactors them into multiclasses.
llvm-svn: 312661
|
| |
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D37522
llvm-svn: 312660
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When if-converting a diamond, two separate blocks will be placed back
to back to form a straight line code. To ensure correctness of the
liveness information, any registers that are live in the second block
should not be killed in the first block, even if they were in the
original code.
Additionally, when the two blocks share common instructions at the
beginning, these instructions will not be duplicated, but only placed
once, before both of the blocks. Since the function "isIdenticalTo"
(as used here) ignores kill flags, the common initial code in one
block may have a kill flag for a register that is live in the other
block.
Because the code that removes kill flags only runs for the non-common
parts of the predicated blocks, a kill flag mismatch in the common
code could still lead to a live register being killed prematurely.
llvm-svn: 312654
|
| |
|
|
| |
llvm-svn: 312650
|
| |
|
|
|
|
|
|
|
|
|
|
| |
patterns from SSE and AVX512
This patch moves some of similar non-instruction patterns from X86InstrSSE.td and X86InstrAVX512.td to a common file.
This is intended as a starting point. There are many other optimization patterns that exist in both files that we could move here.
Differential Revision: https://reviews.llvm.org/D37455
llvm-svn: 312649
|
| |
|
|
|
|
|
|
|
| |
Remove code that assumed that a nullptr of address space != 0 couldnt alias with a non-null pointer. This is incorrect, since nothing can be concluded about a null pointer in an address space != 0.
This code was written before address spaces were introduced
Differential Revision: https://reviews.llvm.org/D37518
llvm-svn: 312648
|
| |
|
|
|
|
| |
No functional changes intended.
llvm-svn: 312646
|
| |
|
|
| |
llvm-svn: 312644
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
function return the intrinsics's first argument.
llvm.memcpy/memset/memmove return void but they will return the first
argument after they are expanded as libcalls. Now if the parent function
has any return value, llvm.memcpy cannot be turned into tail call after
expansion.
The patch is to handle that case in SelectionDAGBuilder so when caller
function return the same value as the first argument of llvm.memcpy,
tail call is allowed.
Differential Revision: https://reviews.llvm.org/D37406
llvm-svn: 312641
|
| |
|
|
|
|
|
|
| |
Flat loads do not have vdata operand but have vdst instead.
Differential Revision: https://reviews.llvm.org/D37502
llvm-svn: 312640
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Mesa still uses a hack where empty inline assembly is used as a kind of
optimization barrier. This exposed a problem where not enough wait states
were inserted, because the hazard recognizer implicitly assumed that each
inline assembly "instruction" has at least one wait state.
Reviewers: arsenm
Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, llvm-commits, t-tye
Differential Revision: https://reviews.llvm.org/D37205
llvm-svn: 312635
|
| |
|
|
|
|
|
|
|
|
| |
As noted in PR34080, a lot of x87 instructions alter the FPSW status register (or leave it in an undefined state) but aren't tagged as such in the tablegen.
This patch tags the control word, stack, wait and math instructions as altering FPSW, which matches what the AMD APMs suggests happens.
Differential Revision: https://reviews.llvm.org/D36414
llvm-svn: 312629
|
| |
|
|
| |
llvm-svn: 312624
|
| |
|
|
|
|
| |
we don't create a BUILD_VECTOR with an illegal type after type legalization.
llvm-svn: 312621
|
| |
|
|
|
|
|
|
|
|
|
| |
performing a zext of a register.
On the PR there is discussion of how to more effectively handle this,
but this patch prevents us from miscompiling code.
Differential Revision: https://reviews.llvm.org/D37504
llvm-svn: 312620
|
| |
|
|
|
|
| |
This matches what we already do for AVX512. The peephole pass makes up for this in most if not all cases. But this makes isel behavior for these consistent with every other instruction.
llvm-svn: 312613
|