| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
|
| |
llvm-svn: 314610
|
| |
|
|
| |
llvm-svn: 314609
|
| |
|
|
|
|
| |
inlineCallInstruction. (NFC)
llvm-svn: 314601
|
| |
|
|
|
|
| |
Remove sign extend in register style pattern if the sign is already extended enough
llvm-svn: 314599
|
| |
|
|
| |
llvm-svn: 314598
|
| |
|
|
|
| |
Change-Id: I7831c9febad8e14278a5bc87584a0053dc837be1
llvm-svn: 314596
|
| |
|
|
|
|
|
| |
Causes a segfault on a builtbot (and in our internal bootstrapping of
Clang). See Eli's response on the commit thread.
llvm-svn: 314589
|
| |
|
|
| |
llvm-svn: 314585
|
| |
|
|
|
|
|
|
| |
Implemented by splitting into two v32i8 mulhu/mulhs and concatenating the results.
Differential Revision: https://reviews.llvm.org/D38307
llvm-svn: 314584
|
| |
|
|
| |
llvm-svn: 314579
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We have a single library build without relaxation options.
When inlined library functions remove fast math attributes
from the functions they are integrated into.
This patch sets relaxation attributes on the functions after
linking provided corresponding relaxation options are given.
Math instructions inside the inlined functions remain to have
no fast flags, but inlining does not prevent fast math
transformations of a surrounding caller code anymore.
Differential Revision: https://reviews.llvm.org/D38325
llvm-svn: 314568
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Currently expandUnalignedLoad/Store uses place holder pointer info for temporary memory operand
in stack, which does not have correct address space. This causes unaligned private double16 load/store to be
lowered to flat_load instead of buffer_load for amdgcn target.
This fixes failures of OpenCL conformance test basic/vload_private/vstore_private on target amdgcn---amdgizcl.
Differential Revision: https://reviews.llvm.org/D35361
llvm-svn: 314566
|
| |
|
|
| |
llvm-svn: 314563
|
| |
|
|
|
|
|
|
|
|
| |
This patch will eliminate redundant intptr/ptrtoint that pessimizes
analyses such as SCEV, AA and will make optimization passes such
as auto-vectorization more powerful.
Differential revision: http://reviews.llvm.org/D37832
llvm-svn: 314561
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This reverts commit r314517.
This commit crashes sanitizer bots, for example:
http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux/builds/4167
Stack snippet:
...
/mnt/b/sanitizer-buildbot1/sanitizer-x86_64-linux/build/llvm/include/llvm/Support/Casting.h:255:0
llvm::TargetTransformInfoImplCRTPBase<llvm::X86TTIImpl>::getGEPCost(llvm::GEPOperator const*, llvm::ArrayRef<llvm::Value const*>)
/mnt/b/sanitizer-buildbot1/sanitizer-x86_64-linux/build/llvm/include/llvm/Analysis/TargetTransformInfoImpl.h:742:0
llvm::TargetTransformInfoImplCRTPBase<llvm::X86TTIImpl>::getUserCost(llvm::User const*, llvm::ArrayRef<llvm::Value const*>)
/mnt/b/sanitizer-buildbot1/sanitizer-x86_64-linux/build/llvm/include/llvm/Analysis/TargetTransformInfoImpl.h:782:0
/mnt/b/sanitizer-buildbot1/sanitizer-x86_64-linux/build/llvm/lib/Analysis/TargetTransformInfo.cpp:116:0
/mnt/b/sanitizer-buildbot1/sanitizer-x86_64-linux/build/llvm/include/llvm/ADT/SmallVector.h:116:0
/mnt/b/sanitizer-buildbot1/sanitizer-x86_64-linux/build/llvm/include/llvm/ADT/SmallVector.h:343:0
/mnt/b/sanitizer-buildbot1/sanitizer-x86_64-linux/build/llvm/include/llvm/ADT/SmallVector.h:864:0
/mnt/b/sanitizer-buildbot1/sanitizer-x86_64-linux/build/llvm/include/llvm/Analysis/TargetTransformInfo.h:285:0
...
llvm-svn: 314560
|
| |
|
|
|
|
| |
warnings; other minor fixes (NFC).
llvm-svn: 314559
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
When type shrinking reductions, we should insert the truncations and extends at
the end of the loop latch block. Previously, these instructions were inserted
at the end of the loop header block. The difference is only a problem for loops
with predicated instructions (e.g., conditional stores and instructions that
may divide by zero). For these instructions, we create new basic blocks inside
the vectorized loop, which cause the loop header and latch to no longer be the
same block. This should fix PR34687.
Reference: https://bugs.llvm.org/show_bug.cgi?id=34687
llvm-svn: 314542
|
| |
|
|
|
|
|
|
|
| |
Also, add a flags field as we will almost certainly
be needing that soon too.
Differential Revision: https://reviews.llvm.org/D38296
llvm-svn: 314534
|
| |
|
|
|
|
|
|
|
|
| |
ConstantPointerNull
The type of a SCEVConstant may not match the corresponding LLVM Value.
In this case, we skip the constant folding for now.
TODO: Replace ConstantInt Zero by ConstantPointerNull
llvm-svn: 314531
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
This implement the insertion operator for DWARF address ranges so they
are consistently printed as [LowPC, HighPC).
While a dump method might have felt more consistent, it is used
exclusively for printing error messages in the verifier and never used
for actual dumping. Hence this approach is more intuitive and creates
less clutter at the call sites.
Differential revision: https://reviews.llvm.org/D38395
llvm-svn: 314523
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The hardware will only forward EXEC_LO; the high 32 bits will be zero.
Additionally, inline constants do not work. At least,
v_addc_u32_e64 v0, vcc, v0, v1, -1
which could conceivably be used to combine (v0 + v1 + 1) into a single
instruction, acts as if all carry-in bits are zero.
The llvm.amdgcn.ps.live test is adjusted; it would be nice to combine
s_mov_b64 s[0:1], exec
v_cndmask_b32_e64 v0, v1, v2, s[0:1]
into
v_mov_b32 v0, v3
but it's not particularly high priority.
Fixes dEQP-GLES31.functional.shaders.helper_invocation.value.*
llvm-svn: 314522
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Currently, getGEPCost() returns TCC_FREE whenever a GEP is a legal addressing mode in the target.
However, since it doesn't check its actual users, it will return FREE even in cases
where the GEP cannot be folded away as a part of actual addressing mode.
For example, if an user of the GEP is a call instruction taking the GEP as a parameter,
then the GEP may not be folded in isel.
Reviewers: hfinkel, efriedma, mcrosier, jingyue, haicheng
Reviewed By: hfinkel
Subscribers: javed.absar, llvm-commits
Differential Revision: https://reviews.llvm.org/D38085
llvm-svn: 314517
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Implement shouldCoalesce() to help regalloc avoid running out of GR128
registers.
If a COPY involving a subreg of a GR128 is coalesced, the live range of the
GR128 virtual register will be extended. If this happens where there are
enough phys-reg clobbers present, regalloc will run out of registers (if
there is not a single GR128 allocatable register available).
This patch tries to allow coalescing only when it can prove that this will be
safe by checking the (local) interval in question.
Review: Ulrich Weigand, Quentin Colombet
https://reviews.llvm.org/D37899
https://bugs.llvm.org/show_bug.cgi?id=34610
llvm-svn: 314516
|
| |
|
|
|
|
|
|
| |
Adds a new combine for: xor(setcc cc, val), 1 --> setcc (invert(cc), val)
Differential Revision: https://reviews.llvm.org/D38161
llvm-svn: 314514
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
New instructions are added to AArch32 and AArch64 to aid
floating-point multiplication and addition of complex numbers, where
the complex numbers are packed in a vector register as a pair of
elements. The Imaginary part of the number is placed in the more
significant element, and the Real part of the number is placed in the
less significant element.
This patch adds assembler for the ARM target.
Differential Revision: https://reviews.llvm.org/D36789
llvm-svn: 314511
|
| |
|
|
|
| |
Change-Id: I360abccee12cae29bd2ac4f8399c9ecc92eb7f13
llvm-svn: 314510
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
Fix nested callseq* nodes by moving callseq_start after the
arguments calculation to temporary registers, so that callseq* nodes
in resulting DAG are linear.
Recommitting r314497. This version does not contain test which fails
when compiler is not build in debug mode.
Differential Revision: https://reviews.llvm.org/D37328
llvm-svn: 314507
|
| |
|
|
|
|
|
|
|
| |
Added test relies on the compiler being built in debug mode,
which may not be the case.
This reverts commit r314497.
llvm-svn: 314506
|
| |
|
|
|
|
|
|
| |
Add missing license information to MicroMipsInstrFPU.td and
fix most of the formatting errors present. Others will be
addressed in a follow up commits.
llvm-svn: 314505
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This commit adds comments on how the AMDPAL OS type overloads the
existing AMDGPU_ calling conventions used by Mesa, and adds a couple of
new ones.
Reviewers: arsenm, nhaehnle, dstuttard
Subscribers: mehdi_amini, kzhuravl, wdng, yaxunl, t-tye, llvm-commits
Differential Revision: https://reviews.llvm.org/D37752
llvm-svn: 314502
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Added support for scratch (including spilling) for OS type amdpal:
generates code to set up the scratch descriptor if it is needed.
With amdpal, the scratch resource descriptor is loaded from offset 0 of
the global information table. The low 32 bits of the address of the
global information table is passed in s0.
Added amdgpu-git-ptr-high function attribute to hard-wire the high 32
bits of the address of the global information table. If the function
attribute is not specified, or is 0xffffffff, then the backend generates
code to use the high 32 bits of pc.
The documentation for the AMDPAL ABI will be added in a later commit.
Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, t-tye
Differential Revision: https://reviews.llvm.org/D37483
llvm-svn: 314501
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This operating system type represents the AMDGPU PAL runtime, and will
be required by the AMDGPU backend in order to generate correct code for
this runtime.
Currently it generates the same code as not specifying an OS at all.
That will change in future commits.
Patch from Tim Corringham.
Subscribers: arsenm, nhaehnle
Differential Revision: https://reviews.llvm.org/D37380
llvm-svn: 314500
|
| |
|
|
|
|
|
|
|
|
|
| |
This patch introduces 3 helper functions: error(), warn() and note() to
make printing during verification more consistent. When supported, the
respective prefixes are printed in color using the same color scheme as
clang.
Differential revision: https://reviews.llvm.org/D38368
llvm-svn: 314498
|
| |
|
|
|
|
|
|
|
|
| |
Fix nested callseq* nodes by moving callseq_start after the
arguments calculation to temporary registers, so that callseq* nodes
in resulting DAG are linear.
Differential Revision: https://reviews.llvm.org/D37328
llvm-svn: 314497
|
| |
|
|
|
|
|
|
|
|
|
| |
immediate expressions
Allow the proper recognition of Enum values and global variables inside ms inline-asm memory / immediate expressions, as they require some additional overhead and treated incorrect if doesn't early recognized.
supersedes D33278, D35774
Differential Revision: https://reviews.llvm.org/D37412
llvm-svn: 314493
|
| |
|
|
|
|
|
| |
This reverts commit r314253. It causes a miscompile on P100 in an internal
benchmark. Reverting while I investigate.
llvm-svn: 314482
|
| |
|
|
| |
llvm-svn: 314481
|
| |
|
|
| |
llvm-svn: 314479
|
| |
|
|
|
|
|
|
|
| |
This commit yanks out the repeated sections of code in pruneCandidates into
two lambdas: ShouldSkipCandidate and Prune. This simplifies the logic in
pruneCandidates significantly, and reduces the chance of introducing bugs by
folding all of the shared logic into one place.
llvm-svn: 314475
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
X86ISelDAGToDAG tries to analyze ANDs compared with 0 to optimize to narrower immediates using subregisters.
I don't think we should be optimizing to 16-bit test instructions. It goes against our normal behavior of promoting i16 operations to i32. It only saves one byte due to the need to add a 0x66 prefix. I think it would also be subject to a length changing prefix penalty in the decoders on Intel CPUs.
Reviewers: RKSimon, zvi, spatel
Reviewed By: spatel
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D38273
llvm-svn: 314474
|
| |
|
|
|
|
|
|
|
|
| |
LR is an untypical callee saved register in that it is restored into a
different register (PC) and thus does not live-out of the return block.
This case requires the `Restored` flag in CalleeSavedInfo to be cleared.
This fixes a number of cases where this wasn't handled correctly yet.
llvm-svn: 314471
|
| |
|
|
|
| |
Signed-off-by: Yonghong Song <yhs@fb.com>
llvm-svn: 314469
|
| |
|
|
|
|
| |
other minor fixes (NFC).
llvm-svn: 314467
|
| |
|
|
|
|
|
|
|
|
|
| |
The expensive-checks build bot found a problem with the r314428 commit:
if CC is live after a ATOMIC_CMP_SWAPW instruction, it needs to be
marked as live-in to the block after the loop the pseudo gets expanded
to. This actually fixes a code-gen bug as well, since if the CC isn't
live, the CR and JLH are merged to a CRJLH which doesn't actually set
the condition code any more.
llvm-svn: 314465
|
| |
|
|
|
|
|
|
| |
If we have BWI, we can truncate in a much simpler way by using vpmovwb. This even works without VLX by using the wider zmm->ymm truncate with a subvector extract.
Differential Revision: https://reviews.llvm.org/D38375
llvm-svn: 314457
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In setupEntryBlockAndCallSites in CodeGen/SjLjEHPrepare.cpp,
we fetch and store the actual frame pointer, but on return via
the longjmp intrinsic, it always was restored into the r7 variable.
On windows, the frame pointer should be restored into r11 instead of r7.
On Darwin (where sjlj exception handling is used by default), the frame
pointer is always r7, both in arm and thumb mode, and likewise, on
windows, the frame pointer always is r11.
On linux however, if sjlj exception handling is enabled (which it isn't
by default), libcxxabi and the user code can be built in differing modes
using different registers as frame pointer. Therefore, when restoring
registers on a platform where we don't always use the same register
depending on code mode, restore both r7 and r11.
Differential Revision: https://reviews.llvm.org/D38253
llvm-svn: 314451
|
| |
|
|
|
|
|
|
| |
it isn't default
Differential Revision: https://reviews.llvm.org/D38252
llvm-svn: 314450
|
| |
|
|
| |
llvm-svn: 314449
|
| |
|
|
|
|
|
|
| |
LowerMULH
We aren't do any in register extends here so we should be able to just the target independent nodes directly and allow them to be lowered as necessary.
llvm-svn: 314447
|
| |
|
|
|
|
| |
same if. Remove one that is redundant with another subtarget features.
llvm-svn: 314446
|