| Commit message | Author | Age | Files | Lines |
... | |
|
llvm-svn: 275871
|
llvm-svn: 275369
|
Due to visit order problems, in the case of an unaligned copy
the legalized DAG fails to eliminate extra instructions introduced
by the expansion of both unaligned parts.
llvm-svn: 274397
|
Previously there was a combine that handled only the simple copy case.
Split this into handling loads and stores separately.
We might want to change how this handles some of the vector
extloads, since this can result in large code size increases.
llvm-svn: 274394
|
Split AMDGPUSubtarget into amdgcn/r600 specific subclasses.
This removes most of the static_casting of the basic codegen
classes everywhere, and tries to restrict the features
visible on the wrong target.
llvm-svn: 273652
|
The main sin this was committing was using terminator
instructions in the middle of the block, and then
not updating the block successors / predecessors.
Split the blocks up to avoid this and introduce new
pseudo instructions for branches taken with exec masking.
Also use a pseudo instead of emitting s_endpgm and erasing
it in the special case of a non-void return.
llvm-svn: 273467
|
Points to the start of implicit arguments (appended after explicit arguments)
Differential Revision: http://reviews.llvm.org/D20297
llvm-svn: 273317
|
Summary:
We now use a standard fixup type for applying the pc-relative address of
constant address space variables, and we have the GlobalAddress lowering
code add the required 4 byte offset to the global address rather than
doing it as part of the fixup.
This refactoring will make it easier to use the same code for global
address space variables and also simplifies the code.
Re-commit this after fixing a bug where we were trying to use a
reference to a Triple object that had already been destroyed.
Reviewers: arsenm, kzhuravl
Subscribers: arsenm, kzhuravl, llvm-commits
Differential Revision: http://reviews.llvm.org/D21154
llvm-svn: 272705
|
This reverts commit r272675.
llvm-svn: 272677
|
Summary:
We now use a standard fixup type for applying the pc-relative address of
constant address space variables, and we have the GlobalAddress lowering
code add the required 4 byte offset to the global address rather than
doing it as part of the fixup.
This refactoring will make it easier to use the same code for global
address space variables and also simplifies the code.
Reviewers: arsenm, kzhuravl
Subscribers: arsenm, kzhuravl, llvm-commits
Differential Revision: http://reviews.llvm.org/D21154
llvm-svn: 272675
|
This used to be free; copying and moving DebugLocs became expensive
after the metadata rewrite. Passing by reference eliminates a ton of
track/untrack operations. No functionality change intended.
llvm-svn: 272512
|
llvm-svn: 266385
|
These are different than atomicrmw add 1 because they have
an additional input value to clamp the result.
llvm-svn: 266074
|
Summary:
Implement BUFFER_ATOMIC_CMPSWAP{,_X2} instructions on all GCN targets, and FLAT_ATOMIC_CMPSWAP{,_X2} on CI+.
32-bit instruction variants tested manually on Kabini and Bonaire. Tests and parts of code provided by Jan Veselý.
Patch by: Vedran Miletić
Reviewers: arsenm, tstellarAMD, nhaehnle
Subscribers: jvesely, scchan, kanarayan, arsenm
Differential Revision: http://reviews.llvm.org/D17280
llvm-svn: 265170
|
Move a few functions only used by R600 to R600 specific code,
fix header macros to stop using R600, mark classes as final.
llvm-svn: 263204
|
llvm-svn: 262853
|
Also fixes missing f32 test.
llvm-svn: 260780
|
These were only sharing some somewhat incorrect
logic for when to scalarize or split vectors.
llvm-svn: 260490
|
These weren't actually sharing anything in the common
LowerLOAD.
llvm-svn: 260398
|
llvm-svn: 259089
|
Replace tests that use lrp with the basic IR expansion.
llvm-svn: 258612
|
llvm-svn: 258343
|
llvm-svn: 258096
|
This breaks the tests that were meant for testing
64-bit inline immediates, so move those to shl where
they won't be broken up.
This should be repeated for the other related bit ops.
llvm-svn: 258095
|
64-bit shifts are very slow on some subtargets.
llvm-svn: 258090
|
Summary:
Return values can be stored in SGPRs (i32) and VGPRs (f32).
This will be used by functions which expect some bytecode or other binary to
be appended at the end. It allows defining in which registers the return
values will be stored.
v2: don't do this for compute shaders
Reviewers: tstellarAMD, arsenm
Subscribers: arsenm
Differential Revision: http://reviews.llvm.org/D16033
llvm-svn: 257621
|
The old lowering for uint_to_fp failed OpenCL conformance.
It might be OK for fast math mode, but I'm not sure.
llvm-svn: 257393
|
The hardware instruction's output on 0 is -1 rather than 32.
Eliminate a test and select to -1. This removes an extra instruction
from the compatibility function with HSAIL's firstbit instruction.
llvm-svn: 257352
|
llvm-svn: 257348
|
Also fix bug in vector legalization for bitreverse.
llvm-svn: 255512
|
This basic combine was surprisingly missing.
AMDGPU legalizes many operations in terms of 32-bit vector components,
so not doing this results in many extra copies and subregister extracts
that need to be cleaned up later.
InstCombine already does this for the hasOneUse case. The target hook
is there to fix a handful of tests that would otherwise break (e.g.
ARM/vmov.ll), turning from a vector instruction that materializes a
repeated immediate into a constant vector load with more scalar copies
from it.
llvm-svn: 250129
|
llvm-svn: 246048
|
AMDGPU implementation
D9746 added general SMAX/SMIN/UMAX/UMIN pattern matching to SelectionDAGBuilder::visitSelect.
Differential Revision: http://reviews.llvm.org/D12007
llvm-svn: 244960
|
Patch by Zoltan Gilian <zoltan.gilian@gmail.com>
llvm-svn: 243459
|
This can be done with moves alone, which in theory
will optimize better later.
Although this transform increases the instruction count,
it should be code size / cycle count neutral in the worst
VALU case. It also seems to slightly improve a couple
of testcases due to other DAG combines this exposes.
This is probably slightly worse for the SALU case, so
it might be better to handle this during moveToVALU,
although then you lose some simplifications like
the load width reduction in the simple testcase.
llvm-svn: 242177
|
Patch by: Zoltan Gilian
llvm-svn: 241861
|
Summary:
This change is part of a series of commits dedicated to having a single
DataLayout during compilation by always using the one owned by the
module.
Reviewers: echristo
Subscribers: jholewinski, ted, yaron.keren, rafael, llvm-commits
Differential Revision: http://reviews.llvm.org/D11028
From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 241775
|
llvm-svn: 239657
|
This reverts commit 4ea70107c5e51230e9e60f0bf58a0f74aa4885ea.
llvm-svn: 160303
|
llvm-svn: 160270
|