| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
|
|
|
|
|
|
|
|
|
|
| |
This reverts commit 0daf86291d3aa04d3cc280cd0ef24abdb0174981.
It was causing an assert in test/CodeGen/AMDGPU/amdpal.ll only on a
release-with-asserts build. I will resubmit the change when I have fixed
that.
Change-Id: If270594eba27a7dc4076bdeab3fa8e6bfda3288a
llvm-svn: 328695
|
|
|
|
|
|
|
| |
Currently this seems to only really be used for debug
info.
llvm-svn: 328677
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
For OS type AMDPAL, the scratch descriptor is loaded from offset 0 of
the GIT, whose 32 bit pointer is in s0 (s8 for gfx9 merged shaders).
This commit fixes that to use offset 0x10 instead of offset 0 for a
compute shader, per the PAL ABI spec.
Reviewers: kzhuravl, nhaehnle, timcorringham
Subscribers: kzhuravl, wdng, yaxunl, t-tye, llvm-commits, dstuttard, nhaehnle, arsenm
Differential Revision: https://reviews.llvm.org/D44468
Change-Id: I93dffa647758e37f613bb5e0dfca840d82e6d26f
llvm-svn: 328673
|
|
|
|
|
|
|
|
| |
Before this was not done if the function had no calls in it. This
is still a possible issue with any callable function, regardless
of calls present.
llvm-svn: 328659
|
|
|
|
|
|
|
| |
Only 4 byte alignment is ever useful, so increasing anything
beyond this may require realigning the stack.
llvm-svn: 328656
|
|
|
|
|
|
|
|
| |
The combine on a select of a load only triggers for
addrspace 0, and discards the MachinePointerInfo. The
conservative default needs to be used for this.
llvm-svn: 328652
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In a function, s5 is used as the frame base SGPR. If a function
is calling another function, during the call sequence
it is copied to a preserved SGPR and restored.
Before it was possible for the scheduler to move stack operations
before the restore of s5, since there's nothing to associate
a frame index access with the restore.
Add an implicit use of s5 to the adjcallstack pseudo which ends
the call sequence to preven this from happening. I'm not 100%
satisfied with this solution, but I'm not sure what else would be
better.
llvm-svn: 328650
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
llvm-objdump now disassembles unrecognised opcodes as data, using
the .long directive. We treat unrecognised opcodes as being 32 bit
values, so move along 4 bytes rather than the single byte which
previously resulted in a cascade of bogus disassembly following an
unrecognised opcode.
While no solution can always disassemble code that contains
embedded data correctly this provides a significant improvement.
The disassembler will now cope with an arbitrary length section
as it no longer truncates it to a multiple of 4 bytes, and will
use the .byte directive for trailing bytes.
Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, llvm-commits
Differential Revision: https://reviews.llvm.org/D44685
llvm-svn: 328553
|
|
|
|
|
|
|
| |
Differential revision: https://reviews.llvm.org/D44820
Change-Id: I732979e2964006aa15d78a333d8886e6855f319a
llvm-svn: 328496
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
r327219 added wrappers to std::sort which randomly shuffle the container before sorting.
This will help in uncovering non-determinism caused due to undefined sorting
order of objects having the same key.
To make use of that infrastructure we need to invoke llvm::sort instead of std::sort.
Reviewers: tstellar, RKSimon, arsenm
Reviewed By: arsenm
Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, llvm-commits, t-tye
Differential Revision: https://reviews.llvm.org/D44856
llvm-svn: 328429
|
|
|
|
|
|
| |
ValueTypes.h is implemented in IR already.
llvm-svn: 328397
|
|
|
|
|
|
|
|
|
| |
This is used by llvm tblgen as well as by LLVM Targets, so the only
common place is Support for now. (maybe we need another target for these
sorts of things - but for now I'm at least making them correct & we can
make them better if/when people have strong feelings)
llvm-svn: 328395
|
|
|
|
|
|
|
| |
It's implemented in Target & include from other Target headers, so the
header should be in Target.
llvm-svn: 328392
|
|
|
|
|
|
|
|
|
|
|
| |
attribute for AMDGPU
- Remove use of the opencl and amdopencl environment member of the target triple for the AMDGPU target.
- Use function attribute to communicate to the AMDGPU backend to add implicit arguments for OpenCL kernels for the AMDHSA OS.
Differential Revision: https://reviews.llvm.org/D43736
llvm-svn: 328349
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Remove #include of Transforms/Scalar.h from Transform/Utils to fix layering.
Transforms depends on Transforms/Utils, not the other way around. So
remove the header and the "createStripGCRelocatesPass" function
declaration (& definition) that is unused and motivated this dependency.
Move Transforms/Utils/Local.h into Analysis because it's used by
Analysis/MemoryBuiltins.cpp.
llvm-svn: 328165
|
|
|
|
|
|
|
| |
Reland ISel cycle checking improvements after simplifying node id
invariant traversal and correcting typo.
llvm-svn: 327898
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
The docs already claim that this happens, but so far it hasn't. As a
consequence, existing TableGen files get this wrong a lot, but luckily
the fixes are all reasonably straightforward.
To make this work with all the existing forms of self-references (since
the true type of a record is only built up over time), the lookup of
self-references in !cast is delayed until the final resolving step.
Change-Id: If5923a72a252ba2fbc81a889d59775df0ef31164
Reviewers: arsenm, craig.topper, tra, MartinO
Subscribers: wdng, javed.absar, llvm-commits
Differential Revision: https://reviews.llvm.org/D44475
llvm-svn: 327849
|
|
|
|
| |
llvm-svn: 327843
|
|
|
|
|
|
|
|
|
| |
Normally DCE kills these, but at -O0 these get left behind
leaving suspicious looking illegal copies.
Replace with IMPLICIT_DEF to avoid iterator issues.
llvm-svn: 327842
|
|
|
|
|
|
| |
as it times out building test-suite on PPC.
llvm-svn: 327778
|
|
|
|
|
|
|
| |
Reland ISel cycle checking improvements after simplifying and reducing
node id invariant traversal.
llvm-svn: 327777
|
|
|
|
| |
llvm-svn: 327774
|
|
|
|
| |
llvm-svn: 327773
|
|
|
|
| |
llvm-svn: 327772
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: This is a follow-on patch of https://reviews.llvm.org/D44210
Author: FarhanaAleen
Reviewed By: msearles
Subscribers: llvm-commits, AMDGPU
Differential Revision: https://reviews.llvm.org/D44319
llvm-svn: 327726
|
|
|
|
|
|
|
|
|
|
|
| |
opcodes
See bug 36751: https://bugs.llvm.org/show_bug.cgi?id=36751
Differential Revision: https://reviews.llvm.org/D44529
Reviewers: artem.tamazov, arsenm
llvm-svn: 327723
|
|
|
|
|
|
|
|
|
| |
See bug 36355: https://bugs.llvm.org/show_bug.cgi?id=36355
Differential Revision: https://reviews.llvm.org/D44481
Reviewers: artem.tamazov, arsenm
llvm-svn: 327720
|
|
|
|
|
|
|
|
| |
of a single basic block loop. mergeInputScoreBrackets() does this for us; update it so that it processes the single bb's score bracket when processing the single bb's preds. It is, after all, a pred of itself, so it's score bracket is needed.
Differential Revision: https://reviews.llvm.org/D44434
llvm-svn: 327583
|
|
|
|
|
|
|
|
|
| |
See bug 36558: https://bugs.llvm.org/show_bug.cgi?id=36558
Differential Revision: https://reviews.llvm.org/D43950
Reviewers: artem.tamazov, arsenm
llvm-svn: 327299
|
|
|
|
|
|
|
|
|
|
| |
Since the enqueued kernels have internal linkage, their names may be dropped.
In this case, give them unique names __amdgpu_enqueued_kernel or
__amdgpu_enqueued_kernel.n where n is a sequential number starting from 1.
Differential Revision: https://reviews.llvm.org/D44322
llvm-svn: 327291
|
|
|
|
|
|
|
|
|
| |
See bug 36252: https://bugs.llvm.org/show_bug.cgi?id=36252
Differential Revision: https://reviews.llvm.org/D43874
Reviewers: artem.tamazov, arsenm
llvm-svn: 327278
|
|
|
|
| |
llvm-svn: 327269
|
|
|
|
| |
llvm-svn: 327268
|
|
|
|
| |
llvm-svn: 327267
|
|
|
|
| |
llvm-svn: 327234
|
|
|
|
| |
llvm-svn: 327209
|
|
|
|
|
|
|
|
|
|
| |
r327171 "Improve Dependency analysis when doing multi-node Instruction Selection"
r328170 "[DAG] Enforce stricter NodeId invariant during Instruction selection"
Reverting patch as NodeId invariant change is causing pathological
increases in compile time on PPC
llvm-svn: 327197
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Instruction Selection makes use of the topological ordering of nodes
by node id (a node's operands have smaller node id than it) when doing
cycle detection. During selection we may violate this property as a
selection of multiple nodes may induce a use dependence (and thus a
node id restriction) between two unrelated nodes. If a selected node
has an unselected successor this may allow us to miss a cycle in
detection an invalid selection.
This patch fixes this by marking all unselected successors of a
selected node have negated node id. We avoid pruning on such negative
ids but still can reconstruct the original id for pruning.
In-tree targets have been updated to replace DAG-level replacements
with ISel-level ones which enforce this property.
This preemptively fixes PR36312 before triggering commit r324359 relands
Reviewers: craig.topper, bogner, jyknight
Subscribers: arsenm, nhaehnle, javed.absar, llvm-commits, hiraditya
Differential Revision: https://reviews.llvm.org/D43198
llvm-svn: 327170
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
address-space.
Summary: Starting from GCN 2nd generation, ISA supports ds_read_b128 on top of ds_read_b64.
This patch supports ds_read_b128 instruction pattern and generation of this instruction.
In the vectorizer, this patch also widen the vector length so that vectorizer generates
128 bit loads for local address-space which gets translated to ds_read_b128.
Since the performance benefit is not clear; compiler generates ds_read_b128 under -amdgpu-ds128.
Author: FarhanaAleen
Reviewed By: rampitec, arsenm
Subscribers: llvm-commits, AMDGPU
Differential Revision: https://reviews.llvm.org/D44210
llvm-svn: 327153
|
|
|
|
|
|
|
|
| |
GFX9 should select opsel version.
Differential Revision: https://reviews.llvm.org/D44279
llvm-svn: 327106
|
|
|
|
|
|
| |
These are the parameters x86 already uses.
llvm-svn: 327020
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: GCN ISA supports instructions that can read 16 consecutive dwords from memory through the scalar data cache;
loadstoreVectorizer should take advantage of the wider vector length and pack 16/8 elements of dwords/quadwords.
Author: FarhanaAleen
Reviewed By: rampitec
Subscribers: llvm-commits, AMDGPU
Differential Revision: https://reviews.llvm.org/D44179
llvm-svn: 326910
|
|
|
|
|
|
| |
This reverts commit ce988cc100dc65e7c6c727aff31ceb99231cab03.
llvm-svn: 326907
|
|
|
|
| |
llvm-svn: 326904
|
|
|
|
|
|
|
|
|
|
|
|
| |
DAGCombiner::isAfterLegalizeDAG since that's what it checks. NFC
The code checks Level == AfterLegalizeDAG which is the fourth and last of the possible DAG combine stages that we have.
There is a Level called AfterLegalVectorOps, but that's the third DAG combine and it doesn't always run.
A function called isAfterLegalVectorOps should imply it returns true in either of the DAG combines that runs after the legalize vector ops stage, but that's not what this function does.
llvm-svn: 326832
|
|
|
|
|
|
|
|
|
|
|
| |
In case if -mattr used to modify feature set bits in llvm-mc call
getIsaVersion can fail to identify specific ISA due to test mismatch.
Adding default fallback tests which will always correctly report at
least major version.
Differential Revision: https://reviews.llvm.org/D44163
llvm-svn: 326825
|
|
|
|
|
|
|
|
|
|
| |
One addrspacecast disappeared in clang emitted IR for
block invoke function due to adoption of the new
addr space mapping.
Differential Revision: https://reviews.llvm.org/D43785
llvm-svn: 326806
|
|
|
|
| |
llvm-svn: 326715
|
|
|
|
|
|
|
| |
As far as I can tell legalization of weird sizes for the
output type isn't implemented.
llvm-svn: 326714
|
|
|
|
| |
llvm-svn: 326713
|