| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
|
| |
llvm-svn: 269943
|
| |
|
|
|
|
|
|
|
|
|
| |
Use register class that does not include them when looking
for unallocated registers.
This is hit by the udiv v8i64 test in the opencl integer
conformance test, and takes a few seconds to compile in
a debug build so no test included.
llvm-svn: 269938
|
| |
|
|
|
|
|
|
|
|
| |
Reviewer: tstellardAMD, arsenm
Subscribers: arsenm, kzhuravl, llvm-commits
Differential Revision: http://reviews.llvm.org/D20032
llvm-svn: 269725
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This was assuming it could use all memory before, which is
a bad decision because it restricts occupancy.
By default, only try to use enough space that could reduce
occupancy to 7, an arbitrarily chosen limit.
Based on the exist LDS usage, try to round up to the limit
in the current tier instead of further hurting occupancy.
This isn't ideal, because it doesn't accurately know how much
space is going to be used for alignment padding.
llvm-svn: 269708
|
| |
|
|
|
|
|
|
|
|
| |
Reviewers: tstellard
Subscribers: arsenm
Differential Revision: http://reviews.llvm.org/D19794
llvm-svn: 269481
|
| |
|
|
|
|
|
|
|
|
| |
Reviewers: tstellard
Subscribers: arsenm
Differential Revision: http://reviews.llvm.org/D19793
llvm-svn: 269480
|
| |
|
|
|
|
|
|
|
|
| |
Reviewers: tstellard
Subscribers: arsenm
Differential Revision: http://reviews.llvm.org/D19792
llvm-svn: 269479
|
| |
|
|
|
|
|
|
|
|
| |
Reviewers: tstellard
Subscribers: arsenm
Differential Revision: http://reviews.llvm.org/D19791
llvm-svn: 269478
|
| |
|
|
|
|
|
|
|
|
| |
Reviewers: tstellard
Subscribers: arsenm
Differential Revision: http://reviews.llvm.org/D19790
llvm-svn: 269477
|
| |
|
|
|
|
|
|
|
|
|
|
| |
This will be used for global addresses
Reviewers: tstellard
Subscribers: arsenm
Differential Revision: http://reviews.llvm.org/D19789
llvm-svn: 269476
|
| |
|
|
|
|
|
|
|
|
|
|
| |
This will be used for GV
Reviewers: tstellard
Subscribers: arsenm
Differential Revision: http://reviews.llvm.org/D19788
llvm-svn: 269475
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Rename to AMDGPUconstdata_ptr
Reviewers: tstellard
Subscribers: arsenm
Differential Revision: http://reviews.llvm.org/D19786
llvm-svn: 269474
|
| |
|
|
|
|
|
|
|
|
| |
Reviewers: tstellard
Subscribers: arsenm
Differential Revision: http://reviews.llvm.org/D19785
llvm-svn: 269473
|
| |
|
|
|
|
|
|
|
| |
- Insert one nop for each high level statement instead of two
- Do not insert nop before prologue
Differential Revision: http://reviews.llvm.org/D20215
llvm-svn: 269452
|
| |
|
|
|
|
|
|
|
|
| |
We only really need this to be true for SIFixSGPRCopies.
I'm not sure there's any way this could happen before that point.
Fixes a case where MachineCSE could introduce a cross block
scc use.
llvm-svn: 269391
|
| |
|
|
|
|
|
|
|
|
|
| |
- Where we were returning a node before, call ReplaceNode instead.
- Where we would return null to fall back to another selector, rename
the method to try* and return a bool for success.
- Where we were calling SelectNodeTo, just return afterwards.
Part of llvm.org/pr26808.
llvm-svn: 269349
|
| |
|
|
| |
llvm-svn: 269268
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
The promote alloca pass would attempt to promote an alloca with
a select, icmp, or phi user, even though the other operand was
from a non-promotable source, producing a select on two different
pointer types.
Only do this if we know that both operands derive from the same
alloca. In the future we should be able to relax this to an alloca
which will also be promoted.
llvm-svn: 269265
|
| |
|
|
| |
llvm-svn: 269147
|
| |
|
|
| |
llvm-svn: 269145
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Remove the ModuleLevelChanges argument, and the ability to create new
subprograms for cloned functions. The latter was added without review in
r203662, but it has no in-tree clients (all non-test callers pass false
for ModuleLevelChanges [1], so it isn't reachable outside of tests). It
also isn't clear that adding a duplicate subprogram to the compile unit is
always the right thing to do when cloning a function within a module. If
this functionality comes back it should be accompanied with a more concrete
use case.
Furthermore, all in-tree clients add the returned function to the module.
Since that's pretty much the only sensible thing you can do with the function,
just do that in CloneFunction.
[1] http://llvm-cs.pcc.me.uk/lib/Transforms/Utils/CloneFunction.cpp/rCloneFunction
Differential Revision: http://reviews.llvm.org/D18628
llvm-svn: 269110
|
| |
|
|
|
|
| |
Differential Revision: http://reviews.llvm.org/D20117
llvm-svn: 269098
|
| |
|
|
|
|
|
|
| |
Many files include Passes.h but only a fraction needs to know about the
TargetPassConfig class. Move it into an own header. Also rename
Passes.cpp to TargetPassConfig.cpp while we are at it.
llvm-svn: 269011
|
| |
|
|
| |
llvm-svn: 268931
|
| |
|
|
| |
llvm-svn: 268930
|
| |
|
|
|
|
|
|
|
| |
Some custom Operands and AsmOperandClasses moved to proper place.
No functional changes.
Differential Revision: http://reviews.llvm.org/D20012
llvm-svn: 268780
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Added support for sendmsg(MSG[, OP[, STREAM_ID]]) syntax
in s_sendmsg and s_sendmsghalt instructions.
The syntax matches the SP3 assembler/disassembler rules.
That is why implicit inputs (like M0 and EXEC) are not printed
to disassembly output anymore.
sendmsg(...) allows only known message types and attributes,
even if literals are used instead of symbolic names.
However, raw literal (without "sendmsg") still can be used,
and that allows for any 16-bit value.
Tests updated/added.
Differential Revision: http://reviews.llvm.org/D19596
llvm-svn: 268762
|
| |
|
|
|
|
|
|
| |
This reverts commit 47486d52454d60cdf6becc0b2efe533c73794380.
It broke calling OpenCL kernel from another kernel.
llvm-svn: 268739
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This change allows to specify "DefaultMethod" for optional operand (IsOptional = 1) in AsmOperandClass that return default value for operand. This is used in convertToMCInst to set default values in MCInst.
Previously if you wanted to set default value for operand you had to create custom converter method. With this change it is possible to use standard converters even when optional operands presented.
Reviewers: tstellarAMD, ab, craig.topper
Subscribers: jyknight, dsanders, arsenm, nhaustov, llvm-commits
Differential Revision: http://reviews.llvm.org/D18242
llvm-svn: 268726
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Check calling convention in AMDGPUMachineFunction::isKernel
This will be used for AMDGPU_HSA_KERNEL symbol type in output ELF.
Also, in the future unused non-kernels may be optimized.
Reviewers: tstellarAMD, arsenm
Subscribers: arsenm, joker.eph, llvm-commits
Differential Revision: http://reviews.llvm.org/D19917
llvm-svn: 268719
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is a step towards removing the rampant undefined behaviour in
SelectionDAG, which is a part of llvm.org/PR26808.
We rename SelectionDAGISel::Select to SelectImpl and update targets to
match, and then change Select to return void and consolidate the
sketchy behaviour we're trying to get away from there.
Next, we'll update backends to implement `void Select(...)` instead of
SelectImpl and eventually drop the base Select implementation.
llvm-svn: 268693
|
| |
|
|
| |
llvm-svn: 268676
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Discovered by Dave Airlie, fixes an assertion in Khronos OpenGL CTS
GL43-CTS.shader_storage_buffer_object.advanced-matrix.
In this particular case, the buffer load intrinsic fed into a uniform
conditional branch, and led the brcond lowering down the wrong path.
Reviewers: tstellarAMD, arsenm
Subscribers: arsenm, llvm-commits
Differential Revision: http://reviews.llvm.org/D19931
llvm-svn: 268650
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Version 2 is now the default. If you want to emit version 1, use
the amdgcn--amdhsa-amdcov1 triple.
Reviewers: arsenm, kzhuravl
Subscribers: arsenm, llvm-commits
Differential Revision: http://reviews.llvm.org/D19283
llvm-svn: 268647
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
Use std::make_pair instead of constructor
Use C++11 loop
Reuse helper var
Reviewers: tstellardAMD
Subsribers: arsenm
Differential Revision: http://reviews.llvm.org/D19787
llvm-svn: 268503
|
| |
|
|
|
|
|
|
|
|
| |
Reviewers: arsenm, axeldavy
Subscribers: MatzeB, arsenm, llvm-commits
Differential Revision: http://reviews.llvm.org/D19822
llvm-svn: 268396
|
| |
|
|
| |
llvm-svn: 268392
|
| |
|
|
|
|
|
| |
This will allow us to split up 64-bit private accesses when
necessary.
llvm-svn: 268296
|
| |
|
|
|
|
|
|
|
| |
We were using v_readlane_b32 with the lane set to zero, but this won't
work if thread 0 is not active.
Differential Revision: http://reviews.llvm.org/D19745
llvm-svn: 268295
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Now that unaligned access expansion should not attempt
to produce i64 accesses, we can remove the hack in
PreprocessISelDAG where this is done.
This allows splitting i64 private accesses while
allowing the new add nodes indexing the vector components
can be folded with the base pointer arithmetic.
llvm-svn: 268293
|
| |
|
|
| |
llvm-svn: 268289
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
When we restore an SGPR value from scratch, we first load it into a
temporary VGPR and then use v_readlane_b32 to copy the value from the
VGPR back into an SGPR.
We weren't setting the kill flag on the VGPR in the v_readlane_b32
instruction, so the register scavenger wasn't able to re-use this
temp value later.
I wasn't able to create a lit test for this.
Reviewers: arsenm
Subscribers: arsenm, llvm-commits
Differential Revision: http://reviews.llvm.org/D19744
llvm-svn: 268287
|
| |
|
|
|
|
|
|
|
|
| |
Reviewers: arsenm
Subscribers: jvesely, arsenm, llvm-commits
Differential Revision: http://reviews.llvm.org/D19736
llvm-svn: 268267
|
| |
|
|
|
|
|
| |
We can't use MI->getDebugLoc() when MI is an iterator that could be
MBB.end().
llvm-svn: 268265
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Add support for detecting hazards in SMEM soft clauses, so that we only
break the clauses when necessary, either by adding s_nop or re-ordering
other alu instructions.
Reviewers: arsenm
Subscribers: arsenm, llvm-commits
Differential Revision: http://reviews.llvm.org/D18870
llvm-svn: 268260
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This intrinsic is used to get flat-shaded fragment shader inputs. Those are
uniform across a primitive, but a fragment shader wave may process pixels from
multiple primitives (as indicated by the prim_mask), and so that's where
divergence can arise.
Reviewers: arsenm, tstellarAMD
Subscribers: arsenm, llvm-commits
Differential Revision: http://reviews.llvm.org/D19747
llvm-svn: 268259
|
| |
|
|
|
|
|
|
|
|
| |
Reviewers: arsenm
Subscribers: arsenm, llvm-commits
Differential Revision: http://reviews.llvm.org/D18603
llvm-svn: 268247
|
| |
|
|
| |
llvm-svn: 268234
|
| |
|
|
| |
llvm-svn: 268163
|
| |
|
|
|
|
| |
This was supposed to be part of r268143.
llvm-svn: 268154
|