| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
|
| |
llvm-svn: 277535
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Two types of stores are possible in pixel shaders: stores to memory that are
explicitly requested at the API level, and stores that are an implementation
detail of register spilling or lowering of arrays.
For the first kind of store, we must ensure that helper pixels have no effect
and hence WQM must be disabled. The second kind of store must always be
executed, because the written value may be loaded again in a way that is
relevant for helper pixels as well -- and there are no externally visible
effects anyway.
This is a candidate for the 3.9 release branch.
Reviewers: arsenm, tstellarAMD, mareko
Subscribers: arsenm, kzhuravl, llvm-commits
Differential Revision: https://reviews.llvm.org/D22675
llvm-svn: 277504
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
There are cases where uniform branch conditions are computed in VGPRs, and
we didn't correctly mark those as WQM.
The stray change in basic-branch.ll is because invoking the LiveIntervals
analysis leads to the detection of a dead register that would otherwise not
be seen at -O0.
This is a candidate for the 3.9 branch, as it fixes a possible hang.
Reviewers: arsenm, tstellarAMD, mareko
Subscribers: arsenm, llvm-commits, kzhuravl
Differential Revision: https://reviews.llvm.org/D22673
llvm-svn: 277500
|
| |
|
|
|
|
| |
Differential revision: https://reviews.llvm.org/D22522
llvm-svn: 277344
|
| |
|
|
|
|
| |
Found by asan -fsanitize-address-use-after-scope.
llvm-svn: 277265
|
| |
|
|
|
|
|
| |
This should really be true for any immediate, not just
inline ones.
llvm-svn: 277260
|
| |
|
|
| |
llvm-svn: 277259
|
| |
|
|
| |
llvm-svn: 277258
|
| |
|
|
|
|
|
|
|
| |
This adds a target hook getInstSizeInBytes to TargetInstrInfo that a lot of
subclasses already implement.
Differential Revision: https://reviews.llvm.org/D22885
llvm-svn: 277126
|
| |
|
|
|
|
|
|
| |
Differential Revision: http://reviews.llvm.org/D22021
Reviewed by: arsenm
llvm-svn: 277073
|
| |
|
|
|
|
|
| |
getFrameInfo() never returns nullptr so we should use a reference
instead of a pointer.
llvm-svn: 277017
|
| |
|
|
|
|
| |
Differential Revision: http://reviews.llvm.org/D22482
llvm-svn: 276998
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
We were using reserved VGPRs for SGPR spilling and this was causing
some programs with a workgroup size of 1024 to use more than 64
registers, which is illegal.
Reviewers: arsenm, mareko, nhaehnle
Subscribers: nhaehnle, arsenm, llvm-commits, kzhuravl
Differential Revision: https://reviews.llvm.org/D22032
llvm-svn: 276980
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
SI_ELSE is lowered into two parts:
s_or_saveexec_b64 dst, src (at the start of the basic block)
s_xor_b64 exec, exec, dst (at the end of the basic block)
The idea is that dst contains the exec mask of the preceding IF block. It can
happen that SIWholeQuadMode decides to switch from WQM to Exact mode inside
the basic block that contains SI_ELSE, in which case it introduces an instruction
s_and_b64 exec, exec, s[...]
which masks out bits that can correspond to both the IF and the ELSE paths.
So the resulting sequence must be:
s_or_savexec_b64 dst, src
s_and_b64 exec, exec, s[...] <-- added by SIWholeQuadMode
s_and_b64 dst, dst, exec <-- added by SILowerControlFlow
s_xor_b64 exec, exec, dst
Whether to add the additional s_and_b64 dst, dst, exec is currently determined
via the ExecModified tracking. With this change, it is instead determined by
an additional flag on SI_ELSE which is set by SIWholeQuadMode.
Finally: It also occured to me that an alternative approach for the long run
is for SILowerControlFlow to unconditionally emit
s_or_saveexec_b64 dst, src
...
s_and_b64 dst, dst, exec
s_xor_b64 exec, exec, dst
and have a pass that detects and cleans up the "redundant AND with exec"
pattern where possible. This could be useful anyway, because we also add
instructions
s_and_b64 vcc, exec, vcc
before s_cbranch_scc (in moveToALU), and those are often redundant. I have
some pending changes to how KILL is lowered that could also benefit from
such a cleanup pass.
In any case, this current patch could help in the short term with the whole
ExecModified business.
Reviewers: tstellarAMD, arsenm
Subscribers: arsenm, llvm-commits, kzhuravl
Differential Revision: https://reviews.llvm.org/D22846
llvm-svn: 276972
|
| |
|
|
| |
llvm-svn: 276946
|
| |
|
|
|
|
|
| |
This no longer uses the more complicated classification
of constants.
llvm-svn: 276945
|
| |
|
|
|
|
|
|
|
| |
TargetOptions wants the ExceptionHandling enum. Move that to
MCTargetOptions.h to avoid transitively including Dwarf.h everywhere in
clang. Now you can add a DWARF tag without a full rebuild of clang
semantic analysis.
llvm-svn: 276883
|
| |
|
|
|
|
|
|
| |
And implement it for AArch64, supporting x/w ADD/OR.
Differential Revision: https://reviews.llvm.org/D22373
llvm-svn: 276875
|
| |
|
|
|
|
|
| |
Using rcp should be OK for safe math usually, so this
should not be replacing the original fdiv.
llvm-svn: 276823
|
| |
|
|
| |
llvm-svn: 276819
|
| |
|
|
|
|
| |
The intrinsics for these were removed, so this is dead.
llvm-svn: 276805
|
| |
|
|
| |
llvm-svn: 276804
|
| |
|
|
|
|
|
|
|
| |
ABIArgOffset is a problem because properly fsetting the
KernArgSize requires that the reserved area before the
real kernel arguments be correctly aligned, which requires
fixing clover.
llvm-svn: 276766
|
| |
|
|
|
|
|
| |
This could use some additional optimization work
to use mad/mac legacy.
llvm-svn: 276764
|
| |
|
|
|
|
| |
Differential revision: https://reviews.llvm.org/D22732
llvm-svn: 276682
|
| |
|
|
| |
llvm-svn: 276680
|
| |
|
|
| |
llvm-svn: 276675
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Some targets, notably AArch64 for ILP32, have different relocation encodings
based upon the ABI. This is an enabling change, so a future patch can use the
ABIName from MCTargetOptions to chose which relocations to use. Tested using
check-llvm.
The corresponding change to clang is in: http://reviews.llvm.org/D16538
Patch by: Joel Jones
Differential Revision: https://reviews.llvm.org/D16213
llvm-svn: 276654
|
| |
|
|
|
|
| |
This has been dead since r269479
llvm-svn: 276518
|
| |
|
|
|
|
|
|
|
|
|
|
| |
This reverts commit r276298.
Data stored in .rodata can have a negative offset from .text, but we
don't support negative values in relocations yet.
This caused a regression in one of the amp conformance tests:
5_Data_Cont/5_2_a_v/5_2_3_m/Assignment/Test.02.01
llvm-svn: 276498
|
| |
|
|
|
|
|
|
|
| |
This adds the actual MachineLegalizeHelper to do the work and a trivial pass
wrapper that legalizes all instructions in a MachineFunction. Currently the
only transformation supported is splitting up a vector G_ADD into one acting on
smaller vectors.
llvm-svn: 276461
|
| |
|
|
|
|
|
|
|
| |
The size can exceed s_movk_i32's limit, and we don't
want to use it this early since it inhibits optimizations.
This should probably be merged to the release branch.
llvm-svn: 276438
|
| |
|
|
| |
llvm-svn: 276437
|
| |
|
|
|
|
|
| |
Remove dead code from r600 intrinsic removal.
Remove unset members, rename StackSize to be less ambiguous.
llvm-svn: 276436
|
| |
|
|
|
|
|
| |
R600's i1 fp_to_uint selected but was incorrect according to
what instcombine constant folds to.
llvm-svn: 276435
|
| |
|
|
| |
llvm-svn: 276434
|
| |
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D22538
llvm-svn: 276298
|
| |
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D21646
llvm-svn: 276294
|
| |
|
|
|
|
|
|
|
|
| |
Reviewers: tstellarAMD, vpykhtin
Subscribers: arsenm, kzhuravl
Differential Revision: https://reviews.llvm.org/D22620
llvm-svn: 276274
|
| |
|
|
| |
llvm-svn: 276257
|
| |
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D22526
llvm-svn: 276119
|
| |
|
|
|
|
|
|
|
|
|
| |
If 2.5 ulp is acceptable, denormals are not required, and
isn't a reciprocal which will already be handled, replace
with a faster fdiv.
Simplify the lowering tests by using per function
subtarget features.
llvm-svn: 276051
|
| |
|
|
| |
llvm-svn: 276030
|
| |
|
|
|
|
| |
LGTM'd by Matt Arsenault.
llvm-svn: 276029
|
| |
|
|
|
|
|
|
|
|
|
| |
Only if the value is negative or positive is what matters,
so use a constant that doesn't require an instruction to
materialize.
These should really just emit the write exec directly,
but for stick with the kill pseudo-terminator.
llvm-svn: 275988
|
| |
|
|
|
|
|
|
|
| |
Without this fix, releaseSuccessors when InOrOutBlock is
false could release SUs outside the schedule BasicBlock.
Patch by Axel Davy
llvm-svn: 275935
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is to help moveSILowerControlFlow to before regalloc.
There are a couple of tradeoffs with this. The complete CFG
is visible to more passes, the loop body avoids an extra copy of m0,
vcc isn't required, and immediate offsets can be shrunk into s_movk_i32.
The disadvantage is the register allocator doesn't understand that
the single lane's vector is dead within the loop body, so an extra
register is used to outlive the loop block when expanding the
VGPR -> m0 loop. This also now results in worse waitcnt insertion
before the loop instead of after for pending operations at the point
of the indexing, but that should be fixed by future improvements to
cross block waitcnt insertion.
v_movreld_b32's operands are now modeled more correctly since vdst
is not a true output. This is kind of a hack to treat vdst as a
use operand. Extra checking is required in the verifier since
I can't seem to get tablegen to emit an implicit operand for a
virtual register.
llvm-svn: 275934
|
| |
|
|
|
|
| |
This is already casted above so non-null
llvm-svn: 275881
|
| |
|
|
| |
llvm-svn: 275873
|
| |
|
|
| |
llvm-svn: 275871
|