| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
|
|
|
| |
Differential Revision: http://reviews.llvm.org/D21362
llvm-svn: 272919
|
| |
|
|
| |
llvm-svn: 272785
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This fixes two related bugs. First, the generic optimization passes
unfortunately generate negative constant offsets but the hardware treats
SOffset as an unsigned value.
Second, there is a hardware bug on SI and CI, where address clamping in MUBUF
instructions does not work correctly when SOffset is larger than the buffer
size. This patch works around this bug by never using SOffset.
An alternative workaround would be to do the clamping manually when SOffset
is too large, but generating the required code sequence during instruction
selection would be rather involved, and in any case the resulting code would
probably be worse.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96360
Reviewers: arsenm, tstellarAMD
Subscribers: arsenm, llvm-commits, kzhuravl
Differential Revision: http://reviews.llvm.org/D21326
llvm-svn: 272761
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
We we have an MCConstantExpr, we can encode it directly into the instruction
instead of emitting fixups.
Reviewers: artem.tamazov, vpykhtin, SamWot, nhaustov, arsenm
Subscribers: arsenm, llvm-commits, kzhuravl
Differential Revision: http://reviews.llvm.org/D21236
Change-Id: I88b3edf288d48e65c5d705fc4850d281f8e36948
llvm-svn: 272750
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
We can now reference symbols directly in operands, like this:
s_mov_b32 s0, global
Reviewers: artem.tamazov, vpykhtin, SamWot, nhaustov
Subscribers: arsenm, llvm-commits, kzhuravl
Differential Revision: http://reviews.llvm.org/D21038
llvm-svn: 272748
|
| |
|
|
| |
llvm-svn: 272736
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If a local_unnamed_addr attribute is attached to a global, the address
is known to be insignificant within the module. It is distinct from the
existing unnamed_addr attribute in that it only describes a local property
of the module rather than a global property of the symbol.
This attribute is intended to be used by the code generator and LTO to allow
the linker to decide whether the global needs to be in the symbol table. It is
possible to exclude a global from the symbol table if three things are true:
- This attribute is present on every instance of the global (which means that
the normal rule that the global must have a unique address can be broken without
being observable by the program by performing comparisons against the global's
address)
- The global has linkonce_odr linkage (which means that each linkage unit must have
its own copy of the global if it requires one, and the copy in each linkage unit
must be the same)
- It is a constant or a function (which means that the program cannot observe that
the unique-address rule has been broken by writing to the global)
Although this attribute could in principle be computed from the module
contents, LTO clients (i.e. linkers) will normally need to be able to compute
this property as part of symbol resolution, and it would be inefficient to
materialize every module just to compute it.
See:
http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20160509/356401.html
http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20160516/356738.html
for earlier discussion.
Part of the fix for PR27553.
Differential Revision: http://reviews.llvm.org/D20348
llvm-svn: 272709
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
We now use a standard fixup type applying the pc-relative address of
constant address space variables, and we have the GlobalAddress lowering
code add the required 4 byte offset to the global address rather than
doing it as part of the fixup.
This refactoring will make it easier to use the same code for global
address space variables and also simplifies the code.
Re-commit this after fixing a bug where we were trying to use a
reference to a Triple object that had already been destroyed.
Reviewers: arsenm, kzhuravl
Subscribers: arsenm, kzhuravl, llvm-commits
Differential Revision: http://reviews.llvm.org/D21154
llvm-svn: 272705
|
| |
|
|
|
|
| |
This reverts commit r272675.
llvm-svn: 272677
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
We now use a standard fixup type applying the pc-relative address of
constant address space variables, and we have the GlobalAddress lowering
code add the required 4 byte offset to the global address rather than
doing it as part of the fixup.
This refactoring will make it easier to use the same code for global
address space variables and also simplifies the code.
Reviewers: arsenm, kzhuravl
Subscribers: arsenm, kzhuravl, llvm-commits
Differential Revision: http://reviews.llvm.org/D21154
llvm-svn: 272675
|
| |
|
|
|
|
|
|
|
|
|
|
| |
source (.option.machine_version...)
The feature allows for conditional assembly etc.
TODO: make those symbols read-only.
Test added.
Differential Revision: http://reviews.llvm.org/D21238
llvm-svn: 272673
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Mesa and other users must set this to enable coalescing:
- STRIDE = 0
- SWIZZLE_ENABLE = 1
This makes one particular compute shader 8x faster.
Reviewers: tstellarAMD, arsenm
Subscribers: arsenm, kzhuravl
Differential Revision: http://reviews.llvm.org/D21136
llvm-svn: 272556
|
| |
|
|
|
|
|
| |
The condition reg of the cndmask_b64 expansion can't be killed by
the first one, and the implicit super register implicit def is needed.
llvm-svn: 272554
|
| |
|
|
|
|
| |
Or replace with llvm::function_ref if it's never stored. NFC intended.
llvm-svn: 272513
|
| |
|
|
|
|
|
|
| |
This used to be free, copying and moving DebugLocs became expensive
after the metadata rewrite. Passing by reference eliminates a ton of
track/untrack operations. No functionality change intended.
llvm-svn: 272512
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
We need to set the fixup type to FK_Data_4 for the
SCRATCH_RSRC_DWORD[01] symbols, since these require absolute
relocations, and fixup_si_rodata is for relative relocations.
Reviewers: arsenm, kzhuravl
Subscribers: arsenm, kzhuravl, llvm-commits
Differential Revision: http://reviews.llvm.org/D21153
llvm-svn: 272417
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
in AMDGPUOperand.
Summary:
sext() modifier is supported in SDWA instructions only for integer operands. Spec is unclear should integer operands support abs and neg modifiers with sext - for now they are not supported.
Renamed InputModsWithNoDefault to FloatInputMods. Added SextInputMods for operands that support sext() modifier.
Added AMDGPUOperand::Modifier struct to handle register and immediate modifiers.
Code cleaning in AMDGPUOperand class: organize method in groups (render-, predicate-methods...).
Reviewers: vpykhtin, artem.tamazov, tstellarAMD
Subscribers: arsenm, kzhuravl
Differential Revision: http://reviews.llvm.org/D20968
llvm-svn: 272384
|
| |
|
|
| |
llvm-svn: 272364
|
| |
|
|
|
|
| |
Fixes verifier errors after SIShrinkInstructions.
llvm-svn: 272351
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This fixes a bug with ds_*permute instructions where if it was passed a
constant address, then the offset operand would get assigned a register
operand instead of an immediate.
Reviewers: scchan, arsenm
Subscribers: arsenm, llvm-commits
Differential Revision: http://reviews.llvm.org/D19994
llvm-svn: 272349
|
| |
|
|
|
|
|
|
|
|
| |
Reviewers: arsenm, axeldavy
Subscribers: MatzeB, arsenm, llvm-commits
Differential Revision: http://reviews.llvm.org/D19823
llvm-svn: 272346
|
| |
|
|
|
|
|
|
| |
The flat atomics could already be selected, but only
when using flat instructions for global memory. Add
patterns for flat addresses.
llvm-svn: 272345
|
| |
|
|
|
|
|
|
|
|
| |
This was using extract_subreg sub0 to extract the low register
of the result instead of sub0_sub1, producing an invalid copy.
There doesn't seem to be a way to use the compound subreg indices
in tablegen since those are generated, so manually select it.
llvm-svn: 272344
|
| |
|
|
| |
llvm-svn: 272338
|
| |
|
|
|
|
|
| |
I'm still not sure under what circumstances the offset here is non-0,
but private memory is not limited to 27-bits.
llvm-svn: 272337
|
| |
|
|
| |
llvm-svn: 272336
|
| |
|
|
|
|
|
|
|
| |
We were using the fast fdiv lowering for all division, implementation of
IEEE754 fdiv is added.
http://reviews.llvm.org/D20557
llvm-svn: 272292
|
| |
|
|
|
|
|
|
|
|
| |
Reviewers: vpykhtin, tstellarAMD
Subscribers: arsenm, kzhuravl
Differential Revision: http://reviews.llvm.org/D21129
llvm-svn: 272255
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
The presence of this attribute indicates that VGPR outputs should be computed
in whole quad mode. This will be used by Mesa for prolog pixel shaders, so
that derivatives can be taken of shader inputs computed by the prolog, fixing
a bug.
The generated code could certainly be improved: if a prolog pixel shader is
used (which isn't common in modern OpenGL - they're used for gl_Color, polygon
stipples, and forcing per-sample interpolation), Mesa will use this attribute
unconditionally, because it has to be conservative. So WQM may be used in the
prolog when it isn't really needed, and furthermore a silly back-and-forth
switch is likely to happen at the boundary between prolog and main shader
parts.
Fixing this is a bit involved: we'd first have to add a mechanism by which
LLVM writes the WQM-related input requirements to the main shader part binary,
and then Mesa specializes the prolog part accordingly. At that point, we may
as well just compile a monolithic shader...
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95130
Reviewers: arsenm, tstellarAMD, mareko
Subscribers: arsenm, llvm-commits, kzhuravl
Differential Revision: http://reviews.llvm.org/D20839
llvm-svn: 272063
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Author: Wei Ding <wei.ding2@amd.com>
Date: Tue Jun 7 19:04:44 2016 +0000
Differential Revision: http://reviews.llvm.org/D20557
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272044
91177308-0d34-0410-b5e6-96231b3b80d8
as it was breaking the bots.
This reverts commit r272044.
llvm-svn: 272056
|
| |
|
|
| |
llvm-svn: 272044
|
| |
|
|
| |
llvm-svn: 271936
|
| |
|
|
|
|
|
|
| |
If we had a constant group address space cast the queue pointer
wasn't enabled for the function, resulting in a crash on noreg
later.
llvm-svn: 271935
|
| |
|
|
|
|
|
|
|
|
|
|
| |
src2 == VCC.
Another step for unification llvm assembler/disassembler with sp3.
Besides, CodeGen output is a bit improved, thus changes in CodeGen tests.
Assembler/Disassembler tests updated/added.
Differential Revision: http://reviews.llvm.org/D20796
llvm-svn: 271900
|
| |
|
|
|
|
|
|
|
|
| |
Test added as per discussion in http://reviews.llvm.org/D20588.
The macro is just a demonstration, useless in practice.
Coding style fixes.
Differential Revision: http://reviews.llvm.org/D20797
llvm-svn: 271675
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
modifiers.
Summary: Depends on D20625
Reviewers: tstellarAMD, vpykhtin, artem.tamazov
Subscribers: arsenm, kzhuravl
Differential Revision: http://reviews.llvm.org/D20674
llvm-svn: 271662
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
_dpp and _sdwa suffixes in mnemonics.
Summary:
Added custom converters for SDWA instruction to support optional operands and modifiers.
Support for _dpp and _sdwa suffixes that allows to force DPP or SDWA encoding for instructions.
Reviewers: tstellarAMD, vpykhtin, artem.tamazov
Subscribers: arsenm, kzhuravl
Differential Revision: http://reviews.llvm.org/D20625
llvm-svn: 271655
|
| |
|
|
|
|
|
| |
It can still report the base register, and the uses give up when it
fails.
llvm-svn: 271575
|
| |
|
|
|
|
|
|
|
| |
There are a lot of different kinds of loads to test for,
and these were scattered around inconsistently with
some redundancy. Try to comprehensively test all loads
in a consistent way.
llvm-svn: 271571
|
| |
|
|
| |
llvm-svn: 271567
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
If the processor name failed to parse for amdgcn,
the resulting output would have R600 ISA in it.
If the processor name was missing or invalid for R600,
the wavefront size would not be set and there would be
crashes from missing itinerary data.
Fixes crashes in future commit caused by dividing by the unset/0
wavefront size.
llvm-svn: 271561
|
| |
|
|
|
|
|
| |
This fixes some verifier errors when trackLivenessAfterRegAlloc is
enabled.
llvm-svn: 271446
|
| |
|
|
|
|
|
| |
This saves an additional run of the DominatorTree and
MachineLoopInfo
llvm-svn: 271444
|
| |
|
|
|
|
| |
Also return a single StringRef instead of building a string.
llvm-svn: 271296
|
| |
|
|
| |
llvm-svn: 271081
|
| |
|
|
|
|
|
|
|
| |
Remove broken patterns matching it. This was matching the
unsafe math pattern and expanding the fix for the buggy instruction
from the pattern. The problems are also on CI. Remove the workarounds
and only use fract with unsafe math or from the intrinsic.
llvm-svn: 271078
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Register numbers may be specified as assembly-time expressions.
This feature can be useful in macros and alike. However, expressions
are supported within sqare braces only.
Sqare braces were initially intended to support specifying of multiple
(pairs/quads...) registers. Syntax like v[8:8] which specifies single register
is also supported. That allows expressions but looks a bit unnatural.
This change supports syntax REG[EXPR].
Tests added.
Differential Revision: http://reviews.llvm.org/D20588
llvm-svn: 270990
|
| |
|
|
|
|
|
| |
clang-tidy's performance-unnecessary-copy-initialization with some manual
fixes. No functional changes intended.
llvm-svn: 270988
|
| |
|
|
|
|
|
| |
Also fold conditions into assert(0) where it makes sense. No functional
change intended.
llvm-svn: 270982
|
| |
|
|
|
|
|
|
|
|
| |
Summary: Enable load-store-opt by default, and update LIT tests.
Reviewers: arsenm
Differential Revision: http://reviews.llvm.org/D20694
llvm-svn: 270894
|