| Commit message | Author | Age | Files | Lines |
llvm-svn: 239657
llvm-svn: 239377
Now that we sometimes know the address space, this can
theoretically do a better job.
This needs better test coverage, but this mostly depends on
first updating the loop optimizations to provide the address
space.
llvm-svn: 239053
Mostly argument loads were producing broken zextloads
from an FP type.
llvm-svn: 239049
llvm-svn: 238789
This is important because of different addressing modes
depending on the address space for GPU targets.
This only adds the argument, and does not update
any of the uses to provide the correct address space.
llvm-svn: 238723
Instead add m0 as an implicit operand. This helps avoid spills
of the m0 register in some cases.
llvm-svn: 237140
Instead add m0 as an implicit operand. This allows us to avoid using
the M0Reg register class and eliminates a number of unnecessary spills
when using s_sendmsg instructions. This impacts one shader in the
shader-db:
SGPRS: 48 -> 40 (-16.67 %)
VGPRS: 112 -> 108 (-3.57 %)
Code Size: 40132 -> 38796 (-3.33 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Scratch: 2048 -> 0 (-100.00 %) bytes per wave
llvm-svn: 237133
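The percentages quoted in the message above follow from the raw before/after counts. A minimal sketch of that arithmetic, assuming the usual relative-change formula; the helper name `pct_change` is hypothetical and is not part of shader-db:

```python
def pct_change(old, new):
    """Relative change from old to new, in percent (hypothetical helper)."""
    if old == 0:
        # Assumption: an unchanged zero baseline (e.g. LDS 0 -> 0) reports 0 %.
        return 0.0 if new == 0 else float("inf")
    return (new - old) / old * 100.0


# Before/after values copied from the commit message above.
stats = {
    "SGPRS": (48, 40),
    "VGPRS": (112, 108),
    "Code Size": (40132, 38796),
    "LDS": (0, 0),
    "Scratch": (2048, 0),
}

for name, (old, new) in stats.items():
    print(f"{name}: {old} -> {new} ({pct_change(old, new):.2f} %)")
```

The zero-baseline branch is a choice made for this sketch so that the `LDS: 0 -> 0` row does not divide by zero.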
[DebugInfo] Add debug locations to constant SD nodes
This adds debug location to constant nodes of Selection DAG and updates
all places that create constants to pass debug locations
(see PR13269).
Can't guarantee that all locations are correct, but in a lot of cases choice
is obvious, so most of them should be. At least all tests pass.
Tests for these changes do not cover everything, instead just check it for
SDNodes, ARM and AArch64 where it's easy to get incorrect locations on
constants.
This is not a complete fix as FastISel contains a workaround for wrong debug
locations, which drops locations from instructions on processing constants,
but there isn't currently a way to use debug locations from constants there
as llvm::Constant doesn't cache it (yet). Although this is a bit different
issue, not directly related to these changes.
Differential Revision: http://reviews.llvm.org/D9084
llvm-svn: 235989
This breaks a test:
http://bb.pgr.jp/builders/cmake-llvm-x86_64-linux/builds/23870
llvm-svn: 235987
This adds debug location to constant nodes of Selection DAG and updates
all places that create constants to pass debug locations
(see PR13269).
Can't guarantee that all locations are correct, but in a lot of cases choice
is obvious, so most of them should be. At least all tests pass.
Tests for these changes do not cover everything, instead just check it for
SDNodes, ARM and AArch64 where it's easy to get incorrect locations on
constants.
This is not a complete fix as FastISel contains a workaround for wrong debug
locations, which drops locations from instructions on processing constants,
but there isn't currently a way to use debug locations from constants there
as llvm::Constant doesn't cache it (yet). Although this is a bit different
issue, not directly related to these changes.
Differential Revision: http://reviews.llvm.org/D9084
llvm-svn: 235977
v2: Add tests
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
reviewer: arsenm
llvm-svn: 234716
This is currently considered experimental, but most of the more
commonly used instructions should work.
So far only SI has been extensively tested, CI and VI probably work too,
but may be buggy. The current set of test cases does not give complete
coverage, but I think it is sufficient for an experimental assembler.
See the documentation in R600Usage for more information.
llvm-svn: 234381
Other f64 opcodes not supported on SI can be lowered in a similar way.
v2: use complex VOP3 patterns
llvm-svn: 233076
V_FRACT is buggy on SI.
R600-specific code is left intact.
v2: drop the multiclass, use complex VOP3 patterns
llvm-svn: 233075
There are no opcodes for this. This also adds a test case.
v2: make test more robust
Patch by: Grigori Goronzy
llvm-svn: 232386
classes.
llvm-svn: 231954
In theory this allows the compiler to skip materializing the array on
the stack. In practice clang often fails to do that, but that's a
different story. NFC.
llvm-svn: 231571
This required plumbing a TargetRegisterInfo through computeRegisterProperties
and into findRepresentativeClass which uses it for register class
iteration. This required passing a subtarget into a few target specific
initializations of TargetLowering.
llvm-svn: 230583
We legalize mubuf instructions post-instruction selection, so this
code is no longer needed.
llvm-svn: 230352
Everyone except R600 was manually passing the length of a static array
at each callsite, calculated in a variety of interesting ways. Far
easier to let ArrayRef handle that.
There should be no functional change, but out of tree targets may have
to tweak their calls as with these examples.
llvm-svn: 230118
The expansion code does the same thing. Since
the operands were not defined with the correct
types, this has the side effect of fixing operand
folding since the expanded pseudo would never use
SGPRs or inline immediates.
llvm-svn: 230072
This enables a few useful combines that used to only
use fma.
Also since v_mad_f32 apparently does not support denormals,
disable the existing cases that are custom handled if they are
requested.
llvm-svn: 230071
Same functionality, but hoists the vector growth out of the loop.
llvm-svn: 229500
This adds a safe interface to the machine independent InputArg struct
for accessing the index of the original (IR-level) argument. When a
non-native return type is lowered, we generate the hidden
machine-level sret argument on-the-fly. Before this fix, we were
representing this argument as OrigArgIndex == 0, which is an outright
lie. In particular this crashed in the AArch64 backend where we
actually try to access the type of the original argument.
Now we use a sentinel value for machine arguments that have no
original argument index. AArch64, ARM, Mips, and PPC now check for this
case before accessing the original argument.
Fixes <rdar://19792160> Null pointer assertion in AArch64TargetLowering
llvm-svn: 229413
This version passes the OpenCL conformance test.
llvm-svn: 229239
This requires considering the size of the operand when
checking immediate legality.
llvm-svn: 229135
We were previously hard-coding soffset to 0.
llvm-svn: 228775
llvm-svn: 228190
v2i32, i32, trunc i32 to i16, and trunc i32 to i8 stores are legal for
all address spaces. We had marked them as custom in order to lower
them for the private address space, but this is no longer necessary.
This enables lowering of misaligned stores of these types in the
DAGLegalizer.
llvm-svn: 228189
This is true for SI only. CI+ supports unaligned memory accesses,
but this requires driver support, so for now we disallow unaligned
accesses for all GCN targets.
llvm-svn: 227822
without a Function argument.
llvm-svn: 227638
Add tests for the various combines. This should
always be at least cycle neutral on all subtargets for f64,
and faster on some. For f32 we should prefer selecting
v_mad_f32 over v_fma_f32.
llvm-svn: 227484
llvm-svn: 227483
This is disabled by default, but can be enabled with the subtarget
feature: 'vgpr-spilling'
llvm-svn: 226597
Don't do the v4i8 -> v4f32 combine if the load will need to
be expanded due to alignment. This stops adding instructions
to repack into a single register that the v_cvt_ubyteN_f32
instructions read.
llvm-svn: 225926
Now that the source and destination types can be specified,
allow doing an expansion that doesn't use an EXTLOAD of the
result type. Try to do a legal extload to an intermediate type
and extend that if possible.
This generalizes the special case custom lowering of extloads
R600 has been using to work around this problem.
This also happens to fix a bug that would incorrectly use more
aligned loads than should be used.
llvm-svn: 225925
The backend now assumes that all immediates are integers. This allows
us to simplify immediate handling code, because we no longer need to
handle fp and integer immediates differently.
llvm-svn: 225844
None of these are legal types already, so they default to
Expand.
llvm-svn: 225728
There are some operands which can take either immediates or registers
and we were previously using different register class to distinguish
between operands that could take immediates and those that could not.
This patch switches to using RegisterOperands which should simplify the
backend by reducing the number of register classes and also make it
easier to implement the assembler.
llvm-svn: 225662
Its functionality has been replaced by calling
SIInstrInfo::legalizeOperands() from
SIISelLowering::AdjustInstrPostInstrSelection() and running the
SIFoldOperands and SIShrinkInstructions passes.
llvm-svn: 225445
type (in addition to the memory type).
The *LoadExt* legalization handling used to only have one type, the
memory type. This forced users to assume that as long as the extload
for the memory type was declared legal, and the result type was legal,
the whole extload was legal.
However, this isn't always the case. For instance, on X86, with AVX,
this is legal:
v4i32 load, zext from v4i8
but this isn't:
v4i64 load, zext from v4i8
Whereas v4i64 is (arguably) legal, even without AVX2.
Note that the same thing was done a while ago for truncstores (r46140),
but I assume no one needed it yet for extloads, so here we go.
Calls to getLoadExtAction were changed to add the value type, found
manually in the surrounding code.
Calls to setLoadExtAction were mechanically changed, by wrapping the
call in a loop, to match previous behavior. The loop iterates over
the MVT subrange corresponding to the memory type (FP vectors, etc...).
I also pulled neighboring setTruncStoreActions into some of the loops;
those shouldn't make a difference, as the additional types are illegal.
(e.g., i128->i1 truncstores on PPC.)
No functional change intended.
Differential Revision: http://reviews.llvm.org/D6532
llvm-svn: 225421
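As a rough illustration of the change described in this message, the legality lookup can be thought of as keyed on both the result type and the memory type. The sketch below is a toy Python model, not LLVM code: the `ExtLoadTable` class, its method names, and the `sext`/`Promote` loop entries are made up for illustration, and only the two AVX examples are taken from the message.

```python
LEGAL, EXPAND, PROMOTE = "Legal", "Expand", "Promote"


class ExtLoadTable:
    """Toy model of an action table keyed on (ext kind, value type, memory type)."""

    def __init__(self):
        self._actions = {}

    def set_action(self, ext, val_ty, mem_ty, action):
        self._actions[(ext, val_ty, mem_ty)] = action

    def get_action(self, ext, val_ty, mem_ty):
        # Assumption for this sketch: unlisted combinations are treated as Legal.
        return self._actions.get((ext, val_ty, mem_ty), LEGAL)


table = ExtLoadTable()

# The case from the message: zext v4i8 -> v4i64 is not legal with plain AVX,
# even though zext v4i8 -> v4i32 is.
table.set_action("zext", "v4i64", "v4i8", EXPAND)

# A call that used to name only the memory type can be wrapped in a loop over
# value types to keep the previous behavior (illustrative types and action).
for val_ty in ("i16", "i32", "i64"):
    table.set_action("sext", val_ty, "i1", PROMOTE)

print(table.get_action("zext", "v4i32", "v4i8"))
print(table.get_action("zext", "v4i64", "v4i8"))
```

With a single-type key, the two `zext` queries would have to return the same answer; adding the value type to the key is what lets them differ.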
Use VGPR_32 register class instead. These two register classes were
identical and having separate classes was causing
SIInstrInfo::isLegalOperands() to be overly conservative in some cases.
This change is necessary to prevent future patches from missing a folding
opportunity in fneg-fabs.ll.
llvm-svn: 225382
llvm-svn: 225310
llvm-svn: 225307
llvm-svn: 225306
Extend the existing code which handles this for zext. This makes this
more useful for targets with ZeroOrNegativeOne BooleanContent and
obsoletes a custom combine SI uses for i1 setcc (sext(i1), 0, setne)
since the constant will now be shrunk to i1.
llvm-svn: 224691
llvm-svn: 224458
This is nice for the instruction patterns, but it complicates
min / max matching. The select doesn't have the correct type and would
require looking through the bitcasts for the real float operands.
llvm-svn: 224092
llvm-svn: 224067