| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
| |
BypassSlowDiv is used by CodeGenPrepare to insert a run-time check
that tests whether the operands of a 64-bit division are really 32-bit
values and, if they are, performs a 32-bit division instead.
This is not useful for R600, which has predicated control flow, since
both the 32-bit and 64-bit paths will be executed in most cases. It
also increases code size, which can lead to more instruction cache
misses.
llvm-svn: 218252
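The run-time check described in the message above can be pictured roughly as follows. This is only a sketch of the idea for an unsigned division, written as ordinary C++ rather than the IR the pass actually emits; the function name `udiv64WithBypass` is made up for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the guarded division that BypassSlowDiv produces.
uint64_t udiv64WithBypass(uint64_t dividend, uint64_t divisor) {
  assert(divisor != 0 && "division by zero is not handled in this sketch");
  // If neither operand has any of its upper 32 bits set, both fit in 32 bits.
  if (((dividend | divisor) >> 32) == 0) {
    // Fast path: narrow 32-bit division, result zero-extended on return.
    return static_cast<uint32_t>(dividend) / static_cast<uint32_t>(divisor);
  }
  // Slow path: full 64-bit division.
  return dividend / divisor;
}
```

On a target with predicated control flow, both arms of the `if` tend to be executed anyway, which is why the guard buys nothing there.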
| |
ISD::MUL and ISD::UMULO are the same except that UMULO also sets an
overflow bit. Since we aren't using the overflow bit, we should use ISD::MUL.
llvm-svn: 218251
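To illustrate why the two nodes are interchangeable when the overflow bit is ignored, here is a small model in plain C++. The names `mulWrap`, `mulOverflow` and `MulWithOverflow` are hypothetical and only mimic the semantics of a 32-bit multiply and a 32-bit unsigned multiply-with-overflow.

```cpp
#include <cstdint>

// Hypothetical model of an unsigned multiply-with-overflow result.
struct MulWithOverflow {
  uint32_t product;  // low 32 bits of the full product
  bool overflow;     // true if the full product does not fit in 32 bits
};

// Models a plain wrapping 32-bit multiply.
uint32_t mulWrap(uint32_t a, uint32_t b) {
  return static_cast<uint32_t>(static_cast<uint64_t>(a) * b);
}

// Models the same multiply, additionally reporting unsigned overflow.
MulWithOverflow mulOverflow(uint32_t a, uint32_t b) {
  uint64_t wide = static_cast<uint64_t>(a) * b;
  return {static_cast<uint32_t>(wide), (wide >> 32) != 0};
}
```

The `product` field always equals `mulWrap(a, b)`, so a caller that never reads `overflow` loses nothing by using the plain multiply.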
| |
llvm-svn: 218250
| |
llvm-svn: 218223
| |
llvm-svn: 218222
| |
llvm-svn: 218221
| |
In r217636, the value stored in KernelInfo.Num[VS]GPRSs was changed from
the highest GPR index used to the number of GPRs in order to be
consistent with the name of the variable.
The code writing the config values still assumed that the value in this
variable was the highest GPR index used, which caused the compiler to
over-report the number of GPRs being used.
https://bugs.freedesktop.org/show_bug.cgi?id=84089
llvm-svn: 218150
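The mismatch described above is an off-by-one between "highest index" and "count". The sketch below uses hypothetical function names, and the `+ 1` conversion is an assumption made for illustration; the actual arithmetic in the config-writing code is not shown in this log.

```cpp
#include <cstdint>

// Registers are numbered from 0, so a kernel whose highest register index
// is N uses N + 1 registers. All names here are hypothetical.
uint32_t countFromHighestIndex(uint32_t highestIndex) {
  return highestIndex + 1;
}

// Stale consumer: still treats the stored value as a highest index, even
// though the producer now stores a count.
uint32_t reportedByStaleConsumer(uint32_t storedValue) {
  return countFromHighestIndex(storedValue);
}

// Fixed consumer: takes the stored value as the count it now is.
uint32_t reportedByFixedConsumer(uint32_t storedCount) {
  return storedCount;
}
```

For a kernel using registers 0 through 7, the producer stores 8; the stale consumer would report 9, the fixed one 8.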
| |
Just do the left shift as unsigned to avoid the UB.
llvm-svn: 218092
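The pattern named above, shifting as unsigned, can be sketched like this. The helper name `shiftLeftNoUB` is hypothetical, and the sketch assumes the undefined behavior in question was a left shift of a negative or overflowing signed value; the affected code is not shown in this log.

```cpp
#include <cassert>
#include <cstdint>

// Left-shifts a signed 32-bit value without shifting a signed operand.
// Before C++20, `value << amount` is undefined for negative `value` or when
// the result overflows; the unsigned shift is always well defined.
int32_t shiftLeftNoUB(int32_t value, unsigned amount) {
  assert(amount < 32 && "shift amount must be smaller than the bit width");
  // The conversion back to int32_t is modular on two's-complement targets
  // (implementation-defined before C++20, guaranteed from C++20 on).
  return static_cast<int32_t>(static_cast<uint32_t>(value) << amount);
}
```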
| |
GCC in r218059, so these changes are no longer required.
llvm-svn: 218062
| |
I'm not sure what the hardware actually does, so don't
bother trying to fold it for now.
llvm-svn: 218057
| |
getSubtargetImpl from the base class. NFC.
llvm-svn: 218050
| |
shim between the TargetTransformInfo immutable pass and the Subtarget
via the TargetMachine and Function. Migrate a single call from
BasicTargetTransformInfo as an example and provide shims where TargetMachine
begins taking a Function to determine the subtarget.
No functional change.
llvm-svn: 218004
| |
Since read2 / write2 are emitted for 4-byte aligned 8-byte
accesses, these are seen by the scheduler.
The DAG scheduler is semi-deprecated, so just
ignore these for now.
llvm-svn: 217969
| |
llvm-svn: 217968
| |
This bug was reported by UBSan.
llvm-svn: 217967
| |
Only 1 decimal place should be printed for inline immediates.
Other constants should be hex constants.
Does not include f64 tests because folding those inline
immediates currently does not work.
llvm-svn: 217964
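The printing rule stated above (one decimal place for inline immediates, hex for everything else) might look roughly like the following. This is a sketch, not the printer's real code: the function names are hypothetical, the list of f32 inline immediates is an assumption about the hardware, and the exact hex formatting is a guess.

```cpp
#include <cstdint>
#include <cstdio>
#include <cstring>
#include <string>

// Assumed set of f32 values that can be encoded as inline immediates.
bool isInlineFloat(float v) {
  const float kInline[] = {0.5f, -0.5f, 1.0f, -1.0f,
                           2.0f, -2.0f, 4.0f, -4.0f};
  for (float c : kInline)
    if (v == c)
      return true;
  return false;
}

// Formats an f32 operand: inline immediates with one decimal place, any
// other constant as the hex value of its bit pattern.
std::string formatFloatOperand(float v) {
  char buf[32];
  if (isInlineFloat(v)) {
    std::snprintf(buf, sizeof buf, "%.1f", v);
  } else {
    uint32_t bits;
    std::memcpy(&bits, &v, sizeof bits);  // reinterpret without aliasing UB
    std::snprintf(buf, sizeof buf, "0x%08x", static_cast<unsigned>(bits));
  }
  return buf;
}
```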
| |
Instructions are now generally selected to the e64 forms originally,
and shrunk down later. Rename foldOperands to legalizeOperands,
since that's really most of what it tries to do.
llvm-svn: 217959
| |
llvm-svn: 217892
| |
Add some more tests to make sure better operand
choices are still made. Leave some cases that seem
to have no reason to ever be e64 alone.
llvm-svn: 217789
| |
llvm-svn: 217777
| |
llvm-svn: 217776
| |
There is already code trying to use it for getting
the offset.
llvm-svn: 217775
| |
llvm-svn: 217730
| |
The register numbers start at 0, so if only 1 register
was used, this was reported as 0.
llvm-svn: 217636
| |
Refactored the R600_LDS_1A2D class a bit to get it to actually work.
It seemed to be previously unused and broken.
We also have to disable the conversion to the noret variant for now in
R600ISelLowering because the getLDSNoRetOp method only handles 1A1D LDS ops.
Someone can feel free to modify the AMDGPU::getLDSNoRetOp method to
work for more than 1A1D variants of LDS operations. It's being left as a
future TODO for now.
Signed-off-by: Aaron Watry <awatry at gmail.com>
Reviewed-by: Matt Arsenault <matthew.arsenault@amd.com>
llvm-svn: 217596
| |
Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Matt Arsenault <matthew.arsenault@amd.com>
llvm-svn: 217594
| |
Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Matt Arsenault <matthew.arsenault@amd.com>
llvm-svn: 217593
| |
Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Matt Arsenault <matthew.arsenault@amd.com>
llvm-svn: 217592
| |
Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Matt Arsenault <matthew.arsenault@amd.com>
llvm-svn: 217591
| |
Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Matt Arsenault <matthew.arsenault@amd.com>
llvm-svn: 217590
| |
This was only present for SI before.
Cayman may still be missing, but I am unable to test that currently.
v2: Don't create atomicrmw max tests in separate file
Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Matt Arsenault <matthew.arsenault@amd.com>
CC: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 217589
| |
The lost chain resulted in earlier side-effecting nodes
being deleted.
llvm-svn: 217561
| |
The offset is in units of 64 elements and needs to be converted into
bytes, not scaled by just the element size as for the normal instructions.
Noticed by inspection. This can't be hit now because
st64 instructions aren't emitted during instruction selection,
and the post-RA scheduler isn't enabled.
llvm-svn: 217560
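The scaling difference described above comes down to one extra factor of 64. A minimal sketch, assuming the encoded offset of the normal instructions counts elements and that of the st64 variants counts groups of 64 elements; the helper name and parameters are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Converts an encoded LDS offset into a byte offset (hypothetical helper).
uint64_t ldsByteOffset(uint64_t encodedOffset, unsigned eltSizeBytes,
                       bool isSt64) {
  assert(eltSizeBytes > 0 && "element size must be non-zero");
  // Normal instructions: offset counts elements.
  // st64 instructions: offset counts groups of 64 elements.
  uint64_t scale = isSt64 ? 64ull * eltSizeBytes : eltSizeBytes;
  return encodedOffset * scale;
}
```

With 4-byte elements, an encoded offset of 2 is 8 bytes for a normal instruction and 512 bytes for an st64 one.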
| |
llvm-svn: 217553
| |
names controlling this variable.
"Unroll" is not the appropriate name for this variable. Clang already uses
the term "interleave" in pragmas and metadata for this.
Differential Revision: http://reviews.llvm.org/D5066
llvm-svn: 217528
| |
Assert in scheduler from an inserted copy_to_regclass from
a constant.
This only seems to break sometimes when a constant initializer
address is forced into VGPRs in a non-entry block. No test
since the only case I've managed to hit only happens with a future
patch, and that case will also not be a problem once scalar instructions
are used in non-entry blocks.
llvm-svn: 217380
| |
llvm-svn: 217379
| |
Only handles LDS atomics for now, and will be used
to replace atomics with no uses with the no return
versions.
llvm-svn: 217378
| |
llvm-svn: 217323
| |
This fixes hitting the same negative base offset problem
that was already fixed for regular loads and stores.
llvm-svn: 217256
| |
round halfway cases away from zero
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Tom Stellard <tom@stellard.net>
llvm-svn: 217250
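For reference, rounding halfway cases away from zero can be written in scalar C++ as below. This only illustrates the rounding rule named in the message; the helper name is hypothetical and the target lowering itself is not shown in this log.

```cpp
#include <cmath>

// Rounds to the nearest integer, with ties going away from zero.
double roundHalfAwayFromZero(double x) {
  double t = std::trunc(x);          // drop the fractional part toward zero
  double frac = std::fabs(x - t);    // magnitude of what was dropped
  if (frac >= 0.5)
    t += std::copysign(1.0, x);      // step one unit away from zero
  return t;
}
```

This matches the behavior of `std::round`, which can serve as a cross-check.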
| |
We must constrain the destination register class of legalized operands
to a VGPR class or else the illegal operand may be folded back into
the instruction by the register coalescer.
This fixes a bug in add.ll that will be uncovered by future commits.
llvm-svn: 217249
| |
https://bugs.freedesktop.org/show_bug.cgi?id=83416
llvm-svn: 217248
| |
llvm-svn: 217109
| |
Also fix a bug this exposed: when legalizing an immediate
operand, a v_mov_b32 would be created with a VSrc dest register.
llvm-svn: 217108
| |
llvm-svn: 217041
| |
This fixes a crash in the OpenCV test:
ImgprocWarpResizeArea/Resize.Mat/16
There is no test case for this, because this failure depends on a
specific ordering of the loads, which could easily change.
llvm-svn: 217040
| |
No functionality change. Changes made by clang-tidy + some manual cleanup.
llvm-svn: 217028
| |
Approved by Jim Grosbach, Lang Hames, Rafael Espindola.
This reinstates commits r215111, 215115, 215116, 215117, 215136.
llvm-svn: 216982
| |
llvm-svn: 216823
|