| Commit message | Author | Age | Files | Lines |
|
If there is a single-use constant, it can be folded into the
min/max, but not into med3.
llvm-svn: 342443
|
Add a parameter for testing specifically for
sNaNs - at least one instruction pattern on AMDGPU
needs to check specifically for this.
Also handle more cases, and add a target hook
for custom nodes, similar to the hooks for known
bits.
llvm-svn: 338910
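
The distinction this message draws between NaNs in general and sNaNs in particular can be illustrated on raw IEEE 754 binary32 bit patterns. The following is a minimal sketch, not the LLVM implementation: the function names are hypothetical, and it assumes the common encoding in which the most significant mantissa bit (bit 22) is set for a quiet NaN and clear for a signaling NaN.

```python
# Hypothetical helpers; they operate on 32-bit integer bit patterns so that
# Python's own float handling cannot quiet a signaling NaN along the way.

EXP_MASK = 0x7F800000    # all eight exponent bits
MANT_MASK = 0x007FFFFF   # all 23 mantissa bits
QUIET_BIT = 0x00400000   # assumed: top mantissa bit set means quiet NaN


def is_nan_bits(bits):
    """True if the binary32 pattern encodes any NaN (quiet or signaling)."""
    return (bits & EXP_MASK) == EXP_MASK and (bits & MANT_MASK) != 0


def is_signaling_nan_bits(bits):
    """True only for NaN patterns whose quiet bit is clear."""
    return is_nan_bits(bits) and (bits & QUIET_BIT) == 0
```

A check that asks "never any NaN" is stricter than one that asks "never an sNaN": the pattern `0x7FC00000` fails the first but passes the second.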
|
- Add gfx704
- Change bonaire to gfx704
- Remove gfx804
- Remove gfx901
- Remove gfx903
Differential Revision: https://reviews.llvm.org/D40046
llvm-svn: 320194
|
llvm-svn: 318004
|
llvm-svn: 309470
|
Immediates can be folded as long as the immediate is a vreg.
Also undo commuting instructions if it didn't fold an immediate.
llvm-svn: 307575
|
Turn an expensive 64-bit shift into a 32-bit shift if the shift does not overflow int:
shl (ext x) => zext (shl x)
Differential Revision: https://reviews.llvm.org/D33367
llvm-svn: 303569
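
The rewrite `shl (ext x) => zext (shl x)` can be checked on plain integers. This is a minimal sketch assuming a zero-extended 32-bit input; the helper names are hypothetical and the code only models the arithmetic, not the compiler transform itself.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def shl64_of_zext(x, n):
    """Zero-extend a 32-bit value to 64 bits, then shift left by n."""
    return ((x & MASK32) << n) & MASK64


def zext_of_shl32(x, n):
    """Shift left by n in 32 bits (dropping overflow), then zero-extend."""
    return ((x & MASK32) << n) & MASK32


def shift_fits_in_32(x, n):
    """True when no set bit of x is shifted past bit 31."""
    return 0 <= n < 32 and ((x & MASK32) << n) <= MASK32
```

When `shift_fits_in_32(x, n)` holds, both forms return the same value, for example with `x = 0x0000FFFF` and `n = 16`. With `x = 0x0001FFFF` and `n = 16` the top bit is lost in the 32-bit form, so the rewrite would not be valid.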
|
Currently the default C calling convention functions are treated
the same as compute kernels. Make this explicit so the default
calling convention can be changed to a non-kernel.
Converted with perl -pi -e 's/define void/define amdgpu_kernel void/'
on the relevant test directories (and undoing in one place that actually
wanted a non-kernel).
llvm-svn: 298444
|
llvm-svn: 296401
|
I think this is safe as long as no inputs are known to ever
be NaNs.
Also add an intrinsic for fmed3 to be able to handle all safe
math cases.
llvm-svn: 293598
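
Why a no-NaN condition matters for a median-of-three operation can be sketched as follows. The commit title is not shown here, so this assumes the change concerns treating a `min(max(x, lo), hi)` clamp as a median of three; `med3` and `clamp_min_max` are hypothetical names, and Python's `min`/`max` only stand in for the hardware operations.

```python
import math


def med3(a, b, c):
    """Median of three values, by sorting; only meaningful without NaNs."""
    return sorted((a, b, c))[1]


def clamp_min_max(x, lo, hi):
    """Clamp written as min(max(x, lo), hi)."""
    return min(max(x, lo), hi)


def agree_without_nans(samples, lo, hi):
    """Check that both forms give the same result for NaN-free samples."""
    return all(med3(x, lo, hi) == clamp_min_max(x, lo, hi) for x in samples)
```

For NaN-free inputs with `lo <= hi` the two forms agree. Once a NaN is involved, comparisons against it are always false, so the result of a min/max chain depends on operand order: in Python, `max(math.nan, 0.0)` is NaN while `max(0.0, math.nan)` is `0.0`. No single median value matches both, which is why the equivalence needs the no-NaN assumption.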
|
llvm-svn: 292328
|
Also fix v_mac.ll not testing the right thing for fneg
llvm-svn: 275129
|
PeepholeOptimizer cleans up redundant copies, which makes
the operand folding more effective.
shader-db stats:
Totals:
SGPRS: 34200 -> 34336 (0.40 %)
VGPRS: 22118 -> 21655 (-2.09 %)
Code Size: 632144 -> 633460 (0.21 %) bytes
LDS: 11 -> 11 (0.00 %) blocks
Scratch: 10240 -> 11264 (10.00 %) bytes per wave
Max Waves: 8822 -> 8918 (1.09 %)
Wait states: 0 -> 0 (0.00 %)
Totals from affected shaders:
SGPRS: 7704 -> 7840 (1.77 %)
VGPRS: 5169 -> 4706 (-8.96 %)
Code Size: 234444 -> 235760 (0.56 %) bytes
LDS: 2 -> 2 (0.00 %) blocks
Scratch: 0 -> 1024 (0.00 %) bytes per wave
Max Waves: 1188 -> 1284 (8.08 %)
Wait states: 0 -> 0 (0.00 %)
Increases:
SGPRS: 35 (0.01 %)
VGPRS: 1 (0.00 %)
Code Size: 59 (0.02 %)
LDS: 0 (0.00 %)
Scratch: 1 (0.00 %)
Max Waves: 48 (0.02 %)
Wait states: 0 (0.00 %)
Decreases:
SGPRS: 26 (0.01 %)
VGPRS: 54 (0.02 %)
Code Size: 68 (0.03 %)
LDS: 0 (0.00 %)
Scratch: 0 (0.00 %)
Max Waves: 4 (0.00 %)
Wait states: 0 (0.00 %)
llvm-svn: 266378
|
llvm-svn: 259090
|
llvm-svn: 259089
|