summaryrefslogtreecommitdiffstats
path: root/llvm/test/CodeGen/AMDGPU/cttz_zero_undef.ll
Commit message (Collapse)AuthorAgeFilesLines
* [DAGCombiner][AMDGPU][X86] Turn cttz/ctlz into ↵Craig Topper2018-02-061-1/+2
| | | | | | | | | | | | cttz_zero_undef/ctlz_zero_undef if we can prove the input is never zero X86 currently has a late DAG combine after cttz/ctlz are turned into BSR+BSF+CMOV to detect this and remove the CMOV. But we should be able to do this much earlier and avoid creating the cmov all together. For the changed AMDGPU test case it appears that previously the i8 cttz was type legalized to i16 which introduced an OR with 256 in order to limit the result to 8 on the widened type. At this point the result is known to never be zero, but nothing checked that. Then operation legalization is told to promote all i16 cttz to i32. This introduces an extend and a truncate and another OR with 65536 to limit the result to 16. With the DAG combiner change we are able to prevent the creation of the second OR since the opcode will have been changed to cttz_zero_undef after the first OR. I the lack of the OR caused the instruction to change to v_ffbl_b32_sdwa Differential Revision: https://reviews.llvm.org/D42985 llvm-svn: 324427
* AMDGPU : Fix an error for the llvm.cttz implementation.Wei Ding2017-10-171-0/+3
| | | | | | Differential Revision: http://reviews.llvm.org/D39014 llvm-svn: 316037
* Implement custom lowering for ISD::CTTZ_ZERO_UNDEF and ISD::CTTZ.Wei Ding2017-10-121-3/+191
| | | | | | Differential Revision: http://reviews.llvm.org/D37348 llvm-svn: 315610
* [AMDGPU] Switch scalarize global loads ON by defaultAlexander Timofeev2017-07-041-6/+13
| | | | | | Differential revision: https://reviews.llvm.org/D34407 llvm-svn: 307097
* Revert r307026, "[AMDGPU] Switch scalarize global loads ON by default"NAKAMURA Takumi2017-07-041-13/+6
| | | | | | | | | It broke a testcase. Failing Tests (1): LLVM :: CodeGen/AMDGPU/alignbit-pat.ll llvm-svn: 307054
* [AMDGPU] Switch scalarize global loads ON by defaultAlexander Timofeev2017-07-031-6/+13
| | | | | | Differential revision: https://reviews.llvm.org/D34407 llvm-svn: 307026
* AMDGPU: Mark all unspecified CC functions in tests as amdgpu_kernelMatt Arsenault2017-03-211-4/+4
| | | | | | | | | | | | Currently the default C calling convention functions are treated the same as compute kernels. Make this explicit so the default calling convention can be changed to a non-kernel. Converted with perl -pi -e 's/define void/define amdgpu_kernel void/' on the relevant test directories (and undoing in one place that actually wanted a non-kernel). llvm-svn: 298444
* Enable FeatureFlatForGlobal on Volcanic IslandsMatt Arsenault2017-01-241-2/+2
| | | | | | | | | | | This switches to the workaround that HSA defaults to for the mesa path. This should be applied to the 4.0 branch. Patch by Vedran Miletić <vedran@miletic.net> llvm-svn: 292982
* R600 -> AMDGPU renameTom Stellard2015-06-131-0/+71
llvm-svn: 239657
OpenPOWER on IntegriCloud