path: root/llvm/test/Bitcode/diglobalvariable-3.8.ll
author: Stanislav Mekhanoshin <Stanislav.Mekhanoshin@amd.com>, 2017-07-06 20:34:21 +0000
committer: Stanislav Mekhanoshin <Stanislav.Mekhanoshin@amd.com>, 2017-07-06 20:34:21 +0000
commit: 9d7b1c9ddba60d1351931c9bf9ae08b1c1b18f2c
tree: 7ad7f2862a00db4140c41c1f38dde12876f21831 /llvm/test/Bitcode/diglobalvariable-3.8.ll
parent: e9b5857db3402213193f5cefdabe847b5af41a77
[AMDGPU] Always use rcp + mul with fast math
Regardless of relaxation options such as -cl-fast-relaxed-math, we produce rather long code for fdiv via the amdgcn_fdiv_fast intrinsic. This intrinsic is used to replace an fdiv that carries 2.5ulp metadata; it does not handle denormals and is therefore believed to be fast.

An fdiv instruction can also carry the fast math flag, either by itself or together with fpmath metadata. Clang, when used with a relaxation flag, always produces both the metadata and the fast flag:

  %div = fdiv fast float %v, %0, !fpmath !12
  !12 = !{float 2.500000e+00}

The current implementation ignores the fast flag and favors the metadata. An instruction with just the fast flag would be lowered to the fastest form, rcp + mul, but that never happens in practice because of the combined Clang and backend (BE) behavior described above.

This change allows an "fdiv fast" to always be lowered as rcp + mul.

Differential Revision: https://reviews.llvm.org/D34844

llvm-svn: 307308
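As a rough numerical illustration of what "rcp + mul" means, the sketch below compares a single-precision division rounded once with a reciprocal followed by a multiply. It is not the backend's code: the helper names (f32, div_exact, div_rcp_mul, ulp_distance) are hypothetical, and the reciprocal is modeled as correctly rounded, which is an assumption; a hardware reciprocal instruction may have its own error.

```python
import struct


def f32(x: float) -> float:
    """Round a Python float (binary64) to the nearest binary32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def f32_bits(x: float) -> int:
    """Return the raw binary32 bit pattern of x as an unsigned integer."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def div_exact(x: float, y: float) -> float:
    """Single-precision x / y, rounded once."""
    return f32(f32(x) / f32(y))


def div_rcp_mul(x: float, y: float) -> float:
    """Single-precision x * (1 / y): a reciprocal followed by a multiply.

    Assumption: the reciprocal is modeled as correctly rounded; a real
    hardware reciprocal may be less accurate than this.
    """
    rcp = f32(1.0 / f32(y))    # first rounding: the reciprocal
    return f32(f32(x) * rcp)   # second rounding: the product


def ulp_distance(a: float, b: float) -> int:
    """Number of binary32 steps between two finite values of the same sign."""
    return abs(f32_bits(a) - f32_bits(b))


if __name__ == "__main__":
    # Print both results and their distance for a few sample operands.
    for x, y in [(1.0, 3.0), (2.0, 7.0), (10.0, 49.0)]:
        exact, fast = div_exact(x, y), div_rcp_mul(x, y)
        print(f"{x}/{y}: div={exact!r} rcp*mul={fast!r} "
              f"ulps={ulp_distance(exact, fast)}")
```

Because the reciprocal and the product are each rounded, the rcp + mul result can differ from the once-rounded quotient in the last place for some operands. Divisors that are powers of two are unaffected, since their reciprocals are exact.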
Diffstat (limited to 'llvm/test/Bitcode/diglobalvariable-3.8.ll')
0 files changed, 0 insertions, 0 deletions