summaryrefslogtreecommitdiffstats
path: root/llvm/test/CodeGen/NVPTX/fast-math.ll
Commit message (Collapse)AuthorAgeFilesLines
* [NVPTX] Enable combineRepeatedFPDivisors for NVPTX.Justin Lebar2017-02-031-0/+44
| | | | | | | | | | Reviewers: tra Subscribers: jholewinski, llvm-commits Differential Revision: https://reviews.llvm.org/D29477 llvm-svn: 294011
* [NVPTX] Compute approx sqrt as 1/rsqrt(x) rather than x*rsqrt(x).Justin Lebar2017-01-311-2/+2
| | | | | | | | | | x*rsqrt(x) returns NaN for x == 0, whereas 1/rsqrt(x) returns 0, as desired. Verified that the particular nvptx approximate instructions here do in fact return 0 for x = 0. llvm-svn: 293713
* [NVPTX] Implement NVPTXTargetLowering::getSqrtEstimate.Justin Lebar2017-01-311-5/+71
| | | | | | | | | | | | | | | | | | | | | | | | Summary: This lets us lower to sqrt.approx and rsqrt.approx under more circumstances. * Now we emit sqrt.approx and rsqrt.approx for calls to @llvm.sqrt.f32, when fast-math is enabled. Previously, we only would emit it for calls to @llvm.nvvm.sqrt.f. (With this patch we no longer emit sqrt.approx for calls to @llvm.nvvm.sqrt.f; we rely on intcombine to simplify llvm.nvvm.sqrt.f into llvm.sqrt.f32.) * Now we emit the ftz version of rsqrt.approx when ftz is enabled. Previously, we only emitted rsqrt.approx when ftz was disabled. Reviewers: hfinkel Subscribers: llvm-commits, tra, jholewinski Differential Revision: https://reviews.llvm.org/D28508 llvm-svn: 293605
* [NVPTX] Only lower sin/cos to approximate instructions if unsafe math is ↵Artem Belevich2017-01-131-0/+17
| | | | | | | | | | | | | | allowed. Previously we'd always lower @llvm.{sin,cos}.f32 to {sin.cos}.approx.f32 instruction even when unsafe FP math was not allowed. Clang-generated IR is not affected by this as it uses precise sin/cos from CUDA's libdevice when unsafe math is disabled. Differential Revision: https://reviews.llvm.org/D28619 llvm-svn: 291936
* [TM] Restore default TargetOptions in TargetMachine::resetTargetOptions.Justin Lebar2017-01-101-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Previously if you had * a function with the fast-math-enabled attr, followed by * a function without the fast-math attr, the second function would inherit the first function's fast-math-ness. This means that mixing fast-math and non-fast-math functions in a module was completely broken unless you explicitly annotated every non-fast-math function with "unsafe-fp-math"="false". This appears to have been broken since r176986 (March 2013), when the resetTargetOptions function was introduced. This patch tests the correct behavior as best we can. I don't think I can test FPDenormalMode and NoTrappingFPMath, because they aren't used in any backends during function lowering. Surprisingly, I also can't find any uses at all of LessPreciseFPMAD affecting generated code. The NVPTX/fast-math.ll test changes are an expected result of fixing this bug. When FMA is disabled, we emit add as "add.rn.f32", which prevents fma combining. Before this patch, fast-math was enabled in all functions following the one which explicitly enabled it on itself, so we were emitting plain "add.f32" where we should have generated "add.rn.f32". Reviewers: mkuper Subscribers: hfinkel, majnemer, jholewinski, nemanjai, llvm-commits Differential Revision: https://reviews.llvm.org/D28507 llvm-svn: 291618
* [NVPTX] Add CHECK-LABEL where appropriate to fast-math.ll test.Justin Lebar2017-01-101-9/+4
| | | | | | | | Also fix up whitespace. Test-only change. llvm-svn: 291617
* [NVPTX] Use approximate FP ops when unsafe-fp-math is used, and appendJustin Holewinski2013-07-221-0/+43
.ftz to instructions if the nvptx-f32ftz attribute is set to "true" llvm-svn: 186820
OpenPOWER on IntegriCloud