path: root/llvm/test/CodeGen/AMDGPU/clamp.ll
Commit message | Author | Age | Files | Lines
* AMDGPU: Replace i64 add/sub lowering | Matt Arsenault | 2017-11-15 | 1 | -1/+2

    Use VOP3 add/addc like usual. This has some tradeoffs. Inline immediates fold a little better, but other constants are worse off. SIShrinkInstructions could be made smarter to handle these cases.

    This allows us to avoid selecting scalar adds where we need to track the carry in scc and replace its users. This makes it easier to use the carryless VALU adds.

    llvm-svn: 318340
* AMDGPU: Do not fold clamp instructions when sources are different | Matt Arsenault | 2017-10-05 | 1 | -0/+22

    Patch by hakzsam (Samuel Pitoiset)

    llvm-svn: 314951
* AMDGPU: Select clamp pattern with v2f16 | Matt Arsenault | 2017-08-30 | 1 | -34/+190

    llvm-svn: 312087
* [AMDGPU] Remove getBidirectionalReasonRank | Stanislav Mekhanoshin | 2017-03-11 | 1 | -2/+2

    This method inverts the Reason field of a scheduling candidate. It does right comparison between RegCritical and RegExcess, but everything else is broken. In fact it can prefer less strong reason such as Weak over RegCritical because Weak > -RegCritical.

    The CandReason enum is properly sorted, so just remove artificial ranking.

    Differential Revision: https://reviews.llvm.org/D30557

    llvm-svn: 297536
* AMDGPU: Use clamp with f64 | Matt Arsenault | 2017-02-22 | 1 | -6/+3

    llvm-svn: 295908
* AMDGPU: Fold FP clamp as modifier bit | Matt Arsenault | 2017-02-22 | 1 | -9/+6

    The manual is unclear on the details of this. It's not clear to me if denormals are not allowed with clamp, or if that is only omod. Not allowing denorms for fp16 or fp64 isn't useful so I also question if that is really a restriction. Same with whether this is valid without IEEE mode enabled.

    llvm-svn: 295905
* AMDGPU: Redefine clamp node as clamp 0.0-1.0 | Matt Arsenault | 2017-02-21 | 1 | -0/+535

    Change implementation to use max instead of add. min/max/med3 do not flush denormals regardless of the mode, so it is OK to use it whether or not they are enabled.

    Also allow using clamp with f16, and use knowledge of dx10_clamp.

    llvm-svn: 295788