summaryrefslogtreecommitdiffstats
path: root/libclc/generic/lib/SOURCES
Commit message (Collapse)AuthorAgeFilesLines
...
* Implement step builtinTom Stellard2015-03-021-0/+1
| | | | | | This has been tested with piglit, OpenCV, and the ocl conformance tests. llvm-svn: 230970
* Implement smoothstep builtin v2Tom Stellard2015-03-021-0/+1
| | | | | | | | | This has been tested with piglit, OpenCV, and the ocl conformance tests. v2: - Fix typo in smoothstep.h llvm-svn: 230969
* Implement radians builtin v2Tom Stellard2015-03-021-0/+1
| | | | | | | | | This has been tested with piglit, OpenCV, and the ocl conformance tests. v2: - Move to the common/ directory llvm-svn: 230968
* Implement degrees builtin v2Tom Stellard2015-03-021-0/+1
| | | | | | | | | This has been tested with piglit, OpenCV, and the ocl conformance tests. v2: - Move to the common/ directory llvm-svn: 230967
* libclc/math: Add cospiAaron Watry2015-02-261-0/+1
| | | | | | | | | | | | | | | | | | | Ported from the libclc/amd-builtins branch v2: Rename sincos_f_piby4 to __libclc__sincosf_piby4 Add cospi(double) implementation instead of using llvm.cos Notes: The sincosD_piby4.h file is mostly the same as the builtin implementation released by AMD. The inline attribute declaration is changed, and M_PI is used instead of a constant double. Otherwise, the only difference is that the header explicitly enables the fp64 pragma. Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Jeroen Ketema <j.ketema@imperial.ac.uk> CC: Tom Stellard <tom@stellard.net> CC: Matt Arsenault <Matthew.Arsenault@amd.com> llvm-svn: 230641
* Implement log10Jan Vesely2015-01-301-0/+1
| | | | | | | | v2: Use constant and multiplication instead of division v3: Use hex constants Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 227585
* Implement log1p builtinTom Stellard2014-10-071-0/+2
| | | | llvm-svn: 219230
* Implement fmodJan Vesely2014-10-051-0/+1
| | | | | | | Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Tom Stellard <tom@stellard.net> Reviewed-by: Aaron Watry <awatry@gmail.com> llvm-svn: 219087
* Implement async_work_group_copy builtin v3Tom Stellard2014-10-031-0/+1
| | | | | | | | | | | | | This is a simple implementation which just copies data synchronously. v2: - Use size_t. v3: - Fix possible race condition by splitting the copy among multiple work items. llvm-svn: 219008
* Implement async_work_group_strided_copy builtin v2Tom Stellard2014-10-031-0/+1
| | | | | | | | | This is a simple implementation which just copies data synchronously. v2: - Use size_t. llvm-svn: 219007
* Implement wait_group_events builtin v2Tom Stellard2014-10-031-0/+1
| | | | | | | | | This is a simple default implemetation which just calls barrier(). v2: - Only call barrier() once. llvm-svn: 219006
* atomic: Add generic atom[ic]_cmpxchgAaron Watry2014-09-161-0/+2
| | | | | | Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 217918
* atomic: Implement generic atom[ic]_xchgAaron Watry2014-09-161-0/+3
| | | | | | Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 217917
* atomic: Add generic atomic_min implementationAaron Watry2014-09-161-0/+2
| | | | | | Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 217916
* atomic: Add generic atom[ic]_xorAaron Watry2014-09-161-0/+2
| | | | | | Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 217915
* atomic: Add atom[ic]_orAaron Watry2014-09-161-0/+2
| | | | | | Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 217914
* atomics: Add generic atom[ic]_andAaron Watry2014-09-161-0/+2
| | | | | | | | Not used yet. Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 217913
* atomic: Add generic implementation of atom[ic]_maxAaron Watry2014-09-161-0/+2
| | | | | | | | | | Not used yet... v2: Correct int/uint behavior Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 217912
* atomic: define extension functions for existing atomic implementationsAaron Watry2014-09-161-0/+4
| | | | | | | | We were missing the local versions of the atom_* before Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 217911
* math: Add tan implementationAaron Watry2014-09-101-0/+1
| | | | | | | | | | | | | | | | | Uses the algorithm: tan(x) = sin(x) / sqrt(1-sin^2(x)) An alternative is: tan(x) = sin(x) / cos(x) Which produces more verbose bitcode and longer assembly. Either way, the generated bitcode seems pretty nasty and a more optimized but still precise-enough solution is welcome. Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 217511
* math: Add asin implementationAaron Watry2014-09-101-0/+1
| | | | | | | | | | | | | | asin(x) = atan2(x, sqrt( 1-x^2 )) alternatively: asin(x) = PI/2 - acos(x) Use the atan2 implementation since it produces slightly shorter bitcode and R600 machine code. Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 217510
* math: Add acos implementationAaron Watry2014-09-101-0/+1
| | | | | | | | | | Passes the tests that were submitted to the piglit list Tested on R600 (Pitcairn) Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 217509
* add isordered builtinJan Vesely2014-09-051-0/+1
| | | | | | Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Aaron Watry <awatry@gmail.com> llvm-svn: 217247
* add isunordered builtinJan Vesely2014-09-051-0/+1
| | | | | | | | v2: remove trailing newline Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Aaron Watry <awatry@gmail.com> llvm-svn: 217246
* add islessgreater builtinJan Vesely2014-09-051-0/+1
| | | | | | | | v2: remove trailing newline Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Aaron Watry <awatry@gmail.com> llvm-svn: 217245
* add isnormal builtinJan Vesely2014-09-051-0/+1
| | | | | | | | | v2: simplify and remove isnan leftovers remove trailing newline Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Aaron Watry <awatry@gmail.com> llvm-svn: 217244
* add isfinite builtinJan Vesely2014-09-051-0/+1
| | | | | | | | | v2: simplify and remove isinf leftovers remove trailing newline Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Aaron Watry <awatry@gmail.com> llvm-svn: 217243
* Implement isinf builtinTom Stellard2014-09-031-0/+1
| | | | llvm-svn: 217046
* Fix implementation of copysignTom Stellard2014-09-031-0/+1
| | | | | | | | | This was previously implemented with a macro and we were using __builtin_copysign(), which takes double inputs for the float version of copysign(). Reviewed-and-Tested-by: Aaron Watry <awatry@gmail.com> llvm-svn: 217045
* Implement generic mad_satJan Vesely2014-09-021-0/+1
| | | | | | | | | | | | v2: Fix trailing whitespace Fix signed long overflow improve comment v3: fix typo Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Tom Stellard <tom@stellard.net> llvm-svn: 216923
* Revert "Implement generic mad_sat"Aaron Watry2014-08-231-1/+0
| | | | | | | | This reverts commit cf62eded8b623a1c10d3692d25e5882b7939f564. I didn't mean to commit this... Jan has a v3 incoming llvm-svn: 216322
* Implement generic mad_satAaron Watry2014-08-231-0/+1
| | | | | | | | | v2: Fix trailing whitespace Fix signed long overflow improve comment Signed-off-by: Jan Vesely <jan.vesely at rutgers.edu> llvm-svn: 216320
* Implement prefetch builtinTom Stellard2014-08-201-0/+1
| | | | | | | The default implementation is a no-op. Targets should override this with their own implementations. llvm-svn: 216127
* vload/vstore: Use casts instead of scalarizing everything in CLC versionAaron Watry2014-08-201-2/+0
| | | | | | | | | | | | | | | This generates bitcode which is indistinguishable from what was hand-written for int32 types in v[load|store]_impl.ll. v4: Use vec2+scalar for vec3 load/stores to prevent corruption (per Tom) v3: Also remove unused generic/lib/shared/v[load|store]_impl.ll v2: (Per Matt Arsenault) Fix alignment issues with vector load stores Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com> CC: Matt Arsenault <Matthew.Arsenault@amd.com> CC: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 216069
* relational: Add islessequal(floatN) builtinJan Vesely2014-08-011-0/+1
| | | | | | | | v2: remove the initial undef Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Aaron Watry <awatry@gmail.com> llvm-svn: 214568
* relational: Add isless(floatN) builtinJan Vesely2014-08-011-0/+1
| | | | | | | | v2: remove the initial undef Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Aaron Watry <awatry@gmail.com> llvm-svn: 214567
* Implement sin builtin for float typesTom Stellard2014-07-231-0/+1
| | | | | | This double version still uses @llvm.sin. llvm-svn: 213762
* Implement cos builtin for float typesTom Stellard2014-07-231-0/+2
| | | | | | The double version still uses @llvm.cos. llvm-svn: 213761
* Implement atan2 builtinTom Stellard2014-07-231-0/+1
| | | | llvm-svn: 213760
* Implement atan builtinTom Stellard2014-07-231-0/+1
| | | | llvm-svn: 213759
* relational: Implement isnotequalAaron Watry2014-07-171-0/+1
| | | | | | | | v2: Use relational macros instead of hand-rolled ones Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 213320
* relational: Implement isgreaterequalAaron Watry2014-07-171-0/+1
| | | | | | | | v2: Use relational macros instead of hand-rolled macros Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 213319
* relational: Implement isgreaterAaron Watry2014-07-171-0/+1
| | | | | | | | v2: Use relational macros instead of hand-rolled macros Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 213318
* relational: Implement signbitAaron Watry2014-06-251-0/+1
| | | | | | | | | | | v2 Changes: - use __builtin_signbit instead of shifting by hand - significantly improve vector shuffling - Works correctly now for signbit(float16) on radeonsi Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 211696
* Add exp10Jeroen Ketema2014-06-251-0/+1
| | | | | Reviewed-by: Tom Stellard <tom@stellard.net> llvm-svn: 211680
* Add pownJeroen Ketema2014-06-181-0/+1
| | | | | Reviewed-by: Tom Stellard <tom@stellard.net> llvm-svn: 211211
* math: Implement mix builtinAaron Watry2014-06-161-0/+1
| | | | | | Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Tom Stellard <tom@stellard.net> llvm-svn: 211047
* relational: Add isequal(floatN) builtinAaron Watry2014-06-161-0/+1
| | | | | | Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Tom Stellard <tom@stellard.net> llvm-svn: 211046
* Add all(igentype) builtinAaron Watry2014-06-161-0/+1
| | | | | | Signed-off-by: Aaron Watry <awatry@gmail.com> Reviewed-by: Tom Stellard <tom@stellard.net> llvm-svn: 211045
* Implementations for exp(float) and exp(double) v2Jeroen Ketema2014-06-131-0/+1
| | | | | | | | | | | | Use separate implementations instead of a macro to ensure the constant multiplied with is of higher precision. v2: Use the correct formula, spotted by Dan Liew <daniel.liew@imperial.ac.uk> Reviewed-by: Aaron Warty <awatry@gmail.com> Reviewed-by: Tom Stellard <tom@stellard.net> llvm-svn: 210891
OpenPOWER on IntegriCloud