path: root/libclc/amdgcn/lib
Commit message | Author | Age | Files | Lines
* amdgcn: Consolidate atomic minmax helpers | Jan Vesely | 2018-11-27 | 11 files | -57/+4
    Removes most overrides
    Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
    Reviewer: Aaron Watry
    llvm-svn: 347665
* amdgcn: Move __clc_amdgcn_s_waitcnt definition to clc file | Jan Vesely | 2018-11-04 | 4 files | -15/+1
    Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
    Reviewed-By: Aaron Watry <awatry@gmail.com>
    llvm-svn: 346082
* amdgcn: Convert get_num_groups to clc | Jan Vesely | 2018-11-04 | 13 files | -75/+16
    Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
    Reviewed-By: Aaron Watry <awatry@gmail.com>
    llvm-svn: 346081
* amdgcn: Convert get_global_size to clc | Jan Vesely | 2018-11-04 | 13 files | -75/+16
    Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
    Reviewed-By: Aaron Watry <awatry@gmail.com>
    llvm-svn: 346080
* amdgcn: Convert get_local_size to clc | Jan Vesely | 2018-11-04 | 13 files | -75/+16
    Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
    Reviewed-By: Aaron Watry <awatry@gmail.com>
    llvm-svn: 346079
* amdgcn: Use __constant AS for amdgcn builtins. | Jan Vesely | 2018-08-03 | 2 files | -2/+6
    Fixes build after clang r338707.
    Reviewer: Matthew.Arsenault@amd.com
    Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
    llvm-svn: 338898
* Add initial support for half precision builtins | Jan Vesely | 2018-05-17 | 2 files | -0/+30
    v2: fix fmax implementation
        use consistent checks for __CLC_FP_SIZE
        add missing TODOs
        fix whitespace in definitions.h
    v3: undef ZERO in modf.inc
    Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
    reviewer: Jeroen Ketema <j.ketema@xs4all.nl>
    Reviewed-by: Aaron Watry <awatry@gmail.com>
    Tested-by: Aaron Watry <awatry@gmail.com>
    llvm-svn: 332677
* amdgcn/fmin: Fix typos that reduced precision | Jan Vesely | 2018-04-17 | 1 file | -3/+3
    Not sure how these sneaked in.
    Fixes fminD and few other tests(fractD, cosD) on carrizo
    Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
    Reviewed-by: Aaron Watry <awatry@gmail.com>
    llvm-svn: 330198
* amdgcn: Update datalayout after LLVM r328656 | Jan Vesely | 2018-04-05 | 4 files | -4/+4
    Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
    Reviewed-by: Aaron Watry <awatry@gmail.com>
    llvm-svn: 329290
* amdgcn/fmax: fcanonicalize operands | Jan Vesely | 2018-03-08 | 2 files | -0/+32
    v_max instruction needs canonicalized operands.
    Passes CTS on carrizo
    Reviewer: Aaron Watry <awatry@gmail.com>
    Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
    llvm-svn: 327076
* amdgcn/fmin: fcanonicalize operands | Jan Vesely | 2018-03-08 | 2 files | -0/+32
    v_min instruction needs canonicalized operands.
    Passes CTS on carrizo
    Reviewer: Aaron Watry <awatry@gmail.com>
    Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
    llvm-svn: 327075
* amdgcn,popcount: Workaround broken llvm.ctpop intrinsic on some GCN ASICs | Jan Vesely | 2018-03-08 | 3 files | -0/+24
    This is only really needed for VI+ ASICs. However, llvm would cast the
    value to i32 for older asics anyway.
    The proper fix is in LLVM-7 (r326535).
    Fixes CTS popcount on carrizo.
    Reviewer: Aaron Watry <awatry@gmail.com>
    Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
    llvm-svn: 327044
* amdgcn: Fix build after GDS/const AS swap in r325030 | Jan Vesely | 2018-02-23 | 6 files | -10/+20
    Acked-by: Aaron Watry <awatry@gmail.com>
    Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
    llvm-svn: 325866
* amdgcn: Fix datalayout after addition of 32bit const AS in r324747 | Jan Vesely | 2018-02-23 | 4 files | -4/+4
    Acked-by: Aaron Watry <awatry@gmail.com>
    Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
    llvm-svn: 325865
* amdgcn: Fix datalayout after clang r324101 | Jan Vesely | 2018-02-23 | 17 files | -5/+150
    r324101 switched around AS numbering
    Acked-by: Aaron Watry <awatry@gmail.com>
    Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
    llvm-svn: 325863
* amdgcn: Add missing datalayout info to .ll files | Jan Vesely | 2017-10-20 | 7 files | -0/+14
    Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
    Acked-by: Aaron Watry <awatry@gmail.com>
    llvm-svn: 316239
* Let get_work_dim take exactly 0 arguments | Jeroen Ketema | 2017-10-01 | 1 file | -1/+1
    Reviewed-by: Jan Vesely <jan.vesely@rutgers.edu>
    llvm-svn: 314634
* Restore support for llvm-3.9 | Jan Vesely | 2017-09-29 | 5 files | -0/+60
    Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
    Acked-by: Aaron Watry <awatry@gmail.com>
    llvm-svn: 314543
* Implement cl_khr_int64_extended_atomics builtins | Jan Vesely | 2017-09-20 | 2 files | -0/+48
    Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
    Reviewed-by: Aaron Watry <awatry@gmail.com>
    Tested-by: Aaron Watry <awatry@gmail.com>
    llvm-svn: 313811
* amdgcn,waitcnt: Add datalayout info | Jan Vesely | 2017-09-04 | 1 file | -0/+2
    This file is only compiled for GCN which all share the same layout
    Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
    Reviewed-by: Aaron Watry <awatry@gmail.com>
    llvm-svn: 312493
* amdgcn: rewrite barrier() using fence and clang __builtin_amdgcn_s_barrier | Jan Vesely | 2017-08-16 | 3 files | -33/+8
    Specs require using fences when barrier() is invoked:
    "The barrier function will either flush any variables stored in local
    memory or queue a memory fence to ensure correct ordering of memory
    operations to local memory."
    and
    "The barrier function will queue a memory fence to ensure correct
    ordering of memory operations to global memory."
    Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
    Reviewed-by: Aaron Watry <awatry@gmail.com>
    Tested-by: Aaron Watry <awatry@gmail.com>
    llvm-svn: 311022
* amdgcn: Implement {read_,write_,}mem_fence builtin | Jan Vesely | 2017-08-16 | 3 files | -0/+52
    v2: add more detailed comment about waitcnt instruction
    Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
    Reviewed-by: Aaron Watry <awatry@gmail.com>
    Tested-by: Aaron Watry <awatry@gmail.com>
    llvm-svn: 311021
* amdgcn: Fix return type of get_num_groups | Matt Arsenault | 2016-08-25 | 2 files | -0/+22
    llvm-svn: 279723
* amdgcn: Fix return type for get_global_size | Matt Arsenault | 2016-08-24 | 2 files | -0/+22
    llvm-svn: 279644
* amdgpu: Fix default case value for get_local_size | Matt Arsenault | 2016-08-20 | 1 file | -1/+1
    llvm-svn: 279359
* amdgcn: Fix get_local_size IR return type | Matt Arsenault | 2016-08-20 | 2 files | -0/+22
    llvm-svn: 279350
* amdgcn: Correct return types to be size_t | Matt Arsenault | 2016-08-19 | 3 files | -3/+3
    llvm-svn: 279343
* AMDGPU: Implement get_global_offset builtin | Jan Vesely | 2016-07-22 | 2 files | -0/+12
    Also fix get_global_id to consider offset
    No idea how to add this for ptx, so they are stuck with the old
    get_global_id implementation.
    v2: split to a separate patch
    v3: Switch R600 to use implictarg.ptr
    Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
    llvm-svn: 276443
* AMDGPU: Use clang intrinsics for workitem builtins | Jan Vesely | 2016-07-22 | 6 files | -62/+34
    v2: split into 2 patches
        use clang builtins for other intrinsics as well
    v3: Fix warnings
        Switch r600 to use implictarg.ptr
    Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
    llvm-svn: 276442
* Replace llvm.AMDGPU.ldexp with llvm.amdgcn.ldexp | Matt Arsenault | 2016-07-18 | 2 files | -0/+48
    It didn't really work on r600 to begin with, which should get its
    own intrinsic.
    llvm-svn: 275813
* amdgcn: Use new workitem intrinsics | Matt Arsenault | 2016-02-17 | 3 files | -0/+62
    llvm-svn: 261042
* Split sources for amdgcn and r600 | Matt Arsenault | 2016-02-13 | 3 files | -0/+33
    Most files remain in a common amdgpu directory.
    Also switches barriers to use convergent, and use llvm.amdgcn.s.barrier.
    This now requires 3.9/trunk to build amdgcn.
    llvm-svn: 260777