path: root/libclc/amdgcn/lib/SOURCES
Commit message | Author | Age | Files | Lines
* amdgcn: Convert get_num_groups to clc | Jan Vesely | 2018-11-04 | 1 | -1/+1
  Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
  Reviewed-By: Aaron Watry <awatry@gmail.com>
  llvm-svn: 346081
* amdgcn: Convert get_global_size to clc | Jan Vesely | 2018-11-04 | 1 | -1/+1
  Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
  Reviewed-By: Aaron Watry <awatry@gmail.com>
  llvm-svn: 346080
* amdgcn: Convert get_local_size to clc | Jan Vesely | 2018-11-04 | 1 | -1/+1
  Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
  Reviewed-By: Aaron Watry <awatry@gmail.com>
  llvm-svn: 346079
* amdgcn/fmax: fcanonicalize operands | Jan Vesely | 2018-03-08 | 1 | -0/+1
  v_max instruction needs canonicalized operands.
  Passes CTS on carrizo
  Reviewer: Aaron Watry <awatry@gmail.com>
  Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
  llvm-svn: 327076
* amdgcn/fmin: fcanonicalize operands | Jan Vesely | 2018-03-08 | 1 | -0/+1
  v_min instruction needs canonicalized operands.
  Passes CTS on carrizo
  Reviewer: Aaron Watry <awatry@gmail.com>
  Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
  llvm-svn: 327075
* amdgcn,popcount: Workaround broken llvm.ctpop intrinsic on some GCN ASICs | Jan Vesely | 2018-03-08 | 1 | -0/+1
  This is only really needed for VI+ ASICs. However, llvm would cast the value to i32 for older asics anyway.
  The proper fix is in LLVM-7 (r326535).
  Fixes CTS popcount on carrizo.
  Reviewer: Aaron Watry <awatry@gmail.com>
  Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
  llvm-svn: 327044
* amdgcn: Fix datalayout after clang r324101 | Jan Vesely | 2018-02-23 | 1 | -1/+0
  r324101 switched around AS numbering
  Acked-by: Aaron Watry <awatry@gmail.com>
  Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
  llvm-svn: 325863
* Implement cl_khr_int64_extended_atomics builtins | Jan Vesely | 2017-09-20 | 1 | -0/+1
  Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
  Reviewed-by: Aaron Watry <awatry@gmail.com>
  Tested-by: Aaron Watry <awatry@gmail.com>
  llvm-svn: 313811
* amdgcn: rewrite barrier() using fence and clang __builtin_amdgcn_s_barrier | Jan Vesely | 2017-08-16 | 1 | -1/+1
  Specs require using fences when barrier() is invoked:
  "The barrier function will either flush any variables stored in local memory or queue a memory fence to ensure correct ordering of memory operations to local memory."
  and
  "The barrier function will queue a memory fence to ensure correct ordering of memory operations to global memory."
  Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
  Reviewed-by: Aaron Watry <awatry@gmail.com>
  Tested-by: Aaron Watry <awatry@gmail.com>
  llvm-svn: 311022
* amdgcn: Implement {read_,write_,}mem_fence builtin | Jan Vesely | 2017-08-16 | 1 | -0/+2
  v2: add more detailed comment about waitcnt instruction
  Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
  Reviewed-by: Aaron Watry <awatry@gmail.com>
  Tested-by: Aaron Watry <awatry@gmail.com>
  llvm-svn: 311021
* amdgcn: Fix return type of get_num_groups | Matt Arsenault | 2016-08-25 | 1 | -0/+1
  llvm-svn: 279723
* amdgcn: Fix return type for get_global_size | Matt Arsenault | 2016-08-24 | 1 | -0/+1
  llvm-svn: 279644
* amdgcn: Fix get_local_size IR return type | Matt Arsenault | 2016-08-20 | 1 | -0/+1
  llvm-svn: 279350
* AMDGPU: Implement get_global_offset builtin | Jan Vesely | 2016-07-22 | 1 | -0/+1
  Also fix get_global_id to consider offset
  No idea how to add this for ptx, so they are stuck with the old get_global_id implementation.
  v2: split to a separate patch
  v3: Switch R600 to use implicitarg.ptr
  Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
  llvm-svn: 276443
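  For context on "fix get_global_id to consider offset": the OpenCL specification defines a work-item's global ID, for uniform work-group sizes, as the group ID times the local size, plus the local ID, plus the global offset. The block below is a minimal host-side C sketch of that arithmetic for a single dimension. It is not the libclc implementation; the struct `work_item_dim` and the function `model_global_id` are hypothetical names introduced only for illustration.

```c
#include <stddef.h>

/* Hypothetical host-side model of one ND-range dimension.
 * These fields mirror what get_group_id, get_local_size, get_local_id
 * and get_global_offset would return on the device. */
typedef struct {
    size_t group_id;      /* index of the work-group */
    size_t local_size;    /* work-items per work-group (assumed uniform) */
    size_t local_id;      /* index of the work-item inside its group */
    size_t global_offset; /* offset passed at enqueue time, 0 if none */
} work_item_dim;

/* Global ID including the enqueue offset:
 * group_id * local_size + local_id + global_offset. */
static size_t model_global_id(work_item_dim w)
{
    return w.group_id * w.local_size + w.local_id + w.global_offset;
}
```

  Under these assumptions, a work-item with group ID 2, local size 64, local ID 5 and global offset 100 has global ID 233. An implementation that ignores the offset would report 133 for the same work-item.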
* AMDGPU: Use clang intrinsics for workitem builtins | Jan Vesely | 2016-07-22 | 1 | -2/+3
  v2: split into 2 patches
      use clang builtins for other intrinsics as well
  v3: Fix warnings
      Switch r600 to use implicitarg.ptr
  Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
  llvm-svn: 276442
* Replace llvm.AMDGPU.ldexp with llvm.amdgcn.ldexp | Matt Arsenault | 2016-07-18 | 1 | -0/+1
  It didn't really work on r600 to begin with, which should get its own intrinsic.
  llvm-svn: 275813
* amdgcn: Use new workitem intrinsics | Matt Arsenault | 2016-02-17 | 1 | -0/+2
  llvm-svn: 261042
* Split sources for amdgcn and r600 | Matt Arsenault | 2016-02-13 | 1 | -0/+1
  Most files remain in a common amdgpu directory.
  Also switches barriers to use convergent, and use llvm.amdgcn.s.barrier.
  This now requires 3.9/trunk to build amdgcn.
  llvm-svn: 260777