bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message (Collapse)	Author	Age	Files	Lines
*	[OPENMP][NVPTX]Mark more functions as always_inline for better	Alexey Bataev	2019-05-21	1	-1/+1
	performance. Internally generated functions must be marked as always_inline in most cases. The patch marks some extra reduction functions and outlined parallel functions as always_inline for better performance, but only if optimization is requested. llvm-svn: 361269
*	[OPENMP][NVPTX]Use new functions from the runtime library.	Alexey Bataev	2019-01-04	1	-2/+2
	Updated codegen to use the new functions from the runtime library. llvm-svn: 350415
*	[OPENMP][NVPTX]Use __kmpc_barrier_simple_spmd(nullptr, 0) instead of	Alexey Bataev	2019-01-03	1	-3/+3
	nvvm_barrier0. Use runtime functions instead of direct calls to the nvvm intrinsics. This prevents some dangerous LLVM optimizations that break the code for the NVPTX target. llvm-svn: 350328
*	[OPENMP][NVPTX]Emit shared memory buffer for reduction as 128 bytes	Alexey Bataev	2018-12-18	1	-1/+1
	buffer. It seems that nvlink has a bug in its support for weakly linked symbols: it does not allow defining several shared memory buffers with different sizes, even with weak linkage. Instead, we always use a 128-byte buffer to prevent nvlink from emitting an error message. llvm-svn: 349540
*	[OPENMP][NVPTX]Emit correct reduction code for teams/parallel	Alexey Bataev	2018-11-16	1	-1/+1
	reductions. Fixed the previously committed code for reduction support in teams/parallel constructs, taking into account the new design of NVPTX support in the compiler. Teams reductions are not fully functional yet; this will be fixed in following patches. llvm-svn: 347081
*	[OPENMP][NVPTX]Allow to use shared memory for the	Alexey Bataev	2018-11-09	1	-5/+6
	target|teams|distribute variables. If the total size of the variables declared in target|teams|distribute regions is less than the maximum size of shared memory available, the buffer is allocated in shared memory. llvm-svn: 346507
*	[OPENMP][NVPTX]Improve emission of the globalized variables for	Alexey Bataev	2018-11-02	1	-2/+12
	target/teams/distribute regions. Target/teams/distribute regions exist for the entire time the kernel executes. Thus, if a variable is declared in their context and then escapes it, we can allocate global memory statically instead of allocating it dynamically. The patch captures all the globalized variables in target/teams/distribute contexts and merges them into records, one per target region. Those records are then joined into a union, one per compilation unit (to save global memory). Those units are organized into two-dimensional arrays, where the first dimension is the number of blocks per SM and the second is the number of SMs. Runtime functions manage this global memory space between the executing teams. llvm-svn: 345978
*	[OpenMP][NVPTX] Enable default scheduling for parallel for in non-SPMD cases.	Gheorghe-Teodor Bercea	2018-10-29	1	-2/+24
	Summary: This patch enables choosing the default schedule for parallel for loops even in non-SPMD cases. Reviewers: ABataev, caomhin Reviewed By: ABataev Subscribers: jholewinski, guansong, cfe-commits Differential Revision: https://reviews.llvm.org/D53443 llvm-svn: 345507
*	[NFC][OpenMP] Add new test for parallel for code generation.	Gheorghe-Teodor Bercea	2018-10-26	1	-0/+101
	Summary: This is a simple test of the parallel for code generation. It will be used to showcase the change introduced by patch D53443. Reviewers: ABataev, caomhin Reviewed By: ABataev Subscribers: guansong, cfe-commits Differential Revision: https://reviews.llvm.org/D53772 llvm-svn: 345417
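
Note on llvm-svn 345978: the record/union/array layout described in that commit message can be pictured with the host-side C++ sketch below. It is only an illustration under stated assumptions, not the code Clang emits. All type names, field names, the helper slot_for, and the two sizes (16 blocks per SM, 80 SMs) are hypothetical and chosen for the example.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical records: one per target region, each holding the variables
// that were globalized (escaped their declaring context) in that region.
struct TargetRegion1Record {
  int counter;
  double partial_sum;
};
struct TargetRegion2Record {
  float scratch[8];
};

// One union per compilation unit: regions do not need their records at the
// same time in the same slot, so the storage is shared to save global memory.
union UnitGlobalizedStorage {
  TargetRegion1Record region1;
  TargetRegion2Record region2;
};

// Assumed sizes, picked only for this illustration.
constexpr std::size_t kBlocksPerSM = 16;
constexpr std::size_t kNumSMs = 80;

// Statically allocated two-dimensional array: the first dimension is the
// number of blocks per SM, the second is the number of SMs.
static UnitGlobalizedStorage g_storage[kBlocksPerSM][kNumSMs];

// The union is as large as its largest record, not the sum of all records.
static_assert(sizeof(UnitGlobalizedStorage) >= sizeof(TargetRegion1Record),
              "union must hold region 1 record");
static_assert(sizeof(UnitGlobalizedStorage) >= sizeof(TargetRegion2Record),
              "union must hold region 2 record");

// Hypothetical stand-in for the runtime's slot selection: hands one slot to
// the team identified by (block index within its SM, SM index).
UnitGlobalizedStorage &slot_for(std::size_t block_in_sm, std::size_t sm) {
  assert(block_in_sm < kBlocksPerSM && sm < kNumSMs);
  return g_storage[block_in_sm][sm];
}
```

A team would write through its own slot, for example slot_for(1, 2).region1.counter = 5, with no dynamic allocation involved. How the real runtime assigns slots to teams is not stated in the commit message.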