teak-llvm

mirror of https://github.com/Gericom/teak-llvm.git synced 2025-06-24 22:08:57 -04:00

Author	SHA1	Message	Date
Alexey Bataev	a4fa0b880a	[OPENMP] General code improvements. llvm-svn: 330140	2018-04-16 17:59:34 +00:00
Alexey Bataev	b7f3cba84c	[OPENMP, NVPTX] Emit correct thread id. We emitted fake thread id for the outined function in NVPTX codegen. Patch adds emission of the real thread id. llvm-svn: 327867	2018-03-19 17:04:07 +00:00
Alexey Bataev	c99042ba97	[OPENMP, NVPTX] Improve globalization of the variables captured by value. If the variable is captured by value and the corresponding parameter in the outlined function escapes its declaration context, this parameter must be globalized. To globalize it we need to get the address of the original parameter, load the value, store it to the global address and use this global address instead of the original. Patch improves globalization for parallel\|teams regions + functions in declare target regions. llvm-svn: 327654	2018-03-15 18:10:54 +00:00
Samuel Antao	1168d63cf9	[OpenMP] Use fopenmp prefix for all options introduced by the offloading implementation. Summary: This patch changes the options used by offloading to start with -fopenmp instead of -fomp. This makes the option naming more consistent and materializes a suggestion by Richard Smith in http://reviews.llvm.org/D9888. Reviewers: hfinkel, carlo.bertolli, arpith-jacob, ABataev Subscribers: kkwli0, cfe-commits, caomhin Differential Revision: http://reviews.llvm.org/D21841 llvm-svn: 274283	2016-06-30 21:22:08 +00:00
Alexey Bataev	7ace49dff1	[OPENMP] Pass scalar firstprivate vars by value. For better performance and to unify code with offloading part we pass scalar firstprivate values by value, instead of by reference. It will remove some extra copying operations. llvm-svn: 269751	2016-05-17 08:55:33 +00:00
Carlo Bertolli	c687225b43	[OPENMP] Codegen for teams directive for NVPTX This patch implements the teams directive for the NVPTX backend. It is different from the host code generation path as it: Does not call kmpc_fork_teams. All necessary teams and threads are started upon touching the target region, when launching a CUDA kernel, and their execution is coordinated through sequential and parallel regions within the target region. Does not call kmpc_push_num_teams even if a num_teams of thread_limit clause is present. Setting the number of teams and the thread limit is implemented by the nvptx-related runtime. Please note that I am now passing a Clang Expr * to emitPushNumTeams instead of the originally chosen llvm::Value * type. The reason for that is that I want to avoid emitting expressions for num_teams and thread_limit if they are not needed in the target region. http://reviews.llvm.org/D17963 llvm-svn: 265304	2016-04-04 15:55:02 +00:00

6 Commits