| Commit message | Author | Age | Files | Lines |
... | |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch reorganizes the loop scheduling code in order to allow hierarchical
scheduling to use it more effectively. In particular, the goal of this patch
is to separate the algorithmic parts of the scheduling from the thread
logistics code.
Moves declarations & structures to kmp_dispatch.h for easier access in
other files. Extracts the algorithmic part of __kmp_dispatch_init() and
__kmp_dispatch_next() into __kmp_dispatch_init_algorithm() and
__kmp_dispatch_next_algorithm(). The thread bookkeeping logic is still kept in
__kmp_dispatch_init() and __kmp_dispatch_next(). This is done because the
hierarchical scheduler needs to access the scheduling logic without the
bookkeeping logic. To prepare for a new pointer in dispatch_private_info_t, a
new flags variable is created which stores the ordered and nomerge flags instead
of keeping them in two separate variables. This keeps the
dispatch_private_info_t structure the same size.
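The flag packing described above can be sketched as follows. This is only an illustration under assumptions: the struct and field names are hypothetical and do not reproduce the real dispatch_private_info_t layout, and the size checks rely on the bit-field layout used by mainstream compilers.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical "before" layout: each flag has its own 32-bit variable.
struct InfoSeparateFlags {
  std::int32_t ordered;
  std::int32_t nomerge;
};

// Hypothetical "after" layout: both flags share one 32-bit word.
struct PackedFlags {
  std::uint32_t ordered : 1;
  std::uint32_t nomerge : 1;
  std::uint32_t unused : 30;
};

struct InfoPackedFlags {
  PackedFlags flags;
  std::int32_t spare; // stands in for whatever reuses the freed 32-bit slot
};

// Assumption: the compiler stores the three bit-fields in a single word.
static_assert(sizeof(PackedFlags) == sizeof(std::uint32_t),
              "flags fit in one word");
static_assert(sizeof(InfoPackedFlags) == sizeof(InfoSeparateFlags),
              "overall size unchanged");

// Build a zero-initialized flags word and set the two bits.
inline PackedFlags make_flags(bool ordered, bool nomerge) {
  PackedFlags f{};
  f.ordered = ordered;
  f.nomerge = nomerge;
  return f;
}
```

Merging the two flags frees one 32-bit slot in this toy layout, so other members can be added without changing the total size.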
Differential Revision: https://reviews.llvm.org/D47961
llvm-svn: 336568
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
These are preliminary changes that attempt to use C++11 Atomics in the runtime.
We are expecting better portability with this change across architectures/OSes.
Here is the summary of the changes.
Most variables that need synchronization operations were converted to generic
atomic variables (std::atomic<T>). Variables that are updated with combined CAS
are packed into a single atomic variable, and partial read/write is done
through unpacking/packing.
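A minimal sketch of the pack/unpack approach, assuming two 32-bit fields that must change together. The type and function names are hypothetical and are not taken from the runtime sources.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical pair of 32-bit fields that must be updated as one unit.
struct Pair {
  std::uint32_t lo;
  std::uint32_t hi;
};

inline std::uint64_t pack(Pair p) {
  return (static_cast<std::uint64_t>(p.hi) << 32) | p.lo;
}

inline Pair unpack(std::uint64_t v) {
  return Pair{static_cast<std::uint32_t>(v & 0xFFFFFFFFu),
              static_cast<std::uint32_t>(v >> 32)};
}

// Partial read: load the whole word atomically, then extract one field.
inline std::uint32_t read_hi(const std::atomic<std::uint64_t> &word) {
  return unpack(word.load(std::memory_order_acquire)).hi;
}

// Partial write: replace only `lo`, retrying the CAS until no other thread
// has modified the word between our load and our store.
inline void write_lo(std::atomic<std::uint64_t> &word, std::uint32_t new_lo) {
  std::uint64_t expected = word.load(std::memory_order_relaxed);
  for (;;) {
    Pair p = unpack(expected);
    p.lo = new_lo;
    if (word.compare_exchange_weak(expected, pack(p),
                                   std::memory_order_acq_rel,
                                   std::memory_order_relaxed))
      return;
    // On failure `expected` holds the current value; loop and retry.
  }
}
```

Because both fields live in one atomic word, a reader can never observe a new `lo` paired with a stale `hi`.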
Patch by Hansang Bae
Differential Revision: https://reviews.llvm.org/D47903
llvm-svn: 336563
|
|
|
|
|
|
|
|
|
| |
ompt/misc/api_calls_from_other_thread.cpp
ompt/misc/interoperability.cpp
Differential Revision: https://reviews.llvm.org/D48984
llvm-svn: 336438
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The flag "--no-as-needed" is not recognized by the linker on macOS making the following tests fail:
ompt/loadtool/tool_available/tool_available.c
ompt/loadtool/tool_not_available/tool_not_available.c
This patch removes this flag for macOS and adds it only for Linux and Windows.
I tested it on Ubuntu 16.04 and macOS High Sierra, with Clang/LLVM 6.0.1 and OpenMP trunk.
This solution was also discussed on the OpenMP-dev mailing list.
Patch provided by Simone Atzeni
Differential Revision: https://reviews.llvm.org/D48888
llvm-svn: 336327
|
|
|
|
|
|
|
|
|
|
|
| |
The testcase potentially fails when a thread is reused.
The added synchronization makes sure this does not happen.
Patch provided by Simon Convent
Differential Revision: https://reviews.llvm.org/D48932
llvm-svn: 336326
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When compiling with icc, there is a problem with reenter frame addresses in
parallel_begin callbacks in the interoperability.c testcase. (The address is
not available and is thus NULL.)
Using alloca() forces availability of the frame pointer.
Patch provided by Simon Convent
Differential Revision: https://reviews.llvm.org/D48282
llvm-svn: 336088
|
|
|
|
|
|
|
|
|
|
|
| |
Several runtime entry points have not been tested from non-OpenMP threads. This
adds tests to an existing testcase. While at it, the testcase was reformatted.
Patch provided by Simon Convent
Differential Revision: https://reviews.llvm.org/D48124
llvm-svn: 336087
|
|
|
|
|
|
|
|
|
|
|
| |
In particular, the thread_end callback had not been tested before.
This adds a testcase for nested and non-nested threads.
Patch provided by Simon Convent
Differential Revision: https://reviews.llvm.org/D47824
llvm-svn: 336086
|
|
|
|
|
|
|
|
|
| |
The current implementation always provides the thread-num for the current
parallel region. This patch fixes the behavior for ancestor levels >0.
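The intended lookup can be pictured as a walk up a chain of enclosing regions. The sketch below is only a model: `TeamLevel`, its fields, and the -1 result for a missing level are assumptions made for illustration, not the runtime's data structures.

```cpp
#include <cassert>

// Hypothetical record of one nesting level: the calling thread's number in
// that region and a link to the enclosing region (nullptr at the outermost).
struct TeamLevel {
  int thread_num;
  const TeamLevel *parent;
};

// Level 0 is the current region; level n is its n-th ancestor.
// Returns -1 (an assumption of this sketch) if the level does not exist.
inline int thread_num_at_ancestor(const TeamLevel *current, int ancestor_level) {
  while (current != nullptr && ancestor_level > 0) {
    current = current->parent;
    --ancestor_level;
  }
  return current != nullptr ? current->thread_num : -1;
}
```

A lookup that ignores `ancestor_level` would always return the innermost thread number, which is the behavior the commit message describes as incorrect.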
Differential Revision: https://reviews.llvm.org/D46533
llvm-svn: 336085
|
|
|
|
| |
llvm-svn: 335138
|
|
|
|
| |
llvm-svn: 334335
|
|
|
|
|
|
|
|
| |
Rename ompt_wait_id to omp_wait_id, as defined in the spec.
Differential Revision: https://reviews.llvm.org/D46530
llvm-svn: 333368
|
|
|
|
|
|
|
|
| |
Rename ompt_frame_t to omp_frame_t, as defined in the spec.
Differential Revision: https://reviews.llvm.org/D43568
llvm-svn: 333367
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Upcoming changes to FileCheck will modify CHECK-DAG to not match
overlapping regions of the input. This test was found to be affected
because it expects to find four threads to invoke events of type
ompt_event_implicit_task_begin. It turns out this is wrong because
OMP_THREAD_LIMIT is set to 2, so there are only two threads. The
rest of the test got it right so it went unnoticed until now.
(Rewrite test and apply clang-format to it as discussed in the past.)
Differential Revision: https://reviews.llvm.org/D47119
llvm-svn: 333361
|
|
|
|
|
|
|
|
|
|
| |
Introduce OPENMP_INSTALL_LIBDIR and use in all install() commands.
This also fixes installation of libomptarget-nvptx that previously
didn't honor {OPENMP,LLVM}_LIBDIR_SUFFIX.
Differential Revision: https://reviews.llvm.org/D47130
llvm-svn: 333284
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
implicit_task_end callbacks in nested parallel regions did not always give the
correct thread_num, since the inner parallel region may have already been
finalized.
Now, the thread_num is stored at the beginning of the implicit task and
retrieved at the end, whenever necessary.
A testcase was added as well.
Differential Revision: https://reviews.llvm.org/D46260
llvm-svn: 331632
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The api_calls_misc.c testcase tests the following api calls:
ompt_get_callback()
ompt_get_state()
ompt_enumerate_states()
ompt_enumerate_mutex_impls()
These have not been tested previously.
The api_calls.c testcase has been renamed to api_calls_places.c because it only tests api calls that are related to places.
Differential Revision: https://reviews.llvm.org/D42523
llvm-svn: 331631
|
|
|
|
|
|
|
| |
Removed environment setting in RUN: line that was being ignored anyway.
Changed a few specific checks to "any number".
llvm-svn: 331212
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This line
(https://github.com/llvm-mirror/openmp/blob/0ed912c7a798f5c4f65f8bb6b492e07fab7f4cea/runtime/src/kmp_gsupport.cpp#L1459)
added in D45327 (rL330282) causes a compilation failure.
Reviewers: jlpeyton
Subscribers: guansong, openmp-commits
Differential Revision: https://reviews.llvm.org/D45786
llvm-svn: 330299
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently, the affinity API reports garbage for the initial place list and any
thread's place lists when using KMP_AFFINITY=none|compact|scatter.
This patch does two things:
1) For KMP_AFFINITY=none, it creates a one-entry table for the places, so that
the initial place list is just a single place with all the proc ids in it. We
also set the initial place of any thread to 0 instead of KMP_PLACE_ALL so that
the thread reports that single place (place 0) instead of garbage (-1) when
using the affinity API.
2) When non-OMP_PROC_BIND affinity is used
(including KMP_AFFINITY=compact|scatter), a thread's place list is populated
correctly. We assume that each thread is assigned to a single place. This is
implemented in two of the affinity API functions.
Differential Revision: https://reviews.llvm.org/D45527
llvm-svn: 330283
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch introduces GOMP_taskloop to our API. It adds GOMP_4.5 to our
version symbols. Being a wrapper around __kmpc_taskloop, the function
creates a task with the loop bounds properly nested in the shareds so that
the GOMP task thunk will work properly. Also, the firstprivate copy constructors
are properly handled using the __kmp_gomp_task_dup() auxiliary function.
Currently, only linear spawning of tasks is supported
for the GOMP_taskloop interface.
Differential Revision: https://reviews.llvm.org/D45327
llvm-svn: 330282
|
|
|
|
| |
llvm-svn: 329928
|
|
|
|
|
|
|
|
|
|
|
|
| |
This change removes the unnecessary lock operation on __kmp_initz_lock inside
the __kmp_atfork_child() function for Linux; the lock variable is initialized
in the same function later.
Patch by Hansang Bae
Differential Revision: https://reviews.llvm.org/D44949
llvm-svn: 328900
|
|
|
|
| |
llvm-svn: 328575
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The summarizeStats.py script processes raw data provided by the
instrumented (stats-gathering) OpenMP* runtime library. It provides:
1) A radar chart which plots counters as frequency (per GigaTick) of use within
the program. The frequencies are plotted as log10; however, values less than
one are kept as they are and shown in red. This was done to help
visualize the differences better.
2) Pie charts separating total time as compute and non-compute. The compute and
non-compute times have their own pie charts showing the constructs that
contributed to them. The percentages listed are with respect to the total
time.
3) A '.csv' file with the percentage of time spent within the different constructs.
The script can be used as:
$ python $PATH_TO_SCRIPT/summarizeStats.py instrumented1.csv instrumented2.csv
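The scaling rule used for the radar chart can be written down in a few lines. The sketch below is not code from summarizeStats.py; the `RadarPoint` type and the function name are hypothetical and only restate the rule from item 1.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical result type: the plotted value and whether it is drawn in red.
struct RadarPoint {
  double value;
  bool highlight_red;
};

// Frequencies below one are plotted unchanged and flagged for red;
// all other frequencies are plotted as log10(frequency).
inline RadarPoint scale_for_radar(double frequency) {
  if (frequency < 1.0)
    return RadarPoint{frequency, true};
  return RadarPoint{std::log10(frequency), false};
}
```

Without the special case, frequencies below one would map to negative log10 values, which a radar chart cannot show well.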
Patch by Taru Doodi
Differential Revision: https://reviews.llvm.org/D41838
llvm-svn: 328568
|
|
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D44793
llvm-svn: 328228
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Added settings code to read the OMP_TARGET_OFFLOAD environment variable. Added
target-offload-var ICV as __kmp_target_offload, set via OMP_TARGET_OFFLOAD,
if available, otherwise defaulting to DEFAULT. Valid values for the ICV are
specified as enum values {0,1,2} for disabled, default, and mandatory. An
internal API access function __kmpc_get_target_offload is provided.
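A rough sketch of how such a setting might be parsed. The enum values 0, 1 and 2 follow the description above; the identifier names, the case-insensitive matching, and the silent fallback to the default for unrecognized strings are assumptions of this sketch, not a description of the runtime's settings code.

```cpp
#include <cassert>
#include <cctype>
#include <cstdlib>
#include <string>

// Values as listed in the commit message: disabled, default, mandatory.
enum TargetOffloadKind { tgt_disabled = 0, tgt_default = 1, tgt_mandatory = 2 };

// Hypothetical parser: maps the text of OMP_TARGET_OFFLOAD to an enum value.
// A null pointer (variable unset) or an unrecognized string yields tgt_default.
inline TargetOffloadKind parse_target_offload(const char *value) {
  if (value == nullptr)
    return tgt_default;
  std::string s(value);
  for (char &c : s)
    c = static_cast<char>(std::toupper(static_cast<unsigned char>(c)));
  if (s == "DISABLED")
    return tgt_disabled;
  if (s == "MANDATORY")
    return tgt_mandatory;
  return tgt_default;
}

// Reads the environment variable; std::getenv returns nullptr when unset.
inline TargetOffloadKind read_target_offload_from_env() {
  return parse_target_offload(std::getenv("OMP_TARGET_OFFLOAD"));
}
```

Separating the string parser from the environment lookup keeps the mapping easy to test without touching the process environment.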
Patch by Terry Wilmarth
Differential Revision: https://reviews.llvm.org/D44577
llvm-svn: 328046
|
|
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D44637
llvm-svn: 327875
|
|
|
|
|
|
|
|
| |
Patch by Terry Wilmarth
Differential Revision: https://reviews.llvm.org/D41914
llvm-svn: 326733
|
|
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D44019
llvm-svn: 326728
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We have to ensure that the runtime is initialized _before_ waiting
for the two started threads to guarantee that the master threads
post their ompt_event_thread_begin before the worker threads. This
is not guaranteed in the parallel region where one worker thread
could start before the other master thread has invoked the callback.
The problem did not happen with Clang because the generated code
calls __kmpc_global_thread_num() and caches its result for functions
that contain OpenMP pragmas.
Differential Revision: https://reviews.llvm.org/D43882
llvm-svn: 326435
|
|
|
|
|
|
|
|
|
|
| |
This is similar to D43882. The runtime needs to be initialized before calling print_ids(0).
http://lab.llvm.org:8011/builders/openmp-gcc-x86_64-linux-debian/builds/60
Differential Revision: https://reviews.llvm.org/D43897
llvm-svn: 326428
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The thread_num parameter of ompt_get_task_info() was not being used previously,
but needs to be set.
The print_task_type() function (from the task-types.c testcase) was merged into
the print_ids() function (in callback.h). Testing of ompt_get_task_info() was
added to the task-types.c testcase. It was not tested extensively previously.
Differential Revision: https://reviews.llvm.org/D42472
llvm-svn: 326338
|
|
|
|
|
|
|
|
|
|
|
| |
The main change of this patch is to insert {{.*}} in current_address=[[RETURN_ADDRESS_END]].
This is needed to match any of the alternatively printed addresses.
Additionally, clang-format is applied to the two tests.
Differential Revision: https://reviews.llvm.org/D43115
llvm-svn: 326312
|
|
|
|
|
|
|
|
|
| |
This is required to be NULL for implicit barriers at the end of a
parallel region. Noticed in review of D43191.
Differential Revision: https://reviews.llvm.org/D43308
llvm-svn: 325922
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The compiler inlines the user code in the task. Check for that case at
runtime by comparing the frame addresses and print the expected exit
address.
Also showcase how I think the OMPT tests could be reformatted to match
LLVM's code style. In my opinion it would be great to apply that kind of change
to all tests that need to be touched for whatever reason.
Differential Revision: https://reviews.llvm.org/D43191
llvm-svn: 325921
|
|
|
|
|
|
|
|
| |
Applying clang-format to the /runtime/src/ folder
Differential Revision: https://reviews.llvm.org/D42169
llvm-svn: 325424
|
|
|
|
|
|
|
|
| |
Test whether OMPT-callbacks for two threads that initiate a parallel region are correct.
Differential Revision: https://reviews.llvm.org/D41942
llvm-svn: 325423
|
|
|
|
|
|
|
|
|
|
| |
Only use ompt_ functions when testing OMPT in api_calls testcase.
Add size parameter to print_list.
Fix small bug in implementation of ompt_get_partition_place_nums(): return correct length.
Differential Revision: https://reviews.llvm.org/D42162
llvm-svn: 325422
|
|
|
|
|
|
|
|
|
| |
This affects all outlined functions, not just tasks! Only show warning
when using Clang 5.0 or later.
Differential Revision: https://reviews.llvm.org/D43190
llvm-svn: 325131
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Tests the search for tools as defined in the spec. The OMP_TOOL_LIBRARIES
environment variable contains paths to the following files (in that order):
- a nonexistent file
- a shared library that does not have an ompt_start_tool function
- a shared library that has an ompt_start_tool implementation returning NULL
- a shared library that has an ompt_start_tool implementation returning a
pointer to a valid instance of ompt_start_tool_result_t
The expected result is that the last tool becomes active and can print in the
thread-begin callback.
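The search order exercised by this test can be modeled without loading any real libraries. In the sketch below, `ToolResult` and `Candidate` are hypothetical stand-ins (for ompt_start_tool_result_t and for an entry of OMP_TOOL_LIBRARIES after an attempted load); the actual runtime lookup is not reproduced here.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for ompt_start_tool_result_t.
struct ToolResult {
  int id;
};

// Models one entry of OMP_TOOL_LIBRARIES after an attempted load.
struct Candidate {
  bool loadable;               // false: the file does not exist
  ToolResult *(*start_tool)(); // nullptr: library lacks ompt_start_tool
};

// Walk the candidates in order and return the first non-null result,
// or nullptr if no candidate provides a tool.
inline ToolResult *find_tool(const std::vector<Candidate> &candidates) {
  for (const Candidate &c : candidates) {
    if (!c.loadable)
      continue;
    if (c.start_tool == nullptr)
      continue;
    if (ToolResult *result = c.start_tool())
      return result;
  }
  return nullptr;
}

// Two example entry points: one declines, one provides a tool.
inline ToolResult *returns_null() { return nullptr; }
inline ToolResult *returns_valid() {
  static ToolResult result{42};
  return &result;
}
```

With the four cases from the list in that order, only the last candidate yields a tool, which matches the expected outcome of the test.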
Differential Revision: https://reviews.llvm.org/D42166
llvm-svn: 324588
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add a testcase that checks whether the runtime can handle an ompt_start_tool
function that returns NULL, indicating that no tool shall be loaded.
All tool_available testcases need a separate folder to avoid file conflicts for
the generated tools.
Differential Revision: https://reviews.llvm.org/D41904
llvm-svn: 324587
|
|
|
|
|
|
|
|
|
| |
If tool initialization returns 0, OMPT should not be active. The current
implementation provided some callback invocations in this case.
Differential Revision: https://reviews.llvm.org/D42709
llvm-svn: 324320
|
|
|
|
|
|
|
|
|
|
|
| |
Use fuzzy return addresses in lock testcases so that these
testcases can also be run using the Intel Compiler.
Patch by Simon Convent!
Differential Revision: https://reviews.llvm.org/D41896
llvm-svn: 323529
|
|
|
|
|
|
| |
The previous commit did not revert all replaced ompt_mutex_impl_unknown.
llvm-svn: 322631
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add Workaround for Intel Compiler Bug with Case#: 03138964
A critical region within a nested task causes a segfault in icc 14-18:
#include <stdio.h>

int main()
{
#pragma omp parallel num_threads(2)
#pragma omp master
#pragma omp task
#pragma omp task
#pragma omp critical
  printf("test\n");
  return 0;
}
When the critical region is in a separate function, the segfault does not occur.
So we add noinline to make sure that the function call stays there.
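The shape of the workaround might look like the sketch below. The function names are hypothetical, and `__attribute__((noinline))` is the GCC-compatible spelling, assumed here to be accepted by the compiler in use. When compiled without OpenMP support the pragmas are ignored and the code runs serially.

```cpp
#include <cassert>
#include <cstdio>

// The critical region lives in its own function, and noinline keeps the
// call from being folded back into the nested task body.
__attribute__((noinline)) static int print_in_critical(const char *msg) {
  int printed = 0;
#pragma omp critical
  {
    std::printf("%s\n", msg);
    printed = 1;
  }
  return printed;
}

// Same nesting as the reproducer: parallel, master, task, task.
// `calls` is written by exactly one task and read after the region ends.
static int run_nested_tasks() {
  int calls = 0;
#pragma omp parallel num_threads(2)
#pragma omp master
#pragma omp task shared(calls)
#pragma omp task shared(calls)
  calls += print_in_critical("test");
  return calls;
}
```

Only the master thread creates the nested tasks, so the helper runs once and `run_nested_tasks()` returns 1.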
Differential Revision: https://reviews.llvm.org/D41182
llvm-svn: 322622
|
|
|
|
|
|
|
|
|
|
| |
The definition is not part of the spec and thus should not have the prefix
"ompt_" but rather a prefix that indicates that this is implementation
specific.
Differential Revision: https://reviews.llvm.org/D41166
llvm-svn: 322621
|
|
|
|
|
|
|
|
|
|
|
| |
threads
When the current thread is not an (initialized) OpenMP thread, the runtime
entry points return values that correspond to "not available" or similar.
Differential Revision: https://reviews.llvm.org/D41167
llvm-svn: 322620
|
|
|
|
|
|
|
|
| |
Patch by simone <simone@cs.utah.edu>.
Differential Revision: https://reviews.llvm.org/D41945
llvm-svn: 322282
|
|
|
|
|
|
|
|
|
|
|
|
| |
If the user requests affinity with granularity=tile, we need to either use HWLOC
or ignore the request. This change lets the user omit
KMP_TOPOLOGY_METHOD=hwloc; the runtime then selects hwloc automatically.
Patch by Andrey Churbanov
Differential Revision: https://reviews.llvm.org/D40905
llvm-svn: 322205
|