| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The comment already states, that this function should work similarly as __ompt_get_taskinfo.
The function only looked for lwt entries of the current team, but not when unrolling the parents. This fix aligns the implementation to __ompt_get_taskinfo.
The new test case creates a single theaded team (->lwt) and then a nested active team.
Before the innermost print_id(1) would deliver a different team then the outer print_id(0).
Patch by Joachim Protze!
Differential Revision: https://reviews.llvm.org/D23309
llvm-svn: 281466
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The exit address is set when execution of a task is started and should be reset as soon as the execution is finished.
Especially for the asm implementation of __kmp_invoke_microtask, resetting in this call would be painfull, so reset just after the invokation.
The testcase shows the effect of this patch:
Before, the implicit barriers at the end of an implicit task would see an exit address for the implicit task.
This barrier is a task scheduling point. Thus, any explicit task scheduled there would see an exit, but no reenter address for the implicit task.
Patch by Joachim Protze!
Differential Revision: https://reviews.llvm.org/D23307
llvm-svn: 281465
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
version of OMPT spec
The latest OMPT spec changed the semantic of a tasks reenter frame to be the application frame, that will be entered, when the runtime frame drops.
Before it was the last frame in the runtime. This doesn't work for some gcc execution pathes or even clang generated code for :
Since there is no runtime frame between the executed task and the encountering task.
The test case compares exit and reenter addresses against addresses captured in application code
Patch by Joachim Protze!
Differential Revision: https://reviews.llvm.org/D23305
llvm-svn: 281464
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
OMPT tests can check for right frame information of tasks:
* parent_task_frame was directly printed as a pointer, but actually points to a struct ompt_frame {void*, void*}
* NULL is printed in the beginning of execution and loaded to FileChecker variable [[NULL]]
* implicit tasks now also print their frame information
* macro to print frame address from application
* print task info for barrier begin
Patch by Joachim Protze!
Differential Revision: https://reviews.llvm.org/D23304
llvm-svn: 281463
|
|
|
|
|
|
|
|
|
|
|
|
| |
Rather than checking KMP_CPU_SETSIZE, which doesn't exist when using Hwloc, we
use the get_max_proc() function which can vary based on the operating system.
For example on Windows with multiple processor groups, it might be the case that
the highest bit possible in the bitmask is not equal to the number of hardware
threads on the machine but something higher than that.
Differential Revision: https://reviews.llvm.org/D24206
llvm-svn: 281245
|
|
|
|
|
|
|
|
| |
There is a bug in CMakeLists which causes powerpc64le systems to be recognized as big-endian. This patch fixes the issue.
Differential Revision: https://reviews.llvm.org/D23626
llvm-svn: 281068
|
|
|
|
|
|
|
|
|
| |
Implementation of missing OpenMP 4.0 API functions omp_get_default_device and omp_set_default_device.
Also, added support for the environment variable OMP_DEFAULT_DEVICE.
Differential Revision: https://reviews.llvm.org/D23587
llvm-svn: 281065
|
|
|
|
|
|
|
|
|
|
|
|
| |
When affinity isn't supported, __kmp_affinity_compact doesn't exist. The
problem is that in kmp_affinity.h there is a function which uses it without the
proper KMP_AFFINITY_SUPPORTED guard around it. The compiler was smart enough to
ignore it and the function __kmp_affinity_cmp_Address_child_num which relies on
it, but I think it is cleaner to have it under the proper guard. Since the
function is only used in the kmp_affinity.cpp file and there aren't any plans to
have it elsewhere. I have moved it there.
llvm-svn: 280542
|
|
|
|
|
|
|
| |
the __kmp_affinity_determine_capable() functions are highly operating system
specific. This change has the functions use the type they expect explicitly.
llvm-svn: 280538
|
|
|
|
| |
llvm-svn: 280530
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In case atomic reduction method is not available (the compiler can't generate
it) the assertion failure occurred if KMP_FORCE_REDUCTION=atomic was specified.
This change replaces the assertion with a warning and sets the reduction method
to the default one - 'critical'.
Patch by Olga Malysheva
Differential Revision: https://reviews.llvm.org/D23990
llvm-svn: 280519
|
|
|
|
|
|
|
| |
Older gcc compilers error out with the C99 syntax of: for (int i =...)
so this change just moves the int i; declaration up above.
llvm-svn: 280138
|
|
|
|
|
|
| |
no functional changes done
llvm-svn: 278951
|
|
|
|
|
|
| |
Differential Revision: http://reviews.llvm.org/D23175
llvm-svn: 278332
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
On FreeBSD, linking the misc_bugs/omp_foreign_thread_team_reuse.c test
case fails with:
/usr/local/bin/ld: /tmp/omp_foreign_thread_team_reuse-c5e71b.o: undefined reference to symbol 'pthread_create@@FBSD_1.0'
This is because the program is linked without `-lpthread`. Since the
%libomp-compile-and-run macro does not allow that option to be added to
the compile command line, split it up and add the required `-lpthread`
between %libomp-compile and %libomp-run.
Reviewers: jlpeyton, hfinkel, Hahnfeld
Subscribers: Hahnfeld, emaste, openmp-commits
Differential Revision: https://reviews.llvm.org/D23084
llvm-svn: 278036
|
|
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D23259
llvm-svn: 278003
|
|
|
|
| |
llvm-svn: 277996
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Consider the following code:
int dep;
#pragma omp target nowait depend(out: dep)
{
sleep(1);
}
#pragma omp task depend(in: dep)
{
printf("Task with dependency\n");
}
printf("Doing some work...\n");
In its current state the runtime will block on the second task and not
continue execution.
Differential Revision: https://reviews.llvm.org/D23116
llvm-svn: 277992
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Consider the following code which may be executed by a serial team:
int dep;
#pragma omp target nowait depend(out: dep)
{
sleep(1);
}
#pragma omp task depend(in: dep)
{
#pragma omp target nowait
{
sleep(1);
}
}
Here the explicit task may not be freed until the nested proxy task has
finished. The current code hasn't considered this and called __kmp_free_task
anyway which triggered an assert because of remaining incomplete children:
KMP_DEBUG_ASSERT( TCR_4(taskdata->td_incomplete_child_tasks) == 0 );
Differential Revision: https://reviews.llvm.org/D23115
llvm-svn: 277991
|
|
|
|
|
|
|
|
| |
Mask for value read from ebx register returned by CPUID expanded to 0xFFFF.
Differential Revision: https://reviews.llvm.org/D23203
llvm-svn: 277825
|
|
|
|
|
|
| |
For discussion in D23115
llvm-svn: 277730
|
|
|
|
|
|
|
|
| |
node->dn.task is only filled after the dependencies are already processed.
This currently leads to unhelpful output from KA_TRACE or even a crash
if one enables KMP_SUPPORT_GRAPH_OUTPUT.
llvm-svn: 277717
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Android does not have pthread_cancel. Disable KMP_CANCEL_THREADS if
__ANDROID__ is defined.
Subscribers: tberghammer, srhines, openmp-commits, danalbert
Differential Revision: https://reviews.llvm.org/D23029
llvm-svn: 277618
|
|
|
|
|
|
|
|
|
|
|
| |
This patch enables balanced affinity on machines that do not have
hardware threads and have cores clustered into packages. In facts,
balacing algorithm could be generalized for any arrangement with
at least two levels of hierarchy (depth > 1).
Differential Revision: https://reviews.llvm.org/D22365
llvm-svn: 277212
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
When compiling the runtime library with clang we get warnings like:
```
error: passing an object that undergoes default argument promotion to 'va_start' has undefined behavior [-Werror,-Wvarargs]
va_start( args, id );
^
note: parameter of type 'kmp_i18n_id_t' (aka 'kmp_i18n_id') is declared here
kmp_i18n_id_t id,
```
My understanding is that the va_start macro only gets the promoted type so it won't know what was the exact type of the argument, which can potentially not work for some targets given that the implementation of the the calling convention could not be done properly.
This patch fixes that by using a built-in type in the function signature.
Reviewers: tlwilmar, jlpeyton, AndreyChurbanov
Subscribers: arpith-jacob, carlo.bertolli, caomhin, openmp-commits
Differential Revision: https://reviews.llvm.org/D22427
llvm-svn: 276428
|
|
|
|
|
|
| |
schedule modifier
llvm-svn: 275052
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When linking with libhwloc, the ORDERED EPCC test slows down on big
machines (> 48 cores). Performance analysis showed that a cache thrash
was occurring and this padding helps alleviate the problem.
Also, inside the main spin-wait loop in kmp_wait_release.h, we can eliminate
the references to the global shared variables by instead creating a local
variable, oversubscribed and instead checking that.
Differential Revision: http://reviews.llvm.org/D22093
llvm-svn: 274894
|
|
|
|
| |
llvm-svn: 274854
|
|
|
|
|
|
| |
hot teams info
llvm-svn: 274851
|
|
|
|
| |
llvm-svn: 274850
|
|
|
|
| |
llvm-svn: 274849
|
|
|
|
|
|
|
|
|
|
| |
These tests are now modeled after the sections nowait test where threads wait
to be released in the first construct (either for or single) and the last thread
skips the last for/single construct and releases those threads. If the test
fails, then it hangs because an unnecessary barrier is executed in between the
constructs.
llvm-svn: 274641
|
|
|
|
|
|
|
|
|
|
|
|
| |
If update_master_only is set the place list is not completely traversed
and therefore this assertion failed. Make it only trigger if
update_master_only is false.
(was introduced by D20539)
Differential Revision: http://reviews.llvm.org/D21925
llvm-svn: 274482
|
|
|
|
|
|
|
|
|
|
|
|
| |
This change fixes an error in comparing the existing schedule on the team to
the new schedule, in the chunk field. Also added additional checks and used
KMP_CHECK_UPDATE where appropriate.
Patch by Terry Wilmarth.
Differential Revision: http://reviews.llvm.org/D21897
llvm-svn: 274371
|
|
|
|
|
|
|
|
|
|
|
|
| |
EPCC Performance of single is considerably worse than plain barrier.
Adding a read-only check to the code before the atomic compare-and-store
helps considerably.
Patch by Terry Wilmarth.
Differential Revision: http://reviews.llvm.org/D21893
llvm-svn: 274369
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This rewrite of the omp_sections_nowait.c test file causes it to hang if the
nowait is not respected. If the nowait isn't respected, the lone thread which
can escape the first sections construct will just sleep at a barrier which
shouldn't exist. All reliance on timers is taken out. For good measure, the test
makes sure that all eight sections are executed as well. The test should take no
longer than a few seconds on any modern machine.
Differential Revision: http://reviews.llvm.org/D21842
llvm-svn: 274151
|
|
|
|
|
|
|
|
|
| |
* Incorrect lock value written in __kmp_test_futex_lock
* Incorrect lock value check in tas/futex lock with USE_LOCK_PROFILE on
Patch by Hansang Bae
llvm-svn: 274053
|
|
|
|
|
|
|
|
|
|
|
| |
UNICODE and _UNICODE defintions were added in the LLVM CMake build system.
While on Unices, the UNICODE/_UNICODE macros don't cause problems, on Windows
only ittnotify_static.c should be compiled using -DUNICODE. We are still
looking at a proper fix, but this change sets the build back to exactly what it
was doing before. Also, a comment and TODO were added in the src/CMakeLists.txt
file to help explain.
llvm-svn: 274052
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
That patch made all LLVM projects build with -DUNICODE. However, this doesn't
work for the OpenMP runtime.
But just overriding the flag with -UUNICODE breaks compiling ittnotify_static.c,
which for some reason needs to be compiled with -DUNICIODE. Note that compiling
ittnotify.h with -DUNICODE does not work though.
This seems like a mess. This commit fixes it for now, but it would be great
if someone who works on the OpenMP runtime could fix it properly.
llvm-svn: 273898
|
|
|
|
| |
llvm-svn: 273576
|
|
|
|
| |
llvm-svn: 273439
|
|
|
|
| |
llvm-svn: 273438
|
|
|
|
| |
llvm-svn: 273299
|
|
|
|
|
|
|
|
|
|
|
|
| |
Bug fix for hang when omp task and nested parallelism used together.
Still some problem remains with task state saving/restoring, but
user's case works fine now. All tasking unit tests passed as well.
Patch by Andrey Churbanov
Differential Revision: http://reviews.llvm.org/D21558
llvm-svn: 273297
|
|
|
|
|
|
|
|
|
|
|
| |
Replaced readings of nproc from team structure with ones from
thread structure to improve performance.
Patch by Andrey Churbanov.
Differential Revision: http://reviews.llvm.org/D21559
llvm-svn: 273293
|
|
|
|
|
|
|
|
|
|
|
|
| |
The removal of legacy code to support long-deprecated debugger support library
resulted in some whitespace changes. Comments from that legacy code were made
public as they may be useful for other debuggers.
Patch by Olga Malysheva.
Differential Revision: http://reviews.llvm.org/D21391
llvm-svn: 273282
|
|
|
|
|
|
|
|
|
|
|
|
| |
A couple improvements:
1) Add ability to limit fullMask size when KMP_HW_SUBSET limits resources.
2) Make KMP_HW_SUBSET work for affinity_none, and only limit fullMask in this case.
Patch by Andrey Churbanov.
Differential Revision: http://reviews.llvm.org/D21528
llvm-svn: 273278
|
|
|
|
|
|
|
|
|
|
|
|
| |
There was a segfault in the stubs library in posix_memalign because
of a bad parameter. The fix is to send address of the pointer as a
parameter. Also added check of result of posix_memalign.
Patch by Andrey Churbanov.
Differential Revision: http://reviews.llvm.org/D21529
llvm-svn: 273276
|
|
|
|
|
|
|
|
|
| |
This change appends the process id to the KMP_STATS_FILE (if specified) which
enables MPI processes to output their stats to separate files.
Differential Revision: http://reviews.llvm.org/D21386
llvm-svn: 273273
|
|
|
|
|
|
|
|
| |
Fix typos in Fortran headers to match spec.
Patch by Andrey Churbanov.
Differential Revision: http://reviews.llvm.org/D21531
llvm-svn: 273272
|