| Commit message (Collapse) | Author | Age | Files | Lines |
| |
Patch by Hansang Bae
Differential Revision: https://reviews.llvm.org/D26688
llvm-svn: 289732
| |
Summary:
Implemented by Dejan Latinovic
See https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=790735 for more information
Reviewers: AndreyChurbanov, jlpeyton
Subscribers: openmp-commits, mgorny
Differential Revision: https://reviews.llvm.org/D26576
llvm-svn: 289032
| |
Paul Osmialowski pointed out a double-free bug in the shutdown code. This patch
moves the freeing of the implicit task above the freeing of all fast memory
to prevent the double free.
Differential Revision: https://reviews.llvm.org/D26860
llvm-svn: 287551
| |
Have developer timers use the partitioning scheme, which also required that some
redundant developer timers be removed in favor of the already existing normal
timers. Move per-thread stats initialization to just after global thread id
assignment, which is as early as possible. Also put all global stats
initialization code in __kmp_stats_init() and all global stats destruction code
in __kmp_stats_fini().
Differential Revision: https://reviews.llvm.org/D26361
llvm-svn: 286892
| |
This set of changes enables the affinity interface (either the preexisting
native operating system interface or HWLOC) to be selected dynamically at runtime
initialization. The motivation is that we were seeing performance
degradations when using HWLOC. This allows the user to fall back to the old
affinity mechanisms, which on large machines (>64 cores) makes a large
difference in initialization time.
These changes mostly move affinity code under a small class hierarchy:
KMPAffinity
    class Mask {}
KMPNativeAffinity : public KMPAffinity
    class Mask : public KMPAffinity::Mask
KMPHwlocAffinity
    class Mask : public KMPAffinity::Mask
Since all interface functions (for both affinity and the mask implementation)
are virtual, the implementation can be chosen at runtime initialization.
Differential Revision: https://reviews.llvm.org/D26356
llvm-svn: 286890
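The runtime-selection pattern described in this commit can be illustrated with a minimal sketch. The names `AffinityApi`, `NativeAffinity`, `HwlocAffinity`, and `pick_affinity` are hypothetical stand-ins chosen for this example and are not the runtime's actual classes; the sketch only shows how a virtual interface lets the implementation be chosen once at initialization.

```cpp
#include <memory>
#include <string>

// Hypothetical stand-in for the abstract affinity interface.
class AffinityApi {
public:
  virtual ~AffinityApi() = default;
  virtual std::string name() const = 0;
};

// Hypothetical backend using the operating system's native interface.
class NativeAffinity : public AffinityApi {
public:
  std::string name() const override { return "native"; }
};

// Hypothetical backend using HWLOC.
class HwlocAffinity : public AffinityApi {
public:
  std::string name() const override { return "hwloc"; }
};

// Chosen once at initialization; callers only ever see the base class.
std::unique_ptr<AffinityApi> pick_affinity(bool use_hwloc) {
  if (use_hwloc)
    return std::make_unique<HwlocAffinity>();
  return std::make_unique<NativeAffinity>();
}
```

Because every call goes through the base-class pointer, the rest of the code does not need to know which backend was selected.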
| |
This patch allows ThreadSanitizer (Tsan) to verify OpenMP programs.
It means that no false positives will be reported by Tsan when
verifying an OpenMP program.
This patch introduces annotations within the OpenMP runtime module to
provide information about thread synchronization to the Tsan runtime.
To enable Tsan support when building the runtime, you must
enable the TSAN_SUPPORT option by passing the following option to CMake:
-DLIBOMP_TSAN_SUPPORT=TRUE
The annotations will be enabled in the main shared library
(the same mechanism as OMPT).
Patch by Simone Atzeni and Joachim Protze!
Differential Revision: https://reviews.llvm.org/D13072
llvm-svn: 286115
| |
Differential Revision: http://reviews.llvm.org/D25504
Patch by Alex Duran.
llvm-svn: 285283
| |
Patch by Victor Campos
Differential Revision: https://reviews.llvm.org/D26001
llvm-svn: 285243
| |
Summary:
If directives are used in a macro, clang complains with:
```
src/projects/openmp/runtime/src/kmp_runtime.c:7486:2: error: embedding a directive within macro arguments has undefined behavior [-Werror,-Wembedded-directive]
#if KMP_USE_MONITOR
```
This patch fixes two occurrences of the issue in `kmp_runtime.cpp`.
Reviewers: tlwilmar, jlpeyton, AndreyChurbanov, Hahnfeld
Subscribers: Hahnfeld, openmp-commits
Differential Revision: https://reviews.llvm.org/D25823
llvm-svn: 284728
| |
This change removes/disables unnecessary code when the monitor thread is not used.
Patch by Hansang Bae
Differential Revision: https://reviews.llvm.org/D25102
llvm-svn: 283577
| |
As the code is now, calling omp_get_schedule() when OMP_SCHEDULE=static_steal
will cause an assert.
llvm-svn: 283576
| |
Add a check for version "45" so that the "201511" string is used for OpenMP 4.5;
otherwise "200505" is used in the Fortran module. Also fix the kmp_openmp_version
variable (used, e.g., by the debugger) and kmp_version_omp_api, which is used
in the KMP_VERSION=1 output.
Patch by Olga Malysheva
Differential Revision: https://reviews.llvm.org/D24761
llvm-svn: 282868
| |
This change set disables creation of the monitor thread by default. The global
counter maintained by the monitor thread was replaced by logic that uses system
time directly, and cyclic yielding on Linux targets was also removed since there
was no clear benefit to using it. Turning on the KMP_USE_MONITOR variable (=1)
enables creation of the monitor thread again if it is really necessary for some
reason.
Differential Revision: https://reviews.llvm.org/D24739
llvm-svn: 282507
| |
Previous differentials D23305-D23310 changed task frame information management only for the kmp interface, but not for the whole gomp interface. This broke some testcases when building with gcc.
This patch fixes the broken task frame information for the gomp interface.
Patch by Joachim Protze!
Differential Revision: https://reviews.llvm.org/D24502
llvm-svn: 281468
| |
The exit address is set when execution of a task is started and should be reset as soon as the execution is finished.
Especially for the asm implementation of __kmp_invoke_microtask, resetting in this call would be painful, so reset just after the invocation.
The testcase shows the effect of this patch:
Before, the implicit barriers at the end of an implicit task would see an exit address for the implicit task.
This barrier is a task scheduling point. Thus, any explicit task scheduled there would see an exit, but no reenter address for the implicit task.
Patch by Joachim Protze!
Differential Revision: https://reviews.llvm.org/D23307
llvm-svn: 281465
| |
version of OMPT spec
The latest OMPT spec changed the semantics of a task's reenter frame to be the application frame that will be entered when the runtime frame drops.
Before, it was the last frame in the runtime. This doesn't work for some gcc execution paths or even clang-generated code for :
Since there is no runtime frame between the executed task and the encountering task.
The test case compares exit and reenter addresses against addresses captured in application code
Patch by Joachim Protze!
Differential Revision: https://reviews.llvm.org/D23305
llvm-svn: 281464
| |
Implementation of missing OpenMP 4.0 API functions omp_get_default_device and omp_set_default_device.
Also, added support for the environment variable OMP_DEFAULT_DEVICE.
Differential Revision: https://reviews.llvm.org/D23587
llvm-svn: 281065
| |
If the atomic reduction method is not available (the compiler can't generate
it), an assertion failure occurred when KMP_FORCE_REDUCTION=atomic was specified.
This change replaces the assertion with a warning and sets the reduction method
to the default one, 'critical'.
Patch by Olga Malysheva
Differential Revision: https://reviews.llvm.org/D23990
llvm-svn: 280519
| |
Differential Revision: http://reviews.llvm.org/D23175
llvm-svn: 278332
| |
hot teams info
llvm-svn: 274851
| |
If update_master_only is set the place list is not completely traversed
and therefore this assertion failed. Make it only trigger if
update_master_only is false.
(was introduced by D20539)
Differential Revision: http://reviews.llvm.org/D21925
llvm-svn: 274482
| |
This change fixes an error in comparing the existing schedule on the team to
the new schedule, in the chunk field. Also added additional checks and used
KMP_CHECK_UPDATE where appropriate.
Patch by Terry Wilmarth.
Differential Revision: http://reviews.llvm.org/D21897
llvm-svn: 274371
| |
EPCC Performance of single is considerably worse than plain barrier.
Adding a read-only check to the code before the atomic compare-and-store
helps considerably.
Patch by Terry Wilmarth.
Differential Revision: http://reviews.llvm.org/D21893
llvm-svn: 274369
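The optimization described here follows the general "check before compare-and-swap" pattern. The following is a minimal sketch of that pattern only, under the assumption that a shared counter identifies the current `single` instance; `try_claim_single`, `claimed`, and `instance` are hypothetical names and this is not the runtime's code.

```cpp
#include <atomic>

// Returns true for exactly one caller per "single" region instance.
// `claimed` and `instance` are hypothetical names used for this sketch.
bool try_claim_single(std::atomic<int> &claimed, int instance) {
  // Cheap read first: threads that arrive late see the updated value and
  // skip the compare-exchange, so they never request exclusive ownership
  // of the cache line.
  if (claimed.load(std::memory_order_relaxed) != instance)
    return false;
  int expected = instance;
  return claimed.compare_exchange_strong(expected, instance + 1,
                                         std::memory_order_acq_rel,
                                         std::memory_order_relaxed);
}
```

Only threads that still observe the old value attempt the atomic update, which reduces contention when many threads reach the construct at nearly the same time.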
| |
Bug fix for a hang when omp task and nested parallelism are used together.
Some problems still remain with task state saving/restoring, but
the user's case works fine now. All tasking unit tests passed as well.
Patch by Andrey Churbanov
Differential Revision: http://reviews.llvm.org/D21558
llvm-svn: 273297
| |
The removal of legacy code to support long-deprecated debugger support library
resulted in some whitespace changes. Comments from that legacy code were made
public as they may be useful for other debuggers.
Patch by Olga Malysheva.
Differential Revision: http://reviews.llvm.org/D21391
llvm-svn: 273282
| |
Added an argv array check/allocation for a parallel region directly nested inside
the teams construct, because the upcoming Fortran codegen passes parameters
directly into kmpc_fork_call that are missing in kmpc_fork_teams (earlier codegen
passed to parallel a subset of the parameters passed to teams, and thus
no check/allocation was needed).
Patch by Andrey Churbanov
Differential Revision: http://reviews.llvm.org/D21336
llvm-svn: 272935
| |
OpenMP 4.1 is now OpenMP 4.5. Any mention of 41 or 4.1 is replaced with
45 or 4.5. Also, if the CMake option LIBOMP_OMP_VERSION is 41, CMake warns that
41 is deprecated and to use 45 instead.
llvm-svn: 272687
| |
Remove static specifier from var fullMask and remove kmp_get_fullMask() routine.
When iterating through procs in a mask, always check if proc is in fullMask
(this check was missing in a few places).
Patch by Brian Bliss.
Differential Revision: http://reviews.llvm.org/D21300
llvm-svn: 272589
| |
The problem is the lack of dispatch buffers when thousands of loops with nowait,
about 10 iterations each, are executed by hundreds of threads. We only have
7 built-in dispatch buffers, but dozens or hundreds of buffers are needed.
The problem can be fixed by setting KMP_MAX_DISP_BUF to a bigger value. In order
to give users the same possibility, I changed the build-time control into a
run-time one, adding an API just in case.
This change adds an environment variable KMP_DISP_NUM_BUFFERS and a new API
function kmp_set_disp_num_buffers(int num_buffers).
The KMP_DISP_NUM_BUFFERS environment variable works only before serial
initialization, because during serial initialization we already allocate buffers
for the hot team, so it is too late to change the number of buffers afterwards
(otherwise we would need to reallocate buffers for all teams, which sounds too
complicated). The kmp_set_defaults() routine does not work for this environment
variable, because it calls serial initialization before reading the parameter
string. So a new routine, kmp_set_disp_num_buffers(), is created so that it can
set our internal global variable before the library initialization. If both the
environment variable and the API are used, the environment variable wins.
Differential Revision: http://reviews.llvm.org/D20697
llvm-svn: 271318
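The precedence and timing rules described in this commit can be modeled with a small sketch. `DispBufferConfig` and its members are hypothetical names for illustration, and ignoring non-positive values is an assumption of this sketch rather than documented runtime behavior.

```cpp
#include <cstdlib>

// Hypothetical model of a setting that must be fixed before initialization.
struct DispBufferConfig {
  int num_buffers = 7;      // built-in default mentioned in the commit message
  bool initialized = false; // set once buffers would have been allocated

  // API-style setter: has no effect after initialization.
  // Rejecting non-positive values is an assumption of this sketch.
  void set_from_api(int n) {
    if (!initialized && n > 0)
      num_buffers = n;
  }

  // Initialization reads the environment value last, so it overrides
  // whatever the API call stored earlier.
  void initialize(const char *env_value) {
    if (env_value != nullptr) {
      int n = std::atoi(env_value);
      if (n > 0)
        num_buffers = n;
    }
    initialized = true;
  }
};
```

In a real program the argument to `initialize` would come from something like `std::getenv("KMP_DISP_NUM_BUFFERS")`; it is passed explicitly here to keep the sketch deterministic.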
| |
The OMP_PROC_BIND=spread strategy fails to assign the master thread the
correct place partition after the first parallel region. Other threads in the
hot team will remember their place_partition, but the master's place partition
is restored to what it was before entering the parallel region. So when the hot
team is used for subsequent parallel regions, the master has lost this info.
This fix calls __kmp_partition_places to update only the master thread's place
partition in the spread case when there are no other changes to the hot team.
Patch by Terry Wilmarth
Differential Revision: http://reviews.llvm.org/D20539
llvm-svn: 270890
| |
On Blue Gene/Q, having LIBOMP_USE_ITT_NOTIFY support compiled into a
statically-linked binary causes a failure at runtime because dlopen fails.
This patch changes LIBOMP_USE_ITT_NOTIFY to a cacheable configuration setting
that can be disabled.
Patch by John Mellor-Crummey
Differential Revision: http://reviews.llvm.org/D20517
llvm-svn: 270884
| |
Most of this is modifications to check for differences before updating data
fields in team struct. There is also some rearrangement of the team struct.
Patch by Diego Caballero
Differential Revision: http://reviews.llvm.org/D20487
llvm-svn: 270468
| |
This patch doesn't affect D19878's context. So D19878 still cleanly applies.
llvm-svn: 270252
| |
After hot teams were enabled by default, the library started using levels kept
in the team structure. The levels are broken when a foreign thread exits and
puts its team into the pool, which is then re-used by another foreign thread.
The broken behavior observed is that when printing the levels for each new team,
one gets 1, 2, 1, 2, 1, 2, etc. This makes the library believe that every other
team is nested, which is incorrect. What is wanted is for the levels to be
1, 1, 1, etc.
Differential Revision: http://reviews.llvm.org/D19980
llvm-svn: 269363
| |
This change replaces the current timers with ones that partition time properly.
The current timers are nested, so that if a new timer, B, starts when the
current timer, A, is already timing, A's time will include B's. To eliminate
this problem, the partitioned timers are designed to stop the current timer (A),
let the new timer run (B), and when the new timer is finished, restart the
previously running timer (A). With this partitioning of time, a thread's timers
all sum up to the OMP_worker_thread_life time and can now easily show the
percentage of time a thread is spending in different parts of the runtime or
user code.
There is also a new state variable associated with each thread which tells where
it is executing a task. This corresponds with the timers: OMP_task_*, e.g., if
time is spent in OMP_task_taskwait, then that thread executed tasks inside a
#pragma omp taskwait construct.
The changes are mostly changing the macros to use the new PARTITIONED_* macros,
the new partitionedTimers class and its methods, and new state logic.
Differential Revision: http://reviews.llvm.org/D19229
llvm-svn: 268640
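The stop/run/restart behavior described in this commit can be sketched as a timer stack in which elapsed time is always charged to the top entry only. `PartitionedTimers` below is a hypothetical illustration, not the runtime's partitionedTimers class, and timestamps are passed in explicitly (an assumption made so the example is deterministic).

```cpp
#include <cstddef>
#include <vector>

// Hypothetical partitioned-timer stack; a real runtime would read a clock
// instead of taking `now` as a parameter.
class PartitionedTimers {
public:
  explicit PartitionedTimers(std::size_t num_timers)
      : totals_(num_timers, 0) {}

  // Pause the running timer (if any) and start timing `id`.
  void push(std::size_t id, long now) {
    charge(now);
    stack_.push_back(id);
  }

  // Stop the running timer; the previously running one resumes.
  // Assumes calls are balanced with push().
  void pop(long now) {
    charge(now);
    stack_.pop_back();
  }

  long total(std::size_t id) const { return totals_[id]; }

private:
  // Attribute the time since the last event to the top timer only.
  void charge(long now) {
    if (!stack_.empty())
      totals_[stack_.back()] += now - last_;
    last_ = now;
  }

  std::vector<long> totals_;
  std::vector<std::size_t> stack_;
  long last_ = 0;
};
```

With timer A started at time 0, timer B pushed at 10 and popped at 30, and A popped at 35, A accumulates 15 and B accumulates 20, so the totals sum to the full 35 units instead of double-counting B's interval.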
| |
llvm-svn: 266760
| |
Some codes that use TLS fail intermittently because one thread tries to write
TLS values after the TLS key has been destroyed by another thread. This happens
when one thread executes library shutdown (and destroys TLS keys), while another
thread starts to execute the TLS key destructor routine. Before this change, the
kmp_init_runtime flag was checked before calling pthread_* TLS functions, but
this flag is set to FALSE later than the destruction of the TLS keys, which
leads to failure. The fix is to check kmp_init_gtid instead, as this flag is
unset *before* the destruction of TLS keys.
Differential Revision: http://reviews.llvm.org/D19022
llvm-svn: 266674
| |
llvm-svn: 264776
| |
From the standard: A doacross loop nest is a loop nest that has cross-iteration
dependence. An iteration is dependent on one or more lexicographically earlier
iterations. The ordered clause parameter on a loop directive identifies the
loop(s) associated with the doacross loop nest.
The init/fini routines allocate/free doacross buffer(s) for each loop for each
thread. The wait routine waits for a flag designated by the dependence vector.
The post routine sets the flag designated by the current iteration vector. We use
a similar technique of shared buffer indices that covers up to 7 nowait loops
executed simultaneously by different threads (the number 7 has no real meaning,
it is just a heuristic value). Also, the size of the structures is kept intact by
reducing dummy arrays.
This needs to be put into the OpenMP runtime library in order for the compiler
team to develop the compiler side of the implementation.
Differential Revision: http://reviews.llvm.org/D17399
llvm-svn: 262532
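The wait/post idea can be illustrated with a deliberately simplified, one-dimensional sketch: one flag per iteration, set by post and polled by wait. `DoacrossFlags` is a hypothetical class for this example; the runtime's actual buffers, multi-dimensional dependence vectors, and shared buffer indices are not modeled.

```cpp
#include <atomic>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical one-dimensional doacross flags: one flag per iteration.
class DoacrossFlags {
public:
  explicit DoacrossFlags(std::size_t iterations) : done_(iterations) {
    // Explicitly clear the flags; default-constructed atomics are not
    // guaranteed to be initialized before C++20.
    for (auto &f : done_)
      f.store(false, std::memory_order_relaxed);
  }

  // Mark iteration `i` as finished (the "post" side).
  void post(std::size_t i) { done_[i].store(true, std::memory_order_release); }

  // Block until iteration `i` has been posted (the "wait" side).
  void wait(std::size_t i) const {
    while (!done_[i].load(std::memory_order_acquire))
      std::this_thread::yield();
  }

  // Non-blocking query, useful for testing.
  bool posted(std::size_t i) const {
    return done_[i].load(std::memory_order_acquire);
  }

private:
  std::vector<std::atomic<bool>> done_;
};
```

A thread executing iteration `i` would call `wait(i - 1)` before the dependent work and `post(i)` after it; the release/acquire pairing makes the earlier iteration's writes visible to the waiter.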
| |
This change introduces the new OpenMP 4.5 affinity API surrounding
OpenMP Places. There are six new entry points:
Typically called in serial region:
* omp_get_num_places - returns the number of places available to the execution
environment in the place list.
* omp_get_place_num_procs - returns the number of processors available to the
execution environment in the specified place.
* omp_get_place_proc_ids - returns the numerical identifiers of the processors
available to the execution environment in the specified place.
Typically called inside parallel region:
* omp_get_place_num - returns the place number of the place to which the
encountering thread is bound.
* omp_get_partition_num_places - returns the number of places in the place
partition of the innermost implicit task.
* omp_get_partition_place_nums - returns the list of place numbers
corresponding to the places in the place-var ICV of the innermost
implicit task.
Differential Revision: http://reviews.llvm.org/D17417
llvm-svn: 261915
| |
The problem is that the master's thread state was not saved before entering a
parallel region so it does not remember tasks when it returns.
llvm-svn: 260306
| |
When the code behind the barrier is executed, the master thread may have
already resumed execution. That's why we cannot safely assume that *pteam
is not yet freed.
This has been introduced by r258866.
llvm-svn: 259037
| |
Removing extraneous { } bracket sections. Unindenting blocks of
code as a result. Also removing empty #ifdef KMP_STUB
llvm-svn: 258986
| |
Removing references to non-existent functions, fixing typos.
llvm-svn: 258985
| |
llvm-svn: 258984
| |
For implicit barriers in simple parallel for loops, the order of the OMPT events
was wrong. The barrier_{begin,end} events came after the implicit_task_end
event for the implicit barrier at the end of the parallel region. This is wrong
because the implicit task executes the barrier before ending. This patch fixes
the order of the event: it is now triggered just before
__kmp_pop_current_task_from_thread() is called.
Patch by Tim Cramer
Differential Revision: http://reviews.llvm.org/D16347
llvm-svn: 258866
| |
Change (__kmp_mic_type != non_mic) to (__kmp_mic_type == mic2)
llvm-svn: 257380
| |
llvm-svn: 255901
| |
Fix for a crash in the teams construct when the user sets OMP_THREAD_LIMIT to a
number less than the number of processors. Now the number of threads will be
silently reduced if the user didn't specify teams parameters, or reduced with a
warning if the user specified teams parameters conflicting with
OMP_THREAD_LIMIT.
Differential Revision: http://reviews.llvm.org/D14732
llvm-svn: 254322
| |
llvm-svn: 253264
|