| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
| |
Reviewers: jdoerfert, Jim
Reviewed By: Jim
Subscribers: Jim, mgorny, guansong, jfb, openmp-commits
Tags: #openmp
Differential Revision: https://reviews.llvm.org/D72285
|
|
|
|
|
|
| |
Submitted by: kiszk
Differential Revision: https://reviews.llvm.org/D72171
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
The termination function duplicated the functionality of the
__attribute((destructor))-annotated function __kmp_internal_end_fini,
and we have no indication that this doesn't work.
The function might cause issues with link-time optimization turned on:
until very recently, none of the usual linkers was reporting functions
named in -Wl,-fini as used to the LTO plugin, so it might be dropped.
If the function is dropped, -Wl,-fini=__kmp_internal_end_fini doesn't
do what we want: with ld.bfd and lld it drops the FINI attribute from
.dynamic and with gold we get FINI = 0x0, which leads to a crash on
cleanup. This can be reproduced by building with
-DLLVM_ENABLE_PROJECTS="clang;openmp" \
-DLLVM_ENABLE_LTO=Thin \
-DLLVM_USE_LINKER=gold
The issue in lld has been fixed in f95273f75aa, but gold remains without
a fix so far.
Fixes PR43927.
Reviewers: JonChesterfield, jdoerfert, AndreyChurbanov
Reviewed By: AndreyChurbanov
Differential Revision: https://reviews.llvm.org/D69927
|
|
|
|
|
|
|
|
|
|
| |
Reviewers: chandlerc, jlpeyton, jdoerfert, dim
Reviewed-By: dim
Differential Revision: https://reviews.llvm.org/D68580
llvm-svn: 374118
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch fixes https://bugs.llvm.org/show_bug.cgi?id=42906 by adjusting
the number of threads on entry to the teams construct on the host
according to user settings. This lets the checks pass and avoids assertions
when the team of threads is created.
Patch by Andrey Churbanov
Differential Revision: https://reviews.llvm.org/D66351
llvm-svn: 369430
|
|
|
|
|
|
|
|
|
|
| |
The variables in kmp_lock.cpp are really arrays of function pointers
that return void or int, not pointers to functions that return void*
or int*. The other changes are only cosmetic.
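The declarator difference described above is easy to misread, so here is a minimal sketch of it. The lock operations and table names are hypothetical stand-ins, not the actual contents of kmp_lock.cpp.

```cpp
// Hypothetical lock operations standing in for the runtime's real ones.
static void demo_init(int *lock) { *lock = 0; }
static int demo_test(int *lock) { return *lock == 0 ? 1 : 0; }

// Array of pointers to functions returning void: "[]" binds to the name,
// and the leading "void" is the return type of each pointed-to function.
static void (*init_table[])(int *) = {demo_init, demo_init};

// Array of pointers to functions returning int.
static int (*test_table[])(int *) = {demo_test};

// By contrast, this names a single pointer to a function returning void*,
// which is a different type altogether.
typedef void *(*returns_pointer_fn)(int *);

// Call through both tables: initialize the lock, then test it.
int run_tables() {
  int lock = -1;
  init_table[1](&lock);         // sets lock to 0
  return test_table[0](&lock);  // reports 1 because lock is 0
}
```

Calling `run_tables()` dispatches through both arrays by index, which is the usage pattern such tables exist for.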
Differential Revision: https://reviews.llvm.org/D65870
llvm-svn: 369002
|
|
|
|
|
|
|
|
| |
This change adds OMPT support for events from teams construct.
Differential Revision: https://reviews.llvm.org/D64025
llvm-svn: 367746
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is a port of libomp for the RISC-V 64-bit Linux target.
We have tested this port on a HiFive Unleashed development board
using a downstream LLVM that has support for the missing bits in
upstream. As of now, all tests are passing, including OMPT.
Patch by Ferran Pallarès!
Differential Revision: https://reviews.llvm.org/D59880
llvm-svn: 367021
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is done at call-site and does not need to be handled in
__kmp_invoke_microtask. It was already absent from the x86
and x86_64 assembly, this patch removes it from the generic
implementation in z_Linux_util.cpp and adds documentation for
AArch64 and PPC64 that it's actually not needed. I can't test
on these architectures, so I don't want to change the code just
because it looks right :)
While at it, rename some variables for consistency and add a
check in test/ompt/parallel/normal.c that the pointer was reset
before entering the barrier.
Differential Revision: https://reviews.llvm.org/D64442
llvm-svn: 366721
|
|
|
|
|
|
|
|
|
|
| |
Remove all older OMP spec versioning from the runtime and build system.
Patch by Terry Wilmarth
Differential Revision: https://reviews.llvm.org/D64534
llvm-svn: 365963
|
|
|
|
|
|
|
|
|
|
|
| |
Bug reported in https://bugs.llvm.org/show_bug.cgi?id=42269.
Freeing of the contention group (CG) structure by the master thread looks wrong,
because workers can leave the CG later on. Instead, the freeing
is now done by the last thread leaving the CG.
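A minimal sketch of the "last one out frees" idea, assuming an atomic member counter. The type and function names are hypothetical, and the runtime's own bookkeeping may differ.

```cpp
#include <atomic>

// Hypothetical stand-in for a contention group record.
struct DemoContentionGroup {
  std::atomic<int> members;  // threads currently inside the group
  explicit DemoContentionGroup(int n) : members(n) {}
};

// Called by each thread as it leaves the group. Returns true if the caller
// was the last member and therefore released the record.
bool leave_group(DemoContentionGroup *cg) {
  // fetch_sub returns the value before the decrement, so 1 means "I was last".
  if (cg->members.fetch_sub(1, std::memory_order_acq_rel) == 1) {
    delete cg;
    return true;
  }
  return false;
}
```

Because the decision is tied to the decrement itself, no thread can free the record while another member may still touch it.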
Differential Revision: https://reviews.llvm.org/D63599
llvm-svn: 364456
|
|
|
|
|
|
|
|
| |
Removed wrong debug assertion.
Differential Revision: https://reviews.llvm.org/D62251
llvm-svn: 361408
|
|
|
|
|
|
|
|
|
|
|
| |
OpenMP 5.0 says that the callback for the events initial-task-begin and
initial-task-end has to be ompt_callback_implicit_task.
Patch by Tim Cramer
Differential Revision: https://reviews.llvm.org/D58776
llvm-svn: 361157
|
|
|
|
|
|
|
|
|
| |
Removed unconditional and unsafe decrement of counter
of active threads in pool at shutdown time.
Differential Revision: https://reviews.llvm.org/D61944
llvm-svn: 360784
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
To be able to successfully build OpenMP on 32-bit FreeBSD, such as
FreeBSD/i386, I first had to provide a few wrappers (see D60916), and
then add `KMP_OS_FREEBSD` to the list of defines checked for 32-bit
architectures in `kmp_runtime.cpp`.
I have successfully built libomp.so and ran a bunch of test programs on
FreeBSD/i386 with this.
See also https://svnweb.freebsd.org/changeset/base/345283
Reviewers: emaste, jlpeyton, Hahnfeld
Reviewed By: jlpeyton
Subscribers: krytarowski, guansong, jdoerfert, openmp-commits, llvm-commits
Tags: #openmp
Differential Revision: https://reviews.llvm.org/D60917
llvm-svn: 359716
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch adds:
* New omp_sched_monotonic flag to omp_sched_t which is handled within the runtime
* Parsing of monotonic/nonmonotonic in OMP_SCHEDULE
* Tests for the monotonic flag and environment variable parsing
* Logic to force monotonic when hierarchical scheduling is used
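As a rough illustration of how a modifier flag can ride on top of a schedule kind, the sketch below mirrors the enum with hypothetical `demo_` names. The numeric values are assumptions modeled on omp.h, and `effective_sched` is only one reading of "force monotonic when hierarchical scheduling is used".

```cpp
#include <cstdint>

// Hypothetical mirror of the schedule kinds; values are assumptions.
enum DemoSched : std::uint32_t {
  demo_sched_static = 1,
  demo_sched_dynamic = 2,
  demo_sched_guided = 3,
  demo_sched_auto = 4,
  demo_sched_monotonic = 0x80000000u  // modifier bit, OR-ed onto a kind
};

// Strip the modifier bit to recover the base schedule kind.
inline std::uint32_t base_kind(std::uint32_t sched) {
  return sched & ~static_cast<std::uint32_t>(demo_sched_monotonic);
}

// Report whether the monotonic modifier is set.
inline bool is_monotonic(std::uint32_t sched) {
  return (sched & demo_sched_monotonic) != 0;
}

// Assumed rule: hierarchical scheduling always implies monotonic.
inline std::uint32_t effective_sched(std::uint32_t sched, bool hierarchical) {
  return hierarchical ? (sched | demo_sched_monotonic) : sched;
}
```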
Differential Revision: https://reviews.llvm.org/D60979
llvm-svn: 359601
|
|
|
|
|
|
|
|
|
|
| |
https://bugs.llvm.org/show_bug.cgi?id=41494
Freed th_cg_roots structure at exit from uber thread.
Differential Revision: https://reviews.llvm.org/D60729
llvm-svn: 358572
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch cleans up the bookkeeping code for the load balancing dynamic mode.
When a thread is moved to or from the thread pool, the th_active_in_pool flag
and the __kmp_thread_pool_active_nth global counter are both updated. This
removes the need for the corrective code in the main wait loop. Another global
counter, __kmp_thread_pool_nth, was removed completely, as it was only used for
debugging, but was not under KMP_DEBUG.
Patch by Terry Wilmarth
Differential Revision: https://reviews.llvm.org/D59508
llvm-svn: 357927
|
|
|
|
|
|
|
|
|
|
|
| |
The distribute clause needs an explicit push of a timer. The teams
clause needs a timer added and also, similarly to parallel, exchanged
with the serial timer when encountered so that serial regions are
counted properly.
Differential Revision: https://reviews.llvm.org/D59801
llvm-svn: 357621
|
|
|
|
|
|
|
|
|
|
| |
Add 5.0 guard to pause code for now.
Patch by Terry Wilmarth
Differential Revision: https://reviews.llvm.org/D59428
llvm-svn: 356933
|
|
|
|
|
|
|
|
|
|
| |
Remove very old, unused, and deprecated taskq code.
Patch by Terry Wilmarth
Differential Revision: https://reviews.llvm.org/D58989
llvm-svn: 356288
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This change makes the runtime decide the intended use of each barrier
invocation, for the OMPT synchronization tool callbacks. The OpenMP 5.0
specification defines four possible barrier kinds -- implicit, explicit,
implementation, and just normal barrier.
Patch by Hansang Bae
Differential Revision: https://reviews.llvm.org/D58247
llvm-svn: 355140
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Nest-var, OMP_NESTED, omp_set_nested(), and omp_get_nested() have been
deprecated in the 5.0 spec. Initial nesting info is now derived from
OMP_MAX_ACTIVE_LEVELS, OMP_NUM_THREADS, and OMP_PROC_BIND.
This patch deprecates the internal ICV that corresponds to nest-var, and
replaces it with the max-active-levels-var ICV to determine nesting. The
change still allows for use of OMP_NESTED (according to 5.0 changes),
omp_get_nested, and omp_set_nested, which have had deprecation messages
added to them. The change allows certain settings of OMP_NUM_THREADS,
OMP_PROC_BIND, and OMP_MAX_ACTIVE_LEVELS to turn on nesting, but
OMP_NESTED=0 will still force nesting to be off.
The runtime now prints informative messages about deprecation of
OMP_NESTED, omp_set_nested(), and omp_get_nested(), when those
environment variables or routines are used. It also prints a deprecation
message in the output of KMP_SETTINGS and OMP_DISPLAY_ENV for OMP_NESTED.
This patch also fixes OMP_DISPLAY_ENV output for OMP_TARGET_OFFLOAD.
Patch by Terry Wilmarth
Differential Revision: https://reviews.llvm.org/D58408
llvm-svn: 355138
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch cleans up the yielding code and makes it optional. An
environment variable, KMP_USE_YIELD, was added. Yielding is still
on by default (KMP_USE_YIELD=1), but can be turned off completely
(KMP_USE_YIELD=0), or turned on only when oversubscription is detected
(KMP_USE_YIELD=2). Note that oversubscription cannot always be detected
by the runtime (for example, when the runtime is initialized and the
process forks, oversubscription cannot be detected currently over
multiple instances of the runtime).
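The three settings can be pictured as a small decision function. This is a sketch only: the enum, the function name, and the "more runnable threads than processors" test for oversubscription are assumptions, not the runtime's actual logic.

```cpp
// Hypothetical mirror of the three KMP_USE_YIELD settings.
enum class DemoYieldMode { Off = 0, Always = 1, WhenOversubscribed = 2 };

// Decide whether a spinning thread should yield its processor.
bool should_yield(DemoYieldMode mode, int runnable_threads, int available_procs) {
  switch (mode) {
  case DemoYieldMode::Off:
    return false;  // KMP_USE_YIELD=0: never yield
  case DemoYieldMode::Always:
    return true;   // KMP_USE_YIELD=1: default behavior
  case DemoYieldMode::WhenOversubscribed:
    // KMP_USE_YIELD=2: yield only if oversubscription is detected (assumed test).
    return runnable_threads > available_procs;
  }
  return false;
}
```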
Because yielding can now be controlled by the user, the library mode
settings (from KMP_LIBRARY) for throughput and turnaround have been
adjusted by altering blocktime, unless that was also explicitly set.
In the original code, there were a number of places where a double yield
might have been done under oversubscription. This version checks
oversubscription and if that's not going to yield, then it does
the spin check.
Patch by Terry Wilmarth
Differential Revision: https://reviews.llvm.org/D58148
llvm-svn: 355120
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The thread-limit-var and the omp_get_thread_limit API were not handled correctly for
the teams construct. Now, when modified by the thread_limit clause, omp_get_thread_limit
reports the correct value. In addition, the value is restored when leaving the
teams construct to what it was in the encountering context.
This is done partly by creating the notion of a Contention Group root (CG root)
that keeps track of the thread at the root of each separate CG, the
thread-limit-var associated with the CG, and associated counter of active
threads within the contention group.
Thread limits are passed from master to worker threads via an entry in the ICV
data structure. When a "contention group switch" occurs, a new CG root record is
made and passed from master to worker. A thread could potentially have several
CG root records if it encounters multiple nested teams constructs (but at the
moment the spec doesn't allow for nested teams, so the most one could have
currently is 2). The master of the teams masters gets the thread-limit clause
value stored to its local ICV structure, and the other teams masters copy it
from the master. The thread-limit is set from that ICV copy and restored to the
ICV copy when entering and leaving the teams construct.
This change also fixes a bug when the top-level teams construct team gets
reused, and OMP_DYNAMIC was true, which can cause the expected size of this team
to be smaller than what was actually allocated. The fix updates the size of the
team after its threads were reserved.
Patch by Terry Wilmarth
Differential Revision: https://reviews.llvm.org/D56804
llvm-svn: 353747
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The three switch fallthroughs generate warnings with -Wimplicit-fallthrough.
Two are documented as fallthrough; one is not, but I think the intention is to also fall through in kmp_tasking.cpp.
Not sure whether kmp.h is the best place to define the macro.
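For context, a fallthrough macro of this kind might look like the sketch below. `DEMO_FALLTHROUGH` is a hypothetical name, and the feature test shown is one common approach rather than the definition the patch adds.

```cpp
// Use the standard attribute when the compiler reports support for it,
// otherwise expand to a harmless no-op expression.
#if defined(__has_cpp_attribute)
#if __has_cpp_attribute(fallthrough)
#define DEMO_FALLTHROUGH [[fallthrough]]
#endif
#endif
#ifndef DEMO_FALLTHROUGH
#define DEMO_FALLTHROUGH ((void)0)
#endif

// Counts how many cases execute when entering the switch at `start`.
int count_steps(int start) {
  int steps = 0;
  switch (start) {
  case 0:
    ++steps;
    DEMO_FALLTHROUGH;  // intentional: continue into case 1
  case 1:
    ++steps;
    DEMO_FALLTHROUGH;  // intentional: continue into case 2
  case 2:
    ++steps;
    break;
  default:
    break;
  }
  return steps;
}
```

Marking each intentional fallthrough this way silences -Wimplicit-fallthrough while leaving unmarked ones visible as warnings.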
Reviewers: jlpeyton, AndreyChurbanov, Hahnfeld
Reviewed By: jlpeyton
Tags: #openmp
Differential Revision: https://reviews.llvm.org/D56397
llvm-svn: 353052
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
to reflect the new license. These used slightly different spellings that
defeated my regular expressions.
We understand that people may be surprised that we're moving the header
entirely to discuss the new license. We checked this carefully with the
Foundation's lawyer and we believe this is the correct approach.
Essentially, all code in the project is now made available by the LLVM
project under our new license, so you will see that the license headers
include that license only. Some of our contributors have contributed
code under our old license, and accordingly, we have retained a copy of
our old license notice in the top-level files in each project and
repository.
llvm-svn: 351648
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add omp_pause_resource and omp_pause_resource_all API and enum, plus stub for
internal implementation. Implemented callable helper function to do local pause,
and added basic functionality for hard and soft pause.
Patch by Terry Wilmarth
Differential Revision: https://reviews.llvm.org/D55078
llvm-svn: 351372
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The omp-tools.h file is generated from the OpenMP spec to ensure that the interface
is implemented as specified.
The other changes are necessary to update the interface implementation to the
final version as published in 5.0.
The omp-tools.h header was previously called ompt.h; a copy under that name
is currently installed for legacy tools.
Patch partially prepared by @sconvent
Reviewers: AndreyChurbanov, hbae, Hahnfeld
Reviewed By: hbae
Tags: #openmp, #ompt
Differential Revision: https://reviews.llvm.org/D55579
llvm-svn: 351197
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch updates the implementation of the ompt_frame_t, ompt_wait_id_t
and ompt_state_t. The final version of the OpenMP 5.0 spec added the "t"
for these types.
Furthermore, the structure of ompt_frame_t changed and now allows specifying
that the reenter frame belongs to the runtime.
Patch partially prepared by Simon Convent
Reviewers: hbae
llvm-svn: 349458
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch adds the affinity format functionality introduced in OpenMP 5.0.
This patch adds two new environment variables:
OMP_DISPLAY_AFFINITY=TRUE|FALSE
OMP_AFFINITY_FORMAT=<string>
and four new API functions:
1) omp_set_affinity_format()
2) omp_get_affinity_format()
3) omp_display_affinity()
4) omp_capture_affinity()
The affinity format functionality has two ICV's associated with it:
affinity-display-var (bool) and affinity-format-var (string).
The affinity-display-var enables/disables the functionality through the
environment variable OMP_DISPLAY_AFFINITY. The affinity-format-var is a format
string whose special field types begin with a '%' character,
similar to printf.
For example, the affinity-format-var could be:
"OMP: host:%H pid:%P OStid:%i num_threads:%N thread_num:%n affinity:{%A}"
The affinity-format-var is displayed by every thread implicitly at the beginning
of a parallel region when any thread's affinity has changed (including a brand
new thread being spawned), or explicitly using the omp_display_affinity() API.
The omp_capture_affinity() function can capture the affinity-format-var in a
char buffer, and omp_set|get_affinity_format() allow the user to set|get the
affinity-format-var explicitly at runtime. omp_capture_affinity() and
omp_get_affinity_format() both return the number of characters needed to hold
the entire string (not including the terminating null character). If not
enough buffer space is available, both functions truncate their output.
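The return-value contract described here resembles snprintf: report the full length, copy what fits. The sketch below illustrates that contract with a hypothetical helper, `capture_into`; it is not the runtime's implementation and does not expand any '%' fields.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical helper: copy `full` into `buffer` (capacity `size` bytes),
// truncating if necessary, and return the length the full string requires,
// not counting the terminating null character.
std::size_t capture_into(char *buffer, std::size_t size, const std::string &full) {
  if (buffer != nullptr && size > 0) {
    // Leave one byte for the terminator.
    std::size_t n = full.size() < size - 1 ? full.size() : size - 1;
    std::memcpy(buffer, full.data(), n);
    buffer[n] = '\0';
  }
  return full.size();
}
```

A caller can compare the return value with the buffer size to detect truncation and retry with a larger buffer.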
Differential Revision: https://reviews.llvm.org/D55148
llvm-svn: 349089
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fix two build issues:
1) Recent commit 348756 accidentally made Unix clang compilers
use immintrin.h when only clang-cl should be using it, leading
to the following error:
openmp-llvm/runtime/src/kmp_lock.cpp:2035:25: error: always_inline function '_xbegin' requires target feature 'rtm', but would be inlined into function
'__kmp_test_adaptive_lock_only' that is compiled without support for 'rtm'
kmp_uint32 status = _xbegin();
This patch changes the guard so that only clang-cl, rather than all clang compilers, uses immintrin.h.
2) gcc-8 gives a warning about a multi-line comment in kmp_runtime.cpp:
openmp-llvm/runtime/src/kmp_runtime.cpp:7697:8: warning: multi-line comment [-Wcomment]
#endif // KMP_OS_LINUX || KMP_OS_DRAGONFLY || KMP_OS_FREEBSD || KMP_OS_NETBSD \
This patch just changes it to a two-line comment.
llvm-svn: 348783
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: This patch permits OpenMP to build and work (with both gcc and clang) on OpenBSD. It mostly follows what was done for FreeBSD and NetBSD, except OpenBSD does not have pthread_getattr_np support, so it follows OS X in that one instance.
Reviewers: #openmp, krytarowski
Reviewed By: krytarowski
Subscribers: guansong, jfb, emaste, mgorny, krytarowski, #openmp
Tags: #openmp
Differential Revision: https://reviews.llvm.org/D34280
llvm-svn: 348726
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Additions mostly follow FreeBSD and NetBSD and are not intrusive.
There is similar patch for OpenBSD: https://reviews.llvm.org/D34280
The -lm was being omitted due to -Wl,--as-needed in cmake rule, similar patch is in freebsd-ports/devel/llvm-devel port.
Simple OpenMP programs compile and work as expected:
$ clang-devel ~/omp_hello.c -fopenmp -I/usr/local/llvm-devel/include
$ LD_LIBRARY_PATH=/usr/local/llvm-devel/lib OMP_NUM_THREADS=100 ./a.out
The assertion in LLVMgold.so when -fopenmp was used together with -flto in 20170524 snapshot is no longer triggered on current svn-trunk and works fine as in llvm-4.0 with our local patches.
Reviewers: #openmp, krytarowski
Reviewed By: krytarowski
Subscribers: dexonsmith, jfb, krytarowski, guansong, gregrodgers, emaste, mgorny, mehdi_amini
Differential Revision: https://reviews.llvm.org/D35129
llvm-svn: 348725
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
There is a low probability that the array th_hot_teams can be
accessed out of bounds (when many nested levels are requested
to keep hot teams via KMP_HOT_TEAMS_MAX_LEVEL). The patch
adds an index check that fixes the problem.
Patch by Andrey Churbanov
Differential Revision: https://reviews.llvm.org/D54950
llvm-svn: 347800
|
|
|
|
|
|
|
|
| |
Do not write to the internal structure if its value would stay the same.
Differential Revision: https://reviews.llvm.org/D54305
llvm-svn: 346862
|
|
|
|
|
|
|
|
| |
Patch by samuel.thibault@ens-lyon.org
Differential Revision: https://reviews.llvm.org/D54079
llvm-svn: 346310
|
|
|
|
| |
llvm-svn: 343869
|
|
|
|
|
|
|
|
|
|
|
|
| |
Initializing an ompt_data_t object using the pointer union member is potentially
unsafe in 32-bit programs. This change fixes the issue
by using the constant, ompt_data_none.
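To see why the pointer member is a risky initializer, consider a union shaped like the one below. `DemoData` and `demo_data_none` are hypothetical mirrors written for illustration; the layout (a 64-bit integer overlaid with a pointer) is an assumption about the real type.

```cpp
#include <cstdint>

// Hypothetical mirror of a tool-data union: 64-bit value overlaid with a pointer.
union DemoData {
  std::uint64_t value;
  void *ptr;
};

// Brace initialization targets the first member, so all 64 bits are zeroed.
static const DemoData demo_data_none = {0};

// On a 32-bit target, writing only `ptr` (for example `d.ptr = nullptr;`)
// touches just 4 of the 8 bytes, leaving the rest of `value` unspecified.
// Starting from the constant avoids that.
DemoData make_none() { return demo_data_none; }

// A record counts as "none" when the full 64-bit value is zero.
bool is_none(const DemoData &d) { return d.value == 0; }
```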
Patch by Hansang Bae
Differential Revision: https://reviews.llvm.org/D52046
llvm-svn: 343785
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Some types and callback signatures have changed from TR6 to TR7.
Major changes (only adding signatures and stubs):
(-remove idle callback) done by D48362
-add reduction and dispatch callback
-add get_task_memory and finalize_tool runtime entry points
-ompt_invoker_t becomes ompt_parallel_flag_t
-more types of sync_regions
Patch provided by Simon Convent
Reviewers: hbae, protze.joachim
Differential Revision: https://reviews.llvm.org/D50774
llvm-svn: 341834
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Implemented omp_alloc, omp_free, omp_{set,get}_default_allocator entries,
and OMP_ALLOCATOR environment variable.
Added support for HBW memory on Linux if libmemkind.so library is accessible
(dynamic library only, no support for static libraries).
Only used stable API (hbwmalloc) of the memkind library
though we may consider using experimental API in future.
The ICV def-allocator-var is implemented per implicit task similar to
place-partition-var. In the absence of a requested allocator, the runtime uses the
default allocator.
Predefined allocators (the only ones currently available) are made similar
for C and Fortran: pointers (long integers) with values 1 to 8.
Patch by Andrey Churbanov
Differential Revision: https://reviews.llvm.org/D51232
llvm-svn: 341687
|
|
|
|
|
|
|
|
|
|
| |
If hot teams are not being used, this code could seg fault without the added
check, and does so when composability is used in conjunction with nesting.
The fix prevents the segfault.
Differential Revision: https://reviews.llvm.org/D50649
llvm-svn: 340629
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch cleans up unused functions, variables, sign compare issues, and
addresses some -Warning flags which are now enabled including -Wcast-qual.
Not all the warning flags in LibompHandleFlags.cmake are enabled, but some
are with this patch.
Some __kmp_gtid_from_* macros in kmp.h are switched to static inline functions
which allows us to remove the awkward definition of KMP_DEBUG_ASSERT() and
KMP_ASSERT() macros which used the comma operator. This had to be done for the
innumerable -Wunused-value warnings related to KMP_DEBUG_ASSERT()
Differential Revision: https://reviews.llvm.org/D49105
llvm-svn: 339393
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
1) Remove unnecessary data from list node structure
2) Remove timerPair in favor of pushing/popping explicitTimers.
This way, nested timers will work properly.
3) Fix #pragma omp critical timers
4) Add histogram capability
5) Add KMP_STATS_FILE formatting capability
6) Have time partitioned into serial & parallel by introducing
partitionedTimers::exchange(). This also counts the number of serial regions
in the executable.
7) Fix up the timers around OMP loops so that scheduling overhead and work are
both counted correctly.
8) Fix up the iterations statistics so they count the number of iterations the
thread receives at each loop scheduling event
9) Change timers so there is only one RDTSC read per event change
10) Fix up the outdated comments for the timers
Differential Revision: https://reviews.llvm.org/D49699
llvm-svn: 338276
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This change fixes possibly invalid access to the internal data structure during
library shutdown. In a heavily oversubscribed situation, the library shutdown
sequence can reach the point where resources are deallocated while there still
exist threads in their final spinning loop. The added loop in
__kmp_internal_end() checks if there are such busy-waiting threads and blocks
the shutdown sequence if that is the case. Two versions of kmp_wait_template()
are now used to minimize performance impact.
Patch by Hansang Bae
Differential Revision: https://reviews.llvm.org/D49452
llvm-svn: 337486
|
|
|
|
| |
llvm-svn: 336575
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch introduces the logic implementing hierarchical scheduling.
First and foremost, hierarchical scheduling is off by default.
To enable it, use -DLIBOMP_USE_HIER_SCHED=On during CMake's configure stage.
This work is based on the IWOMP paper:
"Workstealing and Nested Parallelism in SMP Systems"
Hierarchical scheduling is the layering of OpenMP schedules for different layers
of the memory hierarchy. One can have multiple layers between the threads and
the global iterations space. The threads will go up the hierarchy to grab
iterations, using possibly a different schedule & chunk for each layer.
[ Global iteration space (0-999) ]
(use static)
[ L1 | L1 | L1 | L1 ]
(use dynamic,1)
[ T0 T1 | T2 T3 | T4 T5 | T6 T7 ]
In the example shown above, there are 8 threads and 4 L1 caches being targeted.
If the topology indicates that there are two threads per core, then two
consecutive threads will share the data of one L1 cache unit. This example
would have the iteration space (0-999) split statically across the four L1
caches (so the first L1 would get (0-249), the second would get (250-499), etc).
Then the threads will use a dynamic,1 schedule to grab iterations from the L1
cache units. There are currently four supported layers: L1, L2, L3, NUMA.
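The two-layer example can be sketched as a static split at the top followed by a dynamic,1 hand-out inside each unit. Function names are hypothetical, and the way a remainder is distributed is an assumption; this is not the code in kmp_dispatch_hier.h.

```cpp
#include <atomic>
#include <utility>
#include <vector>

// Top layer (static): split iterations [0, total) into `units` contiguous,
// inclusive ranges, giving any remainder to the first units.
std::vector<std::pair<int, int>> static_split(int total, int units) {
  std::vector<std::pair<int, int>> ranges;
  int base = total / units;
  int extra = total % units;
  int begin = 0;
  for (int u = 0; u < units; ++u) {
    int len = base + (u < extra ? 1 : 0);
    ranges.emplace_back(begin, begin + len - 1);
    begin += len;
  }
  return ranges;
}

// Bottom layer (dynamic,1): threads sharing one unit call this to take the
// next single iteration; returns -1 once the unit's range is exhausted.
int grab_one(std::atomic<int> &next, int last) {
  int i = next.fetch_add(1, std::memory_order_relaxed);
  return i <= last ? i : -1;
}
```

With 1000 iterations and four units, the first unit receives 0-249 and the second 250-499, matching the example in the text.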
OMP_SCHEDULE can now read a hierarchical schedule with this syntax:
OMP_SCHEDULE='EXPERIMENTAL LAYER,SCHED[,CHUNK][:LAYER,SCHED[,CHUNK]...]:SCHED,CHUNK'
OMP_SCHEDULE can still read the normal SCHED,CHUNK syntax from before.
I've kept most of the hierarchical scheduling logic inside kmp_dispatch_hier.h
to try to keep it separate from the rest of the code.
Differential Revision: https://reviews.llvm.org/D47962
llvm-svn: 336571
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
These are preliminary changes that attempt to use C++11 Atomics in the runtime.
We are expecting better portability with this change across architectures/OSes.
Here is the summary of the changes.
Most variables that need synchronization operations were converted to generic
atomic variables (std::atomic<T>). Variables that are updated with combined CAS
are packed into a single atomic variable, and partial reads/writes are done
through unpacking/packing.
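A minimal sketch of the packing approach, assuming two 32-bit fields share one 64-bit atomic word. The field names and layout are hypothetical and chosen only to show the unpack, modify, repack, CAS cycle.

```cpp
#include <atomic>
#include <cstdint>

// Two logical fields that must always change together atomically.
struct DemoPacked {
  std::uint32_t flag;
  std::uint32_t count;
};

// Pack: count in the high 32 bits, flag in the low 32 bits.
inline std::uint64_t pack(DemoPacked p) {
  return (static_cast<std::uint64_t>(p.count) << 32) | p.flag;
}

inline DemoPacked unpack(std::uint64_t raw) {
  return DemoPacked{static_cast<std::uint32_t>(raw & 0xffffffffu),
                    static_cast<std::uint32_t>(raw >> 32)};
}

// Partial update: bump `count` while preserving `flag`, retrying if another
// thread changed the word in the meantime.
DemoPacked increment_count(std::atomic<std::uint64_t> &word) {
  std::uint64_t old_raw = word.load(std::memory_order_relaxed);
  std::uint64_t new_raw;
  do {
    DemoPacked p = unpack(old_raw);
    p.count += 1;
    new_raw = pack(p);
    // On failure, compare_exchange_weak reloads old_raw with the current value.
  } while (!word.compare_exchange_weak(old_raw, new_raw,
                                       std::memory_order_acq_rel,
                                       std::memory_order_relaxed));
  return unpack(new_raw);
}
```

Keeping both fields in one word means a single CAS covers them, at the cost of an unpack/pack step on every partial access.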
Patch by Hansang Bae
Differential Revision: https://reviews.llvm.org/D47903
llvm-svn: 336563
|
|
|
|
|
|
|
|
| |
Rename ompt_frame_t to omp_frame_t, as defined in the spec.
Differential Revision: https://reviews.llvm.org/D43568
llvm-svn: 333367
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
implicit_task_end callbacks in nested parallel regions did not always give the
correct thread_num, since the inner parallel region may have already been
finalized.
Now, the thread_num is stored at the beginning of the implicit task and
retrieved at the end, whenever necessary.
A testcase was added as well.
Differential Revision: https://reviews.llvm.org/D46260
llvm-svn: 331632
|