path: root/openmp/runtime/src/kmp_affinity.cpp
Commit message (Collapse)AuthorAgeFilesLines
* Fix minor formatting issuesJonathan Peyton2017-06-011-1/+1
| | | | | | | | | | | Some code was restructured to move it under KMP_DEBUG. The rest is formatting changes to fix some things broken by clang-format Patch by Terry Wilmarth Differential Revision: https://reviews.llvm.org/D33744 llvm-svn: 304438
* Fix for KMP_AFFINITY=disabled and KMP_TOPOLOGY_METHOD=hwlocJonathan Peyton2017-05-311-1/+4
| | | | | | | | | | | | With these settings, the create_hwloc_map() method was being called causing an assert(). After some consideration, it was determined that disabling affinity explicitly should just disable hwloc as well. i.e., KMP_AFFINITY overrides KMP_TOPOLOGY_METHOD. This lets the user know that the Hwloc mechanism is being ignored when KMP_AFFINITY=disabled. Differential Revision: https://reviews.llvm.org/D33208 llvm-svn: 304344
* Fix for KMP_AFFINITY=respect with multiple processor groupsJonathan Peyton2017-05-151-3/+2
| | | | | | | | | An assert() was being tripped when KMP_AFFINITY=respect + Multiple Processor Groups. Let __kmp_affinity_create_proc_group_map() function be able to create address2os object which contains a single group by deleting restriction that process affinity mask must span multiple groups. llvm-svn: 303101
* Clang-format and whitespace cleanup of source codeJonathan Peyton2017-05-121-4711/+4265
| | | | | | | | | | | | | This patch contains the clang-format and cleanup of the entire code base. Some of clang-format's changes made the code look worse in places. A best effort was made to resolve the bulk of these problems, but many remain. Most of the problems were mangled line breaks and tabbing of comments. Patch by Terry Wilmarth Differential Revision: https://reviews.llvm.org/D32659 llvm-svn: 302929
* Fix Hwloc API IncompatibilityJonathan Peyton2017-04-251-4/+4
| | | | | | | | | | | Older Hwloc libraries (< 1.10.0) offer neither the HWLOC_OBJ_NUMANODE nor the HWLOC_OBJ_PACKAGE type. Instead, these types are named HWLOC_OBJ_NODE and HWLOC_OBJ_SOCKET. This patch just defines the newer names based on the older names when using an older Hwloc. Differential Revision: https://reviews.llvm.org/D32496 llvm-svn: 301349
* KMP_HW_SUBSET extended with NUMA support when HWLOC enabledAndrey Churbanov2017-04-131-75/+638
| | | | | | Differential Revision: https://reviews.llvm.org/D31600 llvm-svn: 300220
* Fix incorrect initial value of __kmp_affinity_type.Jonathan Peyton2017-03-201-0/+1
| | | | | | | | | | | | | Affinity initialization code expects __kmp_affinity_type to have the value affinity_default by default, but the cleanup code does not properly set the value back to affinity_default. This may introduce some issues when multiple roots are trying to initialize/uninitialize the runtime successively. Patch by Hansang Bae Differential Revision: https://reviews.llvm.org/D31012 llvm-svn: 298313
* Printing OS thread id, when KMP_AFFINITY is set.Jonathan Peyton2017-01-271-6/+4
| | | | | | | | Patch by Vishakha Agrawal Differential Revision: https://reviews.llvm.org/D28873 llvm-svn: 293315
* kmp_affinity: Fix check if specific bit is setJonas Hahnfeld2017-01-121-1/+1
| | | | | | | | | | | | | | | | Clang 4.0 trunk warns: warning: logical not is only applied to the left hand side of this bitwise operator [-Wlogical-not-parentheses] This points to a potential bug if the code really wants to check if the single bit is not set: If for example (buf.edx >> 9) = 2 (has any bit set except the least significant one), 'logical not' will return 0 which stays 0 after the 'bitwise and'. To do this correctly we first need to evaluate the 'bitwise and'. In that case it returns 2 & 1 = 0 which after the 'logical not' evaluates to 1. Differential Revision: https://reviews.llvm.org/D28599 llvm-svn: 291764
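The precedence issue described in the commit above can be reproduced in isolation. The sketch below is illustrative only: the function names are hypothetical and the input is a plain integer standing in for the shifted register value (buf.edx >> 9); it is not the code from kmp_affinity.cpp.

```cpp
#include <cstdint>

// How the unparenthesized expression is parsed: '!' binds tighter than '&',
// so the logical not is applied to the whole value before the bitwise and.
constexpr bool bit0_clear_buggy(std::uint32_t shifted) {
  return ((!shifted) & 1) != 0;
}

// Intended check: isolate the least significant bit first, then negate.
constexpr bool bit0_clear_fixed(std::uint32_t shifted) {
  return !(shifted & 1);
}

// The example from the commit message: the shifted value is 2, so bit 0 is
// clear, yet the buggy form reports that it is not.
static_assert(!bit0_clear_buggy(2), "buggy form misses a clear bit 0");
static_assert(bit0_clear_fixed(2), "fixed form sees that bit 0 is clear");
```

The two forms agree when the shifted value is 0 or 1, which is why a bug of this kind can go unnoticed until other bits in the register are set.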
* Introduce dynamic affinity dispatch capabilitiesJonathan Peyton2016-11-141-56/+42
| | | | | | | | | | | | | | | | | | | | | | | | | This set of changes enables the affinity interface (either the preexisting native operating system interface or HWLOC) to be dynamically set at runtime initialization. The point of this change is that we were seeing performance degradations when using HWLOC. This allows the user to use the old affinity mechanisms, which on large machines (>64 cores) makes a large difference in initialization time. These changes mostly move affinity code under a small class hierarchy: KMPAffinity { class Mask {} }; KMPNativeAffinity : public KMPAffinity { class Mask : public KMPAffinity::Mask }; KMPHwlocAffinity { class Mask : public KMPAffinity::Mask }. Since all interface functions (for both affinity and the mask implementation) are virtual, the implementation can be chosen at runtime initialization. Differential Revision: https://reviews.llvm.org/D26356 llvm-svn: 286890
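A rough sketch of the runtime-dispatch idea from the commit above, under stated assumptions: every class and function name here (AffinityBackend, NativeBackend, HwlocLikeBackend, pick_backend) is hypothetical and stands in for the KMPAffinity hierarchy. The masks are toy containers and do not call the operating system or hwloc.

```cpp
#include <bitset>
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for the abstract affinity interface.
class AffinityBackend {
public:
  // Abstract mask; each backend supplies its own representation.
  class Mask {
  public:
    virtual ~Mask() = default;
    virtual void set(int cpu) = 0;
    virtual bool is_set(int cpu) const = 0;
  };
  virtual ~AffinityBackend() = default;
  virtual std::unique_ptr<Mask> allocate_mask() const = 0;
  virtual std::string name() const = 0;
};

// Fixed-width mask, loosely modelled on a native OS bitmask.
class NativeBackend : public AffinityBackend {
  class NativeMask : public Mask {
    std::bitset<64> bits_;
  public:
    void set(int cpu) override {
      if (cpu >= 0 && cpu < 64) bits_.set(static_cast<std::size_t>(cpu));
    }
    bool is_set(int cpu) const override {
      return cpu >= 0 && cpu < 64 && bits_.test(static_cast<std::size_t>(cpu));
    }
  };
public:
  std::unique_ptr<Mask> allocate_mask() const override {
    return std::make_unique<NativeMask>();
  }
  std::string name() const override { return "native"; }
};

// Growable mask, loosely modelled on a dynamically sized bitmap.
class HwlocLikeBackend : public AffinityBackend {
  class GrowableMask : public Mask {
    std::vector<bool> bits_;
  public:
    void set(int cpu) override {
      if (cpu < 0) return;
      const auto index = static_cast<std::size_t>(cpu);
      if (index >= bits_.size()) bits_.resize(index + 1, false);
      bits_[index] = true;
    }
    bool is_set(int cpu) const override {
      return cpu >= 0 && static_cast<std::size_t>(cpu) < bits_.size() &&
             bits_[static_cast<std::size_t>(cpu)];
    }
  };
public:
  std::unique_ptr<Mask> allocate_mask() const override {
    return std::make_unique<GrowableMask>();
  }
  std::string name() const override { return "hwloc-like"; }
};

// The backend is chosen once, at initialization; later code only sees the
// abstract interface and never branches on the backend again.
std::unique_ptr<AffinityBackend> pick_backend(bool use_hwloc) {
  if (use_hwloc) return std::make_unique<HwlocLikeBackend>();
  return std::make_unique<NativeBackend>();
}
```

A caller would do something like `auto backend = pick_backend(false); auto mask = backend->allocate_mask(); mask->set(3);`. The cost of the design is a virtual call per mask operation, traded for the ability to choose the implementation at startup.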
* Fix bitmask upper bounds checkJonathan Peyton2016-09-121-15/+16
| | | | | | | | | | | | Rather than checking KMP_CPU_SETSIZE, which doesn't exist when using Hwloc, we use the get_max_proc() function which can vary based on the operating system. For example on Windows with multiple processor groups, it might be the case that the highest bit possible in the bitmask is not equal to the number of hardware threads on the machine but something higher than that. Differential Revision: https://reviews.llvm.org/D24206 llvm-svn: 281245
* Move function into cpp file under KMP_AFFINITY_SUPPORTED guard.Jonathan Peyton2016-09-021-0/+25
| | | | | | | | | | | | When affinity isn't supported, __kmp_affinity_compact doesn't exist. The problem is that in kmp_affinity.h there is a function which uses it without the proper KMP_AFFINITY_SUPPORTED guard around it. The compiler was smart enough to ignore it and the function __kmp_affinity_cmp_Address_child_num which relies on it, but I think it is cleaner to have it under the proper guard. Since the function is only used in the kmp_affinity.cpp file and there aren't any plans to have it elsewhere, I have moved it there. llvm-svn: 280542
* Replace a bad instance of __kmp_free() with KMP_CPU_FREE_ARRAY() macro.Jonathan Peyton2016-09-021-1/+1
| | | | llvm-svn: 280530
* Fixed x2APIC discovery for 256-processor architectures.Andrey Churbanov2016-08-051-3/+3
| | | | | | | | Mask for value read from ebx register returned by CPUID expanded to 0xFFFF. Differential Revision: https://reviews.llvm.org/D23203 llvm-svn: 277825
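To see why the mask width matters, here is a small sketch. It assumes the earlier mask was 8 bits wide (0xFF); the commit message only states the new value, 0xFFFF, so the old width is an assumption. The function names are hypothetical, and the input is a plain integer rather than a real CPUID result.

```cpp
#include <cstdint>

// Assumed earlier behaviour: keep only the low 8 bits of the ebx value.
constexpr std::uint32_t logical_procs_8bit(std::uint32_t ebx) {
  return ebx & 0xFFu;
}

// Behaviour after the fix: keep the low 16 bits of the ebx value.
constexpr std::uint32_t logical_procs_16bit(std::uint32_t ebx) {
  return ebx & 0xFFFFu;
}

// A count of 256 needs nine bits, so an 8-bit mask wraps it to zero.
static_assert(logical_procs_8bit(256) == 0, "8-bit mask loses the count");
static_assert(logical_procs_16bit(256) == 256, "16-bit mask keeps the count");
// Smaller counts are unaffected, which hides the problem on smaller machines.
static_assert(logical_procs_8bit(64) == logical_procs_16bit(64), "same below 256");
```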
* Make balanced affinity work on AArch64.Paul Osmialowski2016-07-291-57/+141
| | | | | | | | | | | This patch enables balanced affinity on machines that do not have hardware threads and have cores clustered into packages. In fact, the balancing algorithm could be generalized for any arrangement with at least two levels of hierarchy (depth > 1). Differential Revision: https://reviews.llvm.org/D22365 llvm-svn: 277212
* D22136: Memory leaks fixed by adding missed __kmp_free() callsAndrey Churbanov2016-07-081-0/+2
| | | | llvm-svn: 274850
* Improvements to process affinity mask settingJonathan Peyton2016-06-211-51/+102
| | | | | | | | | | | | A couple improvements: 1) Add ability to limit fullMask size when KMP_HW_SUBSET limits resources. 2) Make KMP_HW_SUBSET work for affinity_none, and only limit fullMask in this case. Patch by Andrey Churbanov. Differential Revision: http://reviews.llvm.org/D21528 llvm-svn: 273278
* Change hwloc discovery algorithm to print topology only for accessible resourcesJonathan Peyton2016-06-161-17/+29
| | | | | | | | | | | | | Change hwloc discovery algorithm to print topology for only accessible resources, and report uniformity correspondingly, similar to what other topology discovery algorithms do. Fixes a minor inconsistency between the total topology reported and the resources used for thread binding when hwloc is used. Patch by Andrey Churbanov. Differential Revision: http://reviews.llvm.org/D21389 llvm-svn: 272952
* Fixed missing memory cleanup in __kmp_affinity_create_hwloc_map()Jonathan Peyton2016-06-161-0/+2
| | | | | | | | | | | Cleanup: fixed missing memory cleanup in a couple of corner cases. Fixes a possible memory leak in those cases. Patch by Andrey Churbanov Differential Revision: http://reviews.llvm.org/D21355 llvm-svn: 272946
* Deprecate KMP_PLACE_THREADS and rename as KMP_HW_SUBSETJonathan Peyton2016-06-161-4/+4
| | | | | | | | | | | | | Deprecate KMP_PLACE_THREADS and rename it to KMP_HW_SUBSET due to confusion about its purpose and function among users. KMP_HW_SUBSET is an environment variable which allows users to easily pick a subset of the hardware topology to use. e.g., KMP_HW_SUBSET=30c,2t means use 30 cores, 2 threads per core. Patch by Andrey Churbanov Differential Revision: http://reviews.llvm.org/D21340 llvm-svn: 272937
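As an illustration of the "30c,2t" form quoted above, the following is a deliberately simplified parser. It is not the runtime's parser: the HwSubset struct and parse_hw_subset function are hypothetical, and only the core ('c') and thread ('t') designators with comma or space delimiters are handled. Sockets, offsets, and the other delimiters accepted by the real variable are left out.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical result type: zero means "not specified".
struct HwSubset {
  int cores = 0;
  int threads_per_core = 0;
};

// Parses strings such as "30c,2t". Returns false on anything it does not
// understand instead of guessing.
bool parse_hw_subset(const std::string &spec, HwSubset &out) {
  HwSubset result;
  std::size_t i = 0;
  while (i < spec.size()) {
    const unsigned char ch = static_cast<unsigned char>(spec[i]);
    if (ch == ',' || ch == ' ') { // skip delimiters
      ++i;
      continue;
    }
    if (!std::isdigit(ch)) return false; // each item must start with a number
    int value = 0;
    while (i < spec.size() &&
           std::isdigit(static_cast<unsigned char>(spec[i]))) {
      value = value * 10 + (spec[i] - '0');
      ++i;
    }
    if (i >= spec.size()) return false; // number without a designator
    const int unit = std::tolower(static_cast<unsigned char>(spec[i]));
    ++i;
    if (unit == 'c') {
      result.cores = value;
    } else if (unit == 't') {
      result.threads_per_core = value;
    } else {
      return false; // designator outside this sketch's scope
    }
  }
  out = result;
  return true;
}
```

For the example in the commit message, `parse_hw_subset("30c,2t", subset)` succeeds and leaves `subset.cores` at 30 and `subset.threads_per_core` at 2. Very long digit runs would overflow `int`; a production parser would need to guard against that.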
* Affinity mask processing improvementsJonathan Peyton2016-06-131-49/+44
| | | | | | | | | | | | Remove static specifier from var fullMask and remove kmp_get_fullMask() routine. When iterating through procs in a mask, always check if proc is in fullMask (this check was missing in a few places). Patch by Brian Bliss. Differential Revision: http://reviews.llvm.org/D21300 llvm-svn: 272589
* Hwloc refactoring patchJonathan Peyton2016-06-131-106/+122
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | These changes remove the hwloc_topology_ignore_type function which doesn't exist in the hwloc 2.0 API. In the existing code, the topology extracted from hwloc has the cache levels stripped out and then assumes the final stripped topology follows the typical three-level topology: packages -> cores -> HW threads. But the code is doing unclean manipulations to determine at what level those resources are located and also assumes too much about what hwloc is detecting (there could be intermediate levels in between socket and core for instance). This new way of extracting the topology doesn't strip out any hardware objects that hwloc detects. It does not assume the three level topology, and instead searches for the relevant three levels within the topology for each bit of information using hwloc interface functions. i.e., the three level topology subset that our affinity code is interested in is extracted from the hwloc topology tree directly. For example, the new __kmp_hwloc_get_nobjs_under_obj function gives the user the number of cores under a socket reliably without worrying if there are unexpected objects between the socket object and core object in the hwloc topology structure. Also, now that all topology information is kept, there are also possibilities of using the caches/numa nodes to determine more sophisticated affinity settings in the future. There is also some cleanup code added for the destruction of the __kmp_hwloc_topology object. Differential Revision: http://reviews.llvm.org/D21195 llvm-svn: 272565
* Remove architecture dependent Hwloc DEBUG sectionJonathan Peyton2016-04-251-30/+0
| | | | | | | This debug section's functionality can be replicated using the environment variable KMP_TOPOLOGY_METHOD with different values and KMP_AFFINITY=verbose llvm-svn: 267472
* Fix buffer problem with printing long Hwloc affinity maskJonathan Peyton2016-04-251-1/+1
| | | | | | | | This change has the hwloc_bitmap_list_snprintf() function use the entire buffer to print the mask. There is no need to shorten the buffer length by 7. It only needs to be shortened by one byte. llvm-svn: 267470
* New API for restoring current thread's affinity to init affinity of applicationJonathan Peyton2016-01-121-0/+38
| | | | | | | | | | | | | | | This new API, int kmp_set_thread_affinity_mask_initial(), is available for use by other parallel runtime libraries inside a possibly OpenMP-registered thread. This entry point restores the current thread's affinity mask to the affinity mask of the application when it first began. If -1 is returned, it can be assumed that either the thread hasn't called affinity initialization or that the thread isn't registered with the OpenMP library. If 0 is returned, then the call was successful. Any return value greater than zero indicates an error occurred when setting affinity. Differential Revision: http://reviews.llvm.org/D15867 llvm-svn: 257489
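A caller might fold the three documented return cases into an enum before acting on them. The sketch below only classifies an integer. It does not call kmp_set_thread_affinity_mask_initial() itself, so it builds without the OpenMP runtime. The enum and function names are hypothetical, and the handling of negative values other than -1 is an assumption, since the commit message does not describe them.

```cpp
// Hypothetical classification of the documented return values.
enum class RestoreStatus {
  Restored,              // 0: affinity mask restored
  NotInitializedOrAlien, // -1: no affinity init, or thread not registered
  SetAffinityFailed,     // > 0: error while setting affinity
  Unspecified            // other negative values: not covered by the text
};

constexpr RestoreStatus classify_restore_result(int rc) {
  return rc == 0    ? RestoreStatus::Restored
         : rc == -1 ? RestoreStatus::NotInitializedOrAlien
         : rc > 0   ? RestoreStatus::SetAffinityFailed
                    : RestoreStatus::Unspecified;
}

static_assert(classify_restore_result(0) == RestoreStatus::Restored,
              "zero means success");
static_assert(classify_restore_result(-1) ==
                  RestoreStatus::NotInitializedOrAlien,
              "-1 means the thread is not set up for affinity");
static_assert(classify_restore_result(22) == RestoreStatus::SetAffinityFailed,
              "positive values are errors");
```

In real use the argument would be the value returned by the runtime entry point, for example `classify_restore_result(kmp_set_thread_affinity_mask_initial())`, assuming the program is linked against libomp and the function is declared.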
* Adding Hwloc library option for affinity mechanismJonathan Peyton2015-11-301-122/+517
| | | | | | | | | | | | | | | | | | | These changes allow libhwloc to be used as the topology discovery/affinity mechanism for libomp. It is supported on Unices. The code additions: * Canonicalize KMP_CPU_* interface macros so bitmask operations are implementation independent and work with both hwloc bitmaps and libomp bitmaps. So there are new KMP_CPU_ALLOC_* and KMP_CPU_ITERATE() macros and the like. These are all in kmp.h and appropriately placed. * Hwloc topology discovery code in kmp_affinity.cpp. This uses the hwloc interface to create a libomp address2os object which the rest of libomp knows how to handle already. * To build, use -DLIBOMP_USE_HWLOC=on and -DLIBOMP_HWLOC_INSTALL_DIR=/path/to/install/dir [default /usr/local]. If CMake can't find the library or hwloc.h, then it will tell you and exit. Differential Revision: http://reviews.llvm.org/D13991 llvm-svn: 254320
* Improvements to machine_hierarchy code for re-sizingJonathan Peyton2015-11-091-3/+4
| | | | | | | | | | | | | These changes include: 1) Machine hierarchy now uses the base_num_threads field to indicate the maximum number of threads the current hierarchy can handle without a resize. 2) In __kmp_get_hierarchy, we need to get depth after any potential resize is done. 3) Cleanup of hierarchy resize code to support 1 above. Differential Revision: http://reviews.llvm.org/D14455 llvm-svn: 252475
* Fix OMP_PLACES negation operator parsing (!place)Jonathan Peyton2015-10-191-1/+1
| | | | | | | Just moved the *scan++ line up before the recursive call. Otherwise, infinite recursion occurs and leads to a segmentation fault. llvm-svn: 250729
* Added sockets to the syntax of KMP_PLACE_THREADS environment variable.Jonathan Peyton2015-10-081-22/+37
| | | | | | | | | | | | | | | | | | | Added (optional) sockets to the syntax of the KMP_PLACE_THREADS environment variable. Some limitations: * The number of sockets and then optional offset should be specified first (before other parameters). * The letter designation is mandatory for sockets and then for other parameters. * If number of cores is specified first, then the number of sockets is defaulted to all sockets on the machine; also, the old syntax is partially supported if sockets are skipped. * If number of threads per core is specified first, then the number of sockets and cores per socket are defaulted to all sockets and all cores per socket respectively. * The number of cores per socket cannot be specified before sockets or after threads per core. * The number of threads per core can be specified before or after core-offset (old syntax required it to be before core-offset); * Parameters delimiter can be: empty, comma, lower-case x; * Spaces are allowed around numbers, around letters, around delimiter. Approximate shorthand specification: KMP_PLACE_THREADS="[num_sockets(S|s)[[delim]offset(O|o)][delim]][num_cores_per_socket(C|c)[[delim]offset(O|o)][delim]][num_threads_per_core(T|t)]" Differential Revision: http://reviews.llvm.org/D13175 llvm-svn: 249708
* Fix memory corruption in Windows debug libraryJonathan Peyton2015-09-251-5/+5
| | | | | | | | This patch adjusts the buffer size when reducing the buffer used for printing. This solves the memory corruption in Windows debug library, and potential memory corruption in other builds. llvm-svn: 248588
* Fix depth field bug and resize() function in hierarchical barrierJonathan Peyton2015-09-101-6/+3
| | | | | | | | | | | This is a follow up to the hierarchy cleanup patch. Added some clarifying comments to hierarchy_info. Fixed a bug with the depth field not being updated cleanly during a resize. Fixed resize to first check capacity as determined by maxLevels before actually doing the full resize. Differential Revision: http://reviews.llvm.org/D12562 llvm-svn: 247333
* Cleanup of affinity hierarchy code.Jonathan Peyton2015-09-101-456/+28
| | | | | | | | | | | | Some of this is improvement to code suggested by Hal Finkel. Four changes here: 1.Cleanup of hierarchy code to handle all hierarchy cases whether affinity is available or not 2.Separated this and other classes and common functions out to a header file 3.Added a destructor-like fini function for the hierarchy (and call in __kmp_cleanup) 4.Remove some redundant code that is hopefully no longer needed Differential Revision: http://reviews.llvm.org/D12449 llvm-svn: 247326
* Fix machine topology pruning.Jonathan Peyton2015-08-251-17/+22
| | | | | | | | | | | This patch fixes a bug when eliminating layers in the machine topology (namely cores, and threads). Before this patch, if a user specifies using only one thread per socket, then affinity is not set properly due to bad topology pruning. Differential Revision: http://reviews.llvm.org/D11158 llvm-svn: 245966
* Allow machine hierarchy expansionJonathan Peyton2015-06-221-10/+78
| | | | | | | | | This fix allows the machine hierarchy to be expanded in case it needs to handle more threads. It adds a resize function to accomplish this. Differential Revision: http://reviews.llvm.org/D9900 llvm-svn: 240292
* Re-enable Visual Studio Builds.Jonathan Peyton2015-06-221-3/+3
| | | | | | | | | I tried to compile with Visual Studio using CMake and found these two sections of code causing problems for Visual Studio. The first one removes the use of variable length arrays by instead using KMP_ALLOCA(). The second part eliminates a redundant cpuid assembly call by using the already existing __kmp_x86_cpuid() call instead. llvm-svn: 240290
* Apply name change to src/* files.Jonathan Peyton2015-06-011-1/+1
| | | | | | | | | These changes are mostly in comments, but there are a few that aren't. Change libiomp5 => libomp everywhere. One internal function name is changed in kmp_gsupport.c, and in kmp_i18n.c, the static char[] variable 'name' is changed to "libomp". llvm-svn: 238712
* Fix comment about balanced affinityJonathan Peyton2015-05-271-1/+1
| | | | | | | | A while back, Hal mentioned fixing a comment concerning balanced affinity. http://lists.cs.uiuc.edu/pipermail/openmp-dev/2014-December/000358.html I forgot about fixing it until now, but now is better than never. llvm-svn: 238378
* The generation of the hierarchy used by hierarchical barrier improved in how ↵Andrey Churbanov2015-04-131-43/+78
| | | | | | the generation reacts to affinity set to none, or disabled, or no affinity available, or oversubscription. Some cleanup actions based on review comments to follow: need to use meaningful names instead of digital constants, e.g. use enumerators. llvm-svn: 234775
* Replace some unsafe API calls with safe alternatives on Windows, prepare ↵Andrey Churbanov2015-04-021-19/+19
| | | | | | code for similar actions on other platforms - wrap unsafe API calls into macros. llvm-svn: 233915
* Eliminated the write to depth field of the machine_hierarchy data structure ↵Andrey Churbanov2015-04-021-9/+7
| | | | | | in __kmp_get_hierarchy(), thus fixing race condition. Now local variable used by each thread. llvm-svn: 233914
* issuing of incorrect warning fixedAndrey Churbanov2015-03-101-4/+4
| | | | llvm-svn: 231779
* cleanup: usages of mask size wrapped into macrosAndrey Churbanov2015-03-101-2/+2
| | | | llvm-svn: 231775
* changed unsigned types to signed - caused by comments of Hal Finkel on one ↵Andrey Churbanov2015-03-101-3/+3
| | | | | | of earlier patches llvm-svn: 231773
* minor change: comment improvedAndrey Churbanov2015-03-051-1/+1
| | | | llvm-svn: 231381
* Fixed memory corruption problem.Andrey Churbanov2015-02-101-0/+4
| | | | llvm-svn: 228736
* enable environment variable KMP_PLACE_THREADS also for non-MIC architecturesAndrey Churbanov2015-01-291-10/+3
| | | | llvm-svn: 227467
* fixing typo in error messageAndrey Churbanov2015-01-291-1/+1
| | | | llvm-svn: 227451
* Comments only: removing the Revision and Date svn variables from the top of ↵Andrey Churbanov2015-01-271-2/+0
| | | | | | all the source files. llvm-svn: 227207
* Enables a cpuid leaf 4 check for non-MIC x86 architectures.Andrey Churbanov2015-01-271-21/+14
| | | | llvm-svn: 227204
* Removes some unused variables (__kmp_ht_*) and changes __kmp_ncores and ↵Andrey Churbanov2015-01-271-17/+9
| | | | | | __kmp_nThreadsPerCore to static globals within kmp_affinity.cpp. llvm-svn: 227201