summaryrefslogtreecommitdiffstats
path: root/clang/lib/CodeGen
Commit message (Collapse)AuthorAgeFilesLines
...
* [OpenMP] Add function attribute for triggering data sharing.Gheorghe-Teodor Bercea2017-12-121-0/+2
| | | | | | | | | | | | | | | | Summary: The backend should only emit data sharing code for the cases where it is needed. A new function attribute is used by Clang to enable data sharing only for the cases where OpenMP semantics require it and there are variables that need to be shared. Reviewers: hfinkel, Hahnfeld, ABataev, carlo.bertolli, caomhin Reviewed By: ABataev Subscribers: cfe-commits, jholewinski Differential Revision: https://reviews.llvm.org/D41123 llvm-svn: 320527
* [Driver][CodeGen] Add -mprefer-vector-width driver option and attribute ↵Craig Topper2017-12-111-0/+5
| | | | | | | | | | | | | | | | during CodeGen. This adds a new command line option -mprefer-vector-width to specify a preferred vector width for the vectorizers. Valid values are 'none' and unsigned integers. The driver will check that it meets those constraints. Specific supported integers will be managed by the targets in the backend. Clang will take the value and add it as a new function attribute during CodeGen. This represents the alternate direction proposed by Sanjay in this RFC: http://lists.llvm.org/pipermail/llvm-dev/2017-November/118734.html The syntax here matches gcc, though gcc treats it as an x86 specific command line argument. gcc only allows values of 128, 256, and 512. I'm not having clang check any values. Differential Revision: https://reviews.llvm.org/D40230 llvm-svn: 320419
* Hardware-assisted AddressSanitizer (clang part).Evgeniy Stepanov2017-12-095-4/+25
| | | | | | | | | | | | | | Summary: Driver, frontend and LLVM codegen for HWASan. A clone of ASan, basically. Reviewers: kcc, pcc, alekseyshl Subscribers: srhines, javed.absar, cfe-commits Differential Revision: https://reviews.llvm.org/D40936 llvm-svn: 320232
* [CodeGen][X86] Fix handling of __fp16 vectors.Akira Hatanaka2017-12-093-11/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This commit fixes a bug in IRGen where it generates completely broken code for __fp16 vectors on X86. For example when the following code is compiled: half4 hv0, hv1, hv2; // these are vectors of __fp16. void foo221() { hv0 = hv1 + hv2; } clang generates the following IR, in which two i16 vectors are added: @hv1 = common global <4 x i16> zeroinitializer, align 8 @hv2 = common global <4 x i16> zeroinitializer, align 8 @hv0 = common global <4 x i16> zeroinitializer, align 8 define void @foo221() { %0 = load <4 x i16>, <4 x i16>* @hv1, align 8 %1 = load <4 x i16>, <4 x i16>* @hv2, align 8 %add = add <4 x i16> %0, %1 store <4 x i16> %add, <4 x i16>* @hv0, align 8 ret void } To fix the bug, this commit uses the code committed in r314056, which modified clang to promote and truncate __fp16 vectors to and from float vectors in the AST. It also fixes another IRGen bug where a short value is assigned to an __fp16 variable without any integer-to-floating-point conversion, as shown in the following example: __fp16 a; short b; void foo1() { a = b; } @b = common global i16 0, align 2 @a = common global i16 0, align 2 define void @foo1() #0 { %0 = load i16, i16* @b, align 2 store i16 %0, i16* @a, align 2 ret void } rdar://problem/20625184 Differential Revision: https://reviews.llvm.org/D40112 llvm-svn: 320215
* [OPENMP] Simplify codegen for loop iteration variables in loop preamble.Alexey Bataev2017-12-081-1/+6
| | | | | | | Initial patch could cause trouble in the optimized code because of the incorrectly generated lifetime intrinsics. llvm-svn: 320191
* [ubsan] array-bounds: Ignore params with constant sizeVedant Kumar2017-12-081-8/+0
| | | | | | | | | | This is a follow-up to r320128. Eli pointed out that there is some gray area in the language standard about whether the constant size is exact, or a lower bound. https://reviews.llvm.org/D40940 llvm-svn: 320185
* [OPENMP] Initial codegen for `target teams distribute` directive.Alexey Bataev2017-12-083-14/+58
| | | | | | Host + default devices codegen for `target teams distribute` directive. llvm-svn: 320149
* [Blocks] Inherit sanitizer options from parent declVedant Kumar2017-12-081-1/+3
| | | | | | | | | | | | There is no way to apply sanitizer suppressions to ObjC blocks. A reasonable default is to have blocks inherit their parent's sanitizer options. rdar://32769634 Differential Revision: https://reviews.llvm.org/D40668 llvm-svn: 320132
* [ubsan] Use pass_object_size info in bounds checksVedant Kumar2017-12-082-0/+59
| | | | | | | | | Teach UBSan's bounds check to opportunistically use pass_object_size information to check array accesses. rdar://33272922 llvm-svn: 320128
* [driver] Set the 'simulator' environment for Darwin when compiling forAlex Lorenz2017-12-071-4/+1
| | | | | | | | | | iOS/tvOS/watchOS simulator rdar://35135215 Differential Revision: https://reviews.llvm.org/D40682 llvm-svn: 320073
* CodeGen: Fix invalid bitcasts for memcpyYaxun Liu2017-12-071-4/+4
| | | | | | | | | | | | CreateCoercedLoad/CreateCoercedStore assumes pointer argument of memcpy is in addr space 0, which is not correct and causes invalid bitcasts for triple amdgcn---amdgiz. It is fixed by using alloca addr space instead. Differential Revision: https://reviews.llvm.org/D40806 llvm-svn: 320000
* Fix PR35542: Correct adjusting of private reduction variableJonas Hahnfeld2017-12-061-3/+6
| | | | | | | | | | | | The adjustment is calculated with CreatePtrDiff() which returns the difference in (base) elements. This is passed to CreateGEP() so make sure that the GEP base has the correct pointer type: It needs to be a pointer to the base type, not a pointer to a constant sized array. Differential Revision: https://reviews.llvm.org/D40911 llvm-svn: 319931
* [OPENMP] Initial codegen for `teams distribute simd` directive.Alexey Bataev2017-12-061-13/+23
| | | | | | Host + default devices codegen for `teams distribute simd` directive. llvm-svn: 319896
* [OpenCL] Fix layering violation by getOpenCLTypeAddrSpaceSven van Haastregt2017-12-061-3/+3
| | | | | | | | | | | | Commit 7ac28eb0a5 / r310911 ("[OpenCL] Allow targets to select address space per type", 2017-08-15) made Basic depend on AST, introducing a circular dependency. Break this dependency by adding the OpenCLTypeKind enum in Basic and map from AST types to this enum in ASTContext. Differential Revision: https://reviews.llvm.org/D40838 llvm-svn: 319883
* [OPENMP] Fix PR35486: crash when collapsing loops with dependent iteration ↵Alexey Bataev2017-12-041-0/+3
| | | | | | | | | | spaces. Though it is incorrect from point of view of OpenMP standard to have dependent iteration space in OpenMP loops, compiler should not crash. Patch fixes this problem. llvm-svn: 319700
* [OpenMP] Initial implementation of code generation for pragma 'teams ↵Carlo Bertolli2017-12-041-12/+22
| | | | | | | | | | distribute parallel for simd' on host https://reviews.llvm.org/D40795 This includes regression tests for all associated clauses. llvm-svn: 319696
* [OPENMP] Codegen for `distribute simd` directive.Alexey Bataev2017-12-041-21/+54
| | | | | | Initial codegen support for `distribute simd` directive. llvm-svn: 319661
* Revert "[CodeGen] Add initial support for union members in TBAA"Hal Finkel2017-12-034-67/+32
| | | | | | | | This reverts commit r319413. See PR35503. We can't use "union member" as the access type here like this. llvm-svn: 319629
* [CodeGen] fix mapping from fmod calls to frem instructionSanjay Patel2017-12-021-14/+18
| | | | | | Similar to D40044 and discussed in D40594. llvm-svn: 319619
* [CodeGen] remove stale comment; NFCSanjay Patel2017-12-021-1/+1
| | | | | | | The libm functions with LLVM intrinsic twins were moved above this blob with: https://reviews.llvm.org/rL319593 llvm-svn: 319618
* [CodeGen] convert math libcalls/builtins to equivalent LLVM intrinsicsSanjay Patel2017-12-011-93/+173
| | | | | | | | | | | | | | | | There are 20 LLVM math intrinsics that correspond to mathlib calls according to the LangRef: http://llvm.org/docs/LangRef.html#standard-c-library-intrinsics We were only converting 3 mathlib calls (sqrt, fma, pow) and 12 builtin calls (ceil, copysign, fabs, floor, fma, fmax, fmin, nearbyint, pow, rint, round, trunc) to their intrinsic-equivalents. This patch pulls the transforms together and handles all 20 cases. The switch is guarded by a check for const-ness to make sure we're not doing the transform if errno could possibly be set by the libcall or builtin. Differential Revision: https://reviews.llvm.org/D40044 llvm-svn: 319593
* [OPENMP] Emit `__tgt_target_teams` for all teams directives.Alexey Bataev2017-12-011-26/+26
| | | | | | | | Previously we emitted `__tgt_target_teams` only for standalone teams directives. This patch allows emit this function for all teams-based directives. llvm-svn: 319585
* Mark all library options as hidden.Zachary Turner2017-12-012-4/+5
| | | | | | | | | | | | | | | | | These command line options are not intended for public use, and often don't even make sense in the context of a particular tool anyway. About 90% of them are already hidden, but when people add new options they forget to hide them, so if you were to make a brand new tool today, link against one of LLVM's libraries, and run tool -help you would get a bunch of junk that doesn't make sense for the tool you're writing. This patch hides these options. The real solution is to not have libraries defining command line options, but that's a much larger effort and not something I'm prepared to take on. Differential Revision: https://reviews.llvm.org/D40674 llvm-svn: 319505
* [CodeGen] Add initial support for union members in TBAAIvan A. Kosarev2017-11-304-32/+67
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The basic idea behind this patch is that since in strict aliasing mode all accesses to union members require their outermost enclosing union objects to be specified explicitly, then for a couple given accesses to union members of the form p->a.b.c... q->x.y.z... it is known they can only alias if both p and q point to the same union type and offset ranges of members a.b.c... and x.y.z... overlap. Note that the actual types of the members do not matter. Specifically, in this patch we do the following: * Make unions to be valid TBAA base access types. This enables generation of TBAA type descriptors for unions. * Encode union types as structures with a single member of a special "union member" type. Currently we do not encode information about sizes of types, but conceptually such union members are considered to be of the size of the whole union. * Encode accesses to direct and indirect union members, including member arrays, as accesses to these special members. All accesses to members of a union thus get the same offset, which is the offset of the union they are part of. This means the existing LLVM TBAA machinery is able to handle such accesses with no changes. While this is already an improvement comparing to the current situation, that is, representing all union accesses as may-alias ones, there are further changes planned to complete the support for unions. One of them is storing information about access sizes so we can distinct accesses to non-overlapping union members, including accesses to different elements of member arrays. Another change is encoding type sizes in order to make it possible to compute offsets within constant-indexed array elements. These enhancements will be addressed with separate patches. Differential Revision: https://reviews.llvm.org/D39455 llvm-svn: 319413
* [XRay][clang] Introduce -fxray-always-emit-customeventsDean Michael Berris2017-11-303-3/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: The -fxray-always-emit-customevents flag instructs clang to always emit the LLVM IR for calls to the `__xray_customevent(...)` built-in function. The default behaviour currently respects whether the function has an `[[clang::xray_never_instrument]]` attribute, and thus not lower the appropriate IR code for the custom event built-in. This change allows users calling through to the `__xray_customevent(...)` built-in to always see those calls lowered to the corresponding LLVM IR to lay down instrumentation points for these custom event calls. Using this flag enables us to emit even just the user-provided custom events even while never instrumenting the start/end of the function where they appear. This is useful in cases where "phase markers" using __xray_customevent(...) can have very few instructions, must never be instrumented when entered/exited. Reviewers: rnk, dblaikie, kpw Subscribers: cfe-commits Differential Revision: https://reviews.llvm.org/D40601 llvm-svn: 319388
* [Coverage] Emit gap areas in braces-optional statements (PR35387)Vedant Kumar2017-11-291-9/+61
| | | | | | | | | | | | Emit a gap area starting after the r-paren location and ending at the start of the body for the braces-optional statements (for, for-each, while, etc). The count for the gap area equal to the body's count. This extends the fix in r317758. Fixes PR35387, rdar://35570345 Testing: stage2 coverage-enabled build of clang, check-clang llvm-svn: 319373
* [EH] Use __CxxFrameHandler3 for C++ EH in MS environmentsReid Kleckner2017-11-291-4/+1
| | | | | | | | Fixes regression introduced by r319297. MSVC environments still use SEH unwind opcodes but they should use the Microsoft C++ EH personality, not the mingw one. llvm-svn: 319363
* [OPENMP] General improvement of handling of `teams distribute`Alexey Bataev2017-11-291-1/+1
| | | | | | | | directive, NFC. Some general improvements in support of `teams distribute` directive. llvm-svn: 319320
* Toolchain: Normalize dwarf, sjlj and seh ehMartell Malone2017-11-292-9/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is a re-apply of r319294. adds -fseh-exceptions and -fdwarf-exceptions flags clang will check if the user has specified an exception model flag, in the absense of specifying the exception model clang will then check the driver default and append the model flag for that target to cc1 -fno-exceptions has a higher priority then specifying the model move __SEH__ macro definitions out of Targets into InitPreprocessor behind the -fseh-exceptions flag move __ARM_DWARF_EH__ macrodefinitions out of verious targets and into InitPreprocessor behind the -fdwarf-exceptions flag and arm|thumb check remove unused USESEHExceptions from the MinGW Driver fold USESjLjExceptions into a new GetExceptionModel function that gives the toolchain classes more flexibility with eh models Reviewers: rnk, mstorsjo Differential Revision: https://reviews.llvm.org/D39673 llvm-svn: 319297
* Revert "Toolchain: Normalize dwarf, sjlj and seh eh"Martell Malone2017-11-292-11/+9
| | | | | | | | This reverts rL319294. The windows sanitizer does not like seh on x86. Will re apply with None type for x86 llvm-svn: 319295
* Toolchain: Normalize dwarf, sjlj and seh ehMartell Malone2017-11-292-9/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | adds -fseh-exceptions and -fdwarf-exceptions flags clang will check if the user has specified an exception model flag, in the absense of specifying the exception model clang will then check the driver default and append the model flag for that target to cc1 clang cc1 assumes dwarf is the default if none is passed and -fno-exceptions has a higher priority then specifying the model move __SEH__ macro definitions out of Targets into InitPreprocessor behind the -fseh-exceptions flag move __ARM_DWARF_EH__ macrodefinitions out of verious targets and into InitPreprocessor behind the -fdwarf-exceptions flag and arm|thumb check remove unused USESEHExceptions from the MinGW Driver fold USESjLjExceptions into a new GetExceptionModel function that gives the toolchain classes more flexibility with eh models Reviewers: rnk, mstorsjo Differential Revision: https://reviews.llvm.org/D39673 llvm-svn: 319294
* Reland "Fix vtable not receiving hidden visibility when using push(visibility)"Jake Ehrlich2017-11-296-19/+21
| | | | | | | | | | I had to reland this change in order to make the test work on windows This change should resolve https://bugs.llvm.org/show_bug.cgi?id=35022 https://reviews.llvm.org/D39627 llvm-svn: 319269
* [OpenMP] Stable sort Privates to remove non-deterministic orderingMandeep Singh Grang2017-11-281-5/+4
| | | | | | | | | | | | | | | | | | | | | | | | | Summary: This fixes the following failures uncovered by D39245: Clang :: OpenMP/task_firstprivate_codegen.cpp Clang :: OpenMP/task_private_codegen.cpp Clang :: OpenMP/taskloop_firstprivate_codegen.cpp Clang :: OpenMP/taskloop_lastprivate_codegen.cpp Clang :: OpenMP/taskloop_private_codegen.cpp Clang :: OpenMP/taskloop_simd_firstprivate_codegen.cpp Clang :: OpenMP/taskloop_simd_lastprivate_codegen.cpp Clang :: OpenMP/taskloop_simd_private_codegen.cpp Reviewers: rjmccall, ABataev, AndreyChurbanov Reviewed By: rjmccall, ABataev Subscribers: cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D39947 llvm-svn: 319222
* [Target] Make a copy of TargetOptions feature list before sorting during CodeGenCraig Topper2017-11-281-18/+11
| | | | | | | | Currently CodeGen is calling std::sort on the features vector in TargetOptions for every function, but I don't think CodeGen should be modifying TargetOptions. Differential Revision: https://reviews.llvm.org/D40228 llvm-svn: 319195
* Refactor functions PrintTemplateArgumentListSerge Pavlov2017-11-281-7/+3
| | | | | | | | | | | These functions were defined as static members of TemplateSpecializationType. Now they are moved to namespace level. Previously there were different implementations for lists containing TemplateArgument and TemplateArgumentLoc, now these implementations share the same code. This change is a result of refactoring patch D40508. NFC. llvm-svn: 319178
* [OPENMP] Codegen for `distribute parallel for simd` directive.Alexey Bataev2017-11-271-7/+5
| | | | | | | Initial codegen for `#pragma omp distribute parallel for simd` directive and its clauses. llvm-svn: 319079
* [OPENMP] Improve handling of cancel directives in target-basedAlexey Bataev2017-11-272-6/+11
| | | | | | | | | constructs, NFC. Improved handling of cancel|cancellation point directives inside target-based for directives. llvm-svn: 319046
* [CodeGen] Collect information about sizes of accesses and access types for TBAAIvan A. Kosarev2017-11-275-55/+99
| | | | | | | | | | | | | | The information about access and type sizes is necessary for producing TBAA metadata in the new size-aware format. With this patch, D39955 and D39956 in place we should be able to change CodeGenTBAA::createScalarTypeNode() and CodeGenTBAA::getBaseTypeInfo() to generate metadata in the new format under the -new-struct-path-tbaa command-line option. For now, this new information remains unused. Differential Revision: https://reviews.llvm.org/D40176 llvm-svn: 319012
* [OPENMP] Add support for cancel constructs in `target teams distributeAlexey Bataev2017-11-221-1/+5
| | | | | | | | | parallel for`. Add support for cancel/cancellation point directives inside `target teams distribute parallel for` directives. llvm-svn: 318881
* [OPENMP] Add support for cancel constructs in [teams] distributeAlexey Bataev2017-11-221-7/+17
| | | | | | | | | parallel for directives. Added codegen/sema support for cancel constructs in [teams] distribute parallel for directives. llvm-svn: 318872
* Revert "[CodeGen] Fix vtable not receiving hidden visibility when using ↵Petr Hosek2017-11-226-21/+19
| | | | | | | | push(visibility)" This reverts commit r318853: tests are failing on Windows bots llvm-svn: 318866
* [CodeGen] Fix vtable not receiving hidden visibility when using push(visibility)Petr Hosek2017-11-226-19/+21
| | | | | | | | | | This change should resolve https://bugs.llvm.org/show_bug.cgi?id=35022 Patch by Jake Ehrlich Differential Revision: https://reviews.llvm.org/D39627 llvm-svn: 318853
* [OPENMP] Do not mark captured variables as artificial in debug info.Alexey Bataev2017-11-222-6/+32
| | | | | | | Captured variables should not be marked as artificial parameters in outlined functions in debug info. llvm-svn: 318843
* [OpenMP] Adjust arguments of nvptx runtime functionsJonas Hahnfeld2017-11-221-12/+20
| | | | | | | | | | | | | In the future the compiler will analyze whether the OpenMP runtime needs to be (fully) initialized and avoid that overhead if possible. The functions already take an argument to transfer that information to the runtime, so pass in the default value 1. (This is needed for binary compatibility with libomptarget-nvptx currently being upstreamed.) Differential Revision: https://reviews.llvm.org/D40354 llvm-svn: 318836
* [OPENMP] Codegen for `target teams` directive.Alexey Bataev2017-11-221-2/+11
| | | | | | Added codegen of the clauses for `target teams` directive. llvm-svn: 318834
* [X86] Update CPUSupports code to reuse LLVM .def file [NFC]Erich Keane2017-11-221-71/+5
| | | | llvm-svn: 318815
* [Clang][OpenMP] New clang/libomptarget map interface: new function ↵George Rokos2017-11-211-53/+57
| | | | | | | | | | | signatures, clang-side This clang patch changes the __tgt_* API function signatures in preparation for the new map interface. Changes are: Device IDs 32bits --> 64bits, Flags 32bits --> 64bits Differential revision: https://reviews.llvm.org/D40281 llvm-svn: 318789
* Add -finstrument-function-entry-bare flagHans Wennborg2017-11-211-9/+15
| | | | | | | | | | | | | | | | | This is an instrumentation flag that's similar to -finstrument-functions, but it only inserts calls on function entry, the calls are inserted post-inlining, and they don't take any arugments. This is intended for users who want to instrument function entry with minimal overhead. (-pg would be another alternative, but forces frame pointer emission and affects link flags, so is probably best left alone to be used for generating gcov data.) Differential revision: https://reviews.llvm.org/D40276 llvm-svn: 318785
* [OPENMP] Initial support for asynchronous data update, NFC.Alexey Bataev2017-11-211-3/+24
| | | | | | | | | OpenMP 5.0 introduces asynchronous data update/dependecies clauses on target data directives. Patch adds initial support for outer task regions to use task-based codegen for future async target data directives. llvm-svn: 318781
* [OpenMP] Add implicit data sharing support when offloading to NVIDIA GPUs ↵Gheorghe-Teodor Bercea2017-11-212-23/+182
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | using OpenMP device offloading Summary: This patch is part of the development effort to add support in the current OpenMP GPU offloading implementation for implicitly sharing variables between a target region executed by the team master thread and the worker threads within that team. This patch is the first of three required for successfully performing the implicit sharing of master thread variables with the worker threads within a team. The remaining two patches are: - Patch D38978 to the LLVM NVPTX backend which ensures the lowering of shared variables to an device memory which allows the sharing of references; - Patch (coming soon) is a patch to libomptarget runtime library which ensures that a list of references to shared variables is properly maintained. A simple code snippet which illustrates an implicit data sharing situation is as follows: ``` #pragma omp target { // master thread only int v; #pragma omp parallel { // worker threads // use v } } ``` Variable v is implicitly shared from the team master thread which executes the code in between the target and parallel directives. The worker threads must operate on the latest version of v, including any updates performed by the master. The code generated in this patch relies on the LLVM NVPTX patch (mentioned above) which prevents v from being lowered in the thread local memory of the master thread thus making the reference to this variable un-shareable with the workers. This ensures that the code generated by this patch is correct. Since the parallel region is outlined the passing of arguments to the outlined regions must preserve the original order of arguments. The runtime therefore maintains a list of references to shared variables thus ensuring their passing in the correct order. The passing of arguments to the outlined parallel function is performed in a separate function which the data sharing infrastructure constructs in this patch. The function is inlined when optimizations are enabled. Reviewers: hfinkel, carlo.bertolli, arpith-jacob, Hahnfeld, ABataev, caomhin Reviewed By: ABataev Subscribers: cfe-commits, jholewinski Differential Revision: https://reviews.llvm.org/D38976 llvm-svn: 318773
OpenPOWER on IntegriCloud