| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
The backend should only emit data sharing code for the cases where it is needed.
A new function attribute is used by Clang to enable data sharing only for the cases where OpenMP semantics require it and there are variables that need to be shared.
Reviewers: hfinkel, Hahnfeld, ABataev, carlo.bertolli, caomhin
Reviewed By: ABataev
Subscribers: cfe-commits, jholewinski
Differential Revision: https://reviews.llvm.org/D41123
llvm-svn: 320527
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
during CodeGen.
This adds a new command line option -mprefer-vector-width to specify a preferred vector width for the vectorizers. Valid values are 'none' and unsigned integers. The driver will check that it meets those constraints. Specific supported integers will be managed by the targets in the backend.
Clang will take the value and add it as a new function attribute during CodeGen.
This represents the alternate direction proposed by Sanjay in this RFC: http://lists.llvm.org/pipermail/llvm-dev/2017-November/118734.html
The syntax here matches gcc, though gcc treats it as an x86 specific command line argument. gcc only allows values of 128, 256, and 512. I'm not having clang check any values.
Differential Revision: https://reviews.llvm.org/D40230
llvm-svn: 320419
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Driver, frontend and LLVM codegen for HWASan.
A clone of ASan, basically.
Reviewers: kcc, pcc, alekseyshl
Subscribers: srhines, javed.absar, cfe-commits
Differential Revision: https://reviews.llvm.org/D40936
llvm-svn: 320232
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This commit fixes a bug in IRGen where it generates completely broken
code for __fp16 vectors on X86. For example when the following code is
compiled:
half4 hv0, hv1, hv2; // these are vectors of __fp16.
void foo221() {
hv0 = hv1 + hv2;
}
clang generates the following IR, in which two i16 vectors are added:
@hv1 = common global <4 x i16> zeroinitializer, align 8
@hv2 = common global <4 x i16> zeroinitializer, align 8
@hv0 = common global <4 x i16> zeroinitializer, align 8
define void @foo221() {
%0 = load <4 x i16>, <4 x i16>* @hv1, align 8
%1 = load <4 x i16>, <4 x i16>* @hv2, align 8
%add = add <4 x i16> %0, %1
store <4 x i16> %add, <4 x i16>* @hv0, align 8
ret void
}
To fix the bug, this commit uses the code committed in r314056, which
modified clang to promote and truncate __fp16 vectors to and from float
vectors in the AST. It also fixes another IRGen bug where a short value
is assigned to an __fp16 variable without any integer-to-floating-point
conversion, as shown in the following example:
__fp16 a;
short b;
void foo1() {
a = b;
}
@b = common global i16 0, align 2
@a = common global i16 0, align 2
define void @foo1() #0 {
%0 = load i16, i16* @b, align 2
store i16 %0, i16* @a, align 2
ret void
}
rdar://problem/20625184
Differential Revision: https://reviews.llvm.org/D40112
llvm-svn: 320215
|
|
|
|
|
|
|
| |
Initial patch could cause trouble in the optimized code because of the
incorrectly generated lifetime intrinsics.
llvm-svn: 320191
|
|
|
|
|
|
|
|
|
|
| |
This is a follow-up to r320128. Eli pointed out that there is some gray
area in the language standard about whether the constant size is exact,
or a lower bound.
https://reviews.llvm.org/D40940
llvm-svn: 320185
|
|
|
|
|
|
| |
Host + default devices codegen for `target teams distribute` directive.
llvm-svn: 320149
|
|
|
|
|
|
|
|
|
|
|
|
| |
There is no way to apply sanitizer suppressions to ObjC blocks. A
reasonable default is to have blocks inherit their parent's sanitizer
options.
rdar://32769634
Differential Revision: https://reviews.llvm.org/D40668
llvm-svn: 320132
|
|
|
|
|
|
|
|
|
| |
Teach UBSan's bounds check to opportunistically use pass_object_size
information to check array accesses.
rdar://33272922
llvm-svn: 320128
|
|
|
|
|
|
|
|
|
|
| |
iOS/tvOS/watchOS simulator
rdar://35135215
Differential Revision: https://reviews.llvm.org/D40682
llvm-svn: 320073
|
|
|
|
|
|
|
|
|
|
|
|
| |
CreateCoercedLoad/CreateCoercedStore assumes pointer argument of
memcpy is in addr space 0, which is not correct and causes invalid
bitcasts for triple amdgcn---amdgiz.
It is fixed by using alloca addr space instead.
Differential Revision: https://reviews.llvm.org/D40806
llvm-svn: 320000
|
|
|
|
|
|
|
|
|
|
|
|
| |
The adjustment is calculated with CreatePtrDiff() which returns
the difference in (base) elements. This is passed to CreateGEP()
so make sure that the GEP base has the correct pointer type:
It needs to be a pointer to the base type, not a pointer to a
constant sized array.
Differential Revision: https://reviews.llvm.org/D40911
llvm-svn: 319931
|
|
|
|
|
|
| |
Host + default devices codegen for `teams distribute simd` directive.
llvm-svn: 319896
|
|
|
|
|
|
|
|
|
|
|
|
| |
Commit 7ac28eb0a5 / r310911 ("[OpenCL] Allow targets to select address
space per type", 2017-08-15) made Basic depend on AST, introducing a
circular dependency. Break this dependency by adding the
OpenCLTypeKind enum in Basic and map from AST types to this enum in
ASTContext.
Differential Revision: https://reviews.llvm.org/D40838
llvm-svn: 319883
|
|
|
|
|
|
|
|
|
|
| |
spaces.
Though it is incorrect from point of view of OpenMP standard to have
dependent iteration space in OpenMP loops, compiler should not crash.
Patch fixes this problem.
llvm-svn: 319700
|
|
|
|
|
|
|
|
|
|
| |
distribute parallel for simd' on host
https://reviews.llvm.org/D40795
This includes regression tests for all associated clauses.
llvm-svn: 319696
|
|
|
|
|
|
| |
Initial codegen support for `distribute simd` directive.
llvm-svn: 319661
|
|
|
|
|
|
|
|
| |
This reverts commit r319413. See PR35503.
We can't use "union member" as the access type here like this.
llvm-svn: 319629
|
|
|
|
|
|
| |
Similar to D40044 and discussed in D40594.
llvm-svn: 319619
|
|
|
|
|
|
|
| |
The libm functions with LLVM intrinsic twins were moved above this blob with:
https://reviews.llvm.org/rL319593
llvm-svn: 319618
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
There are 20 LLVM math intrinsics that correspond to mathlib calls according to the LangRef:
http://llvm.org/docs/LangRef.html#standard-c-library-intrinsics
We were only converting 3 mathlib calls (sqrt, fma, pow) and 12 builtin calls (ceil, copysign,
fabs, floor, fma, fmax, fmin, nearbyint, pow, rint, round, trunc) to their intrinsic-equivalents.
This patch pulls the transforms together and handles all 20 cases. The switch is guarded by a
check for const-ness to make sure we're not doing the transform if errno could possibly be set by
the libcall or builtin.
Differential Revision: https://reviews.llvm.org/D40044
llvm-svn: 319593
|
|
|
|
|
|
|
|
| |
Previously we emitted `__tgt_target_teams` only for standalone teams
directives. This patch allows emit this function for all teams-based
directives.
llvm-svn: 319585
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
These command line options are not intended for public use, and often
don't even make sense in the context of a particular tool anyway. About
90% of them are already hidden, but when people add new options they
forget to hide them, so if you were to make a brand new tool today, link
against one of LLVM's libraries, and run tool -help you would get a
bunch of junk that doesn't make sense for the tool you're writing.
This patch hides these options. The real solution is to not have
libraries defining command line options, but that's a much larger effort
and not something I'm prepared to take on.
Differential Revision: https://reviews.llvm.org/D40674
llvm-svn: 319505
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The basic idea behind this patch is that since in strict aliasing
mode all accesses to union members require their outermost
enclosing union objects to be specified explicitly, then for a
couple given accesses to union members of the form
p->a.b.c...
q->x.y.z...
it is known they can only alias if both p and q point to the same
union type and offset ranges of members a.b.c... and x.y.z...
overlap. Note that the actual types of the members do not matter.
Specifically, in this patch we do the following:
* Make unions to be valid TBAA base access types. This enables
generation of TBAA type descriptors for unions.
* Encode union types as structures with a single member of a
special "union member" type. Currently we do not encode
information about sizes of types, but conceptually such union
members are considered to be of the size of the whole union.
* Encode accesses to direct and indirect union members, including
member arrays, as accesses to these special members. All
accesses to members of a union thus get the same offset, which
is the offset of the union they are part of. This means the
existing LLVM TBAA machinery is able to handle such accesses
with no changes.
While this is already an improvement comparing to the current
situation, that is, representing all union accesses as may-alias
ones, there are further changes planned to complete the support
for unions. One of them is storing information about access sizes
so we can distinct accesses to non-overlapping union members,
including accesses to different elements of member arrays.
Another change is encoding type sizes in order to make it
possible to compute offsets within constant-indexed array
elements. These enhancements will be addressed with separate
patches.
Differential Revision: https://reviews.llvm.org/D39455
llvm-svn: 319413
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
The -fxray-always-emit-customevents flag instructs clang to always emit
the LLVM IR for calls to the `__xray_customevent(...)` built-in
function. The default behaviour currently respects whether the function
has an `[[clang::xray_never_instrument]]` attribute, and thus not lower
the appropriate IR code for the custom event built-in.
This change allows users calling through to the
`__xray_customevent(...)` built-in to always see those calls lowered to
the corresponding LLVM IR to lay down instrumentation points for these
custom event calls.
Using this flag enables us to emit even just the user-provided custom
events even while never instrumenting the start/end of the function
where they appear. This is useful in cases where "phase markers" using
__xray_customevent(...) can have very few instructions, must never be
instrumented when entered/exited.
Reviewers: rnk, dblaikie, kpw
Subscribers: cfe-commits
Differential Revision: https://reviews.llvm.org/D40601
llvm-svn: 319388
|
|
|
|
|
|
|
|
|
|
|
|
| |
Emit a gap area starting after the r-paren location and ending at the
start of the body for the braces-optional statements (for, for-each,
while, etc). The count for the gap area equal to the body's count. This
extends the fix in r317758.
Fixes PR35387, rdar://35570345
Testing: stage2 coverage-enabled build of clang, check-clang
llvm-svn: 319373
|
|
|
|
|
|
|
|
| |
Fixes regression introduced by r319297. MSVC environments still use SEH
unwind opcodes but they should use the Microsoft C++ EH personality, not
the mingw one.
llvm-svn: 319363
|
|
|
|
|
|
|
|
| |
directive, NFC.
Some general improvements in support of `teams distribute` directive.
llvm-svn: 319320
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is a re-apply of r319294.
adds -fseh-exceptions and -fdwarf-exceptions flags
clang will check if the user has specified an exception model flag,
in the absense of specifying the exception model clang will then check
the driver default and append the model flag for that target to cc1
-fno-exceptions has a higher priority then specifying the model
move __SEH__ macro definitions out of Targets into InitPreprocessor
behind the -fseh-exceptions flag
move __ARM_DWARF_EH__ macrodefinitions out of verious targets and into
InitPreprocessor behind the -fdwarf-exceptions flag and arm|thumb check
remove unused USESEHExceptions from the MinGW Driver
fold USESjLjExceptions into a new GetExceptionModel function that
gives the toolchain classes more flexibility with eh models
Reviewers: rnk, mstorsjo
Differential Revision: https://reviews.llvm.org/D39673
llvm-svn: 319297
|
|
|
|
|
|
|
|
| |
This reverts rL319294.
The windows sanitizer does not like seh on x86.
Will re apply with None type for x86
llvm-svn: 319295
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
adds -fseh-exceptions and -fdwarf-exceptions flags
clang will check if the user has specified an exception model flag,
in the absense of specifying the exception model clang will then check
the driver default and append the model flag for that target to cc1
clang cc1 assumes dwarf is the default if none is passed
and -fno-exceptions has a higher priority then specifying the model
move __SEH__ macro definitions out of Targets into InitPreprocessor
behind the -fseh-exceptions flag
move __ARM_DWARF_EH__ macrodefinitions out of verious targets and into
InitPreprocessor behind the -fdwarf-exceptions flag and arm|thumb check
remove unused USESEHExceptions from the MinGW Driver
fold USESjLjExceptions into a new GetExceptionModel function that
gives the toolchain classes more flexibility with eh models
Reviewers: rnk, mstorsjo
Differential Revision: https://reviews.llvm.org/D39673
llvm-svn: 319294
|
|
|
|
|
|
|
|
|
|
| |
I had to reland this change in order to make the test work on windows
This change should resolve https://bugs.llvm.org/show_bug.cgi?id=35022
https://reviews.llvm.org/D39627
llvm-svn: 319269
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This fixes the following failures uncovered by D39245:
Clang :: OpenMP/task_firstprivate_codegen.cpp
Clang :: OpenMP/task_private_codegen.cpp
Clang :: OpenMP/taskloop_firstprivate_codegen.cpp
Clang :: OpenMP/taskloop_lastprivate_codegen.cpp
Clang :: OpenMP/taskloop_private_codegen.cpp
Clang :: OpenMP/taskloop_simd_firstprivate_codegen.cpp
Clang :: OpenMP/taskloop_simd_lastprivate_codegen.cpp
Clang :: OpenMP/taskloop_simd_private_codegen.cpp
Reviewers: rjmccall, ABataev, AndreyChurbanov
Reviewed By: rjmccall, ABataev
Subscribers: cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D39947
llvm-svn: 319222
|
|
|
|
|
|
|
|
| |
Currently CodeGen is calling std::sort on the features vector in TargetOptions for every function, but I don't think CodeGen should be modifying TargetOptions.
Differential Revision: https://reviews.llvm.org/D40228
llvm-svn: 319195
|
|
|
|
|
|
|
|
|
|
|
| |
These functions were defined as static members of TemplateSpecializationType.
Now they are moved to namespace level. Previously there were different
implementations for lists containing TemplateArgument and TemplateArgumentLoc,
now these implementations share the same code.
This change is a result of refactoring patch D40508. NFC.
llvm-svn: 319178
|
|
|
|
|
|
|
| |
Initial codegen for `#pragma omp distribute parallel for simd` directive
and its clauses.
llvm-svn: 319079
|
|
|
|
|
|
|
|
|
| |
constructs, NFC.
Improved handling of cancel|cancellation point directives inside
target-based for directives.
llvm-svn: 319046
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The information about access and type sizes is necessary for
producing TBAA metadata in the new size-aware format. With this
patch, D39955 and D39956 in place we should be able to change
CodeGenTBAA::createScalarTypeNode() and
CodeGenTBAA::getBaseTypeInfo() to generate metadata in the new
format under the -new-struct-path-tbaa command-line option. For
now, this new information remains unused.
Differential Revision: https://reviews.llvm.org/D40176
llvm-svn: 319012
|
|
|
|
|
|
|
|
|
| |
parallel for`.
Add support for cancel/cancellation point directives inside `target
teams distribute parallel for` directives.
llvm-svn: 318881
|
|
|
|
|
|
|
|
|
| |
parallel for directives.
Added codegen/sema support for cancel constructs in [teams] distribute
parallel for directives.
llvm-svn: 318872
|
|
|
|
|
|
|
|
| |
push(visibility)"
This reverts commit r318853: tests are failing on Windows bots
llvm-svn: 318866
|
|
|
|
|
|
|
|
|
|
| |
This change should resolve https://bugs.llvm.org/show_bug.cgi?id=35022
Patch by Jake Ehrlich
Differential Revision: https://reviews.llvm.org/D39627
llvm-svn: 318853
|
|
|
|
|
|
|
| |
Captured variables should not be marked as artificial parameters in
outlined functions in debug info.
llvm-svn: 318843
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In the future the compiler will analyze whether the OpenMP
runtime needs to be (fully) initialized and avoid that overhead
if possible. The functions already take an argument to transfer
that information to the runtime, so pass in the default value 1.
(This is needed for binary compatibility with libomptarget-nvptx
currently being upstreamed.)
Differential Revision: https://reviews.llvm.org/D40354
llvm-svn: 318836
|
|
|
|
|
|
| |
Added codegen of the clauses for `target teams` directive.
llvm-svn: 318834
|
|
|
|
| |
llvm-svn: 318815
|
|
|
|
|
|
|
|
|
|
|
| |
signatures, clang-side
This clang patch changes the __tgt_* API function signatures in preparation for the new map interface.
Changes are: Device IDs 32bits --> 64bits, Flags 32bits --> 64bits
Differential revision: https://reviews.llvm.org/D40281
llvm-svn: 318789
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is an instrumentation flag that's similar to
-finstrument-functions, but it only inserts calls on function entry, the
calls are inserted post-inlining, and they don't take any arugments.
This is intended for users who want to instrument function entry with
minimal overhead.
(-pg would be another alternative, but forces frame pointer emission and
affects link flags, so is probably best left alone to be used for
generating gcov data.)
Differential revision: https://reviews.llvm.org/D40276
llvm-svn: 318785
|
|
|
|
|
|
|
|
|
| |
OpenMP 5.0 introduces asynchronous data update/dependecies clauses on
target data directives. Patch adds initial support for outer task
regions to use task-based codegen for future async target data
directives.
llvm-svn: 318781
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
using OpenMP device offloading
Summary:
This patch is part of the development effort to add support in the current OpenMP GPU offloading implementation for implicitly sharing variables between a target region executed by the team master thread and the worker threads within that team.
This patch is the first of three required for successfully performing the implicit sharing of master thread variables with the worker threads within a team. The remaining two patches are:
- Patch D38978 to the LLVM NVPTX backend which ensures the lowering of shared variables to an device memory which allows the sharing of references;
- Patch (coming soon) is a patch to libomptarget runtime library which ensures that a list of references to shared variables is properly maintained.
A simple code snippet which illustrates an implicit data sharing situation is as follows:
```
#pragma omp target
{
// master thread only
int v;
#pragma omp parallel
{
// worker threads
// use v
}
}
```
Variable v is implicitly shared from the team master thread which executes the code in between the target and parallel directives. The worker threads must operate on the latest version of v, including any updates performed by the master.
The code generated in this patch relies on the LLVM NVPTX patch (mentioned above) which prevents v from being lowered in the thread local memory of the master thread thus making the reference to this variable un-shareable with the workers. This ensures that the code generated by this patch is correct.
Since the parallel region is outlined the passing of arguments to the outlined regions must preserve the original order of arguments. The runtime therefore maintains a list of references to shared variables thus ensuring their passing in the correct order. The passing of arguments to the outlined parallel function is performed in a separate function which the data sharing infrastructure constructs in this patch. The function is inlined when optimizations are enabled.
Reviewers: hfinkel, carlo.bertolli, arpith-jacob, Hahnfeld, ABataev, caomhin
Reviewed By: ABataev
Subscribers: cfe-commits, jholewinski
Differential Revision: https://reviews.llvm.org/D38976
llvm-svn: 318773
|