| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
| |
llvm-svn: 257802
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
keys, and PointerIntPairs where the pointee types are incomplete
out-of-line to where we have the complete type.
This is the standard pattern used throughout the AST library to address
the inherently mutually cross referenced nature of the AST.
This is part of a series of patches to allow LLVM to check for complete
pointee types when computing its pointer traits. This is absolutely
necessary to get correct (or reproducible) results for things like how
many low bits are guaranteed to be zero.
llvm-svn: 256612
|
|
|
|
| |
llvm-svn: 254845
|
|
|
|
| |
llvm-svn: 254703
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This patch implements the 4.5 specification for the implicit data maps. OpenMP 4.5 specification changes the default way data is captured into a target region. All the non-aggregate kinds are passed by value by default. This required activating the capturing by value during SEMA for the target region. All the non-aggregate values that can be encoded in the size of a pointer are properly casted and forwarded to the runtime library. On top of fixing the previous weird behavior for mapping pointers in nested data regions (an explicit map was always required), this also improves performance as the number of allocations/transactions to the device per non-aggregate map are reduced from two to only one - instead of passing a reference and the value, only the value passed.
Explicit maps will be added later on once firstprivate, private, and map clauses' SEMA and parsing are available.
Reviewers: hfinkel, rjmccall, ABataev
Subscribers: cfe-commits, carlo.bertolli
Differential Revision: http://reviews.llvm.org/D14940
llvm-svn: 254521
|
|
|
|
|
|
|
|
|
|
|
| |
their associated .cpp file.
Previous refactorings long long ago had split out the above categories
of classes from Stmt.h into their own header files, but failed to also
split the Stmt.cpp implementation file similarly. Do so for
readability's sake.
llvm-svn: 249131
|
|
|
|
|
|
|
|
| |
OpenMP 4.1 extends format of '#pragma omp ordered'. It adds 3 additional clauses: 'threads', 'simd' and 'depend'.
If no clause is specified, the ordered construct behaves as if the threads clause had been specified. If the threads clause is specified, the threads in the team executing the loop region execute ordered regions sequentially in the order of the loop iterations.
The loop region to which an ordered region without any clause or with a threads clause binds must have an ordered clause without the parameter specified on the corresponding loop directive.
llvm-svn: 248569
|
|
|
|
|
|
| |
Add parsing, sema analysis and codegen for 'if' clause in 'cancel' directive.
llvm-svn: 247976
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
if 'cancel' is found.
Patch improves codegen for OpenMP constructs. If the OpenMP region does not have internal 'cancel' construct, a call to 'void __kmpc_barrier()' runtime function is generated for all implicit/explicit barriers. If the region has inner 'cancel' directive, then
```
if (__kmpc_cancel_barrier())
exit from outer construct;
```
code is generated.
Also, the code for 'canellation point' directive is not generated if parent directive does not have 'cancel' directive.
llvm-svn: 247681
|
|
|
|
|
|
|
|
|
| |
This replaces the filtered generic iterator with a type-specfic one based
on dyn_cast instead of comparing the kind enum. This allows us to use
range-based for loops and eliminates casts. No functionality change
intended.
llvm-svn: 246384
|
|
|
|
|
|
|
|
| |
Adds parsing/sema analysis/serialization/deserialization for array sections in OpenMP constructs (introduced in OpenMP 4.0).
Currently it is allowed to use array sections only in OpenMP clauses that accepts list of expressions.
Differential Revision: http://reviews.llvm.org/D10732
llvm-svn: 245937
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
OpenMP 4.1 adds 3 optional modifiers to 'linear' clause.
Format of 'linear' clause has changed to:
```
linear(linear-list[ : linear-step])
```
where linear-list is one of the following
```
list
modifier(list)
```
where modifier is one of the following:
```
ref (C++)
val (C/C++)
uval (C++)
```
Patch adds parsing and sema analysis for these modifiers.
llvm-svn: 245550
|
|
|
|
|
|
| |
OpenMP 4.1 allows to use variables with reference types in all private clauses (private, firstprivate, lastprivate, linear etc.). Patch allows to use such variables and fixes codegen for linear variables with reference types.
llvm-svn: 245268
|
|
|
|
|
|
|
| |
blender uses statements expression in condition of the loop under control of the '#pragma omp parallel for'. This condition is used several times in different expressions required for codegen of the loop directive. If there are some variables defined in statement expression, it fires an assert during codegen because of redefinition of the same variables.
We have to rebuild several expression to be sure that all variables are unique.
llvm-svn: 245041
|
|
|
|
|
|
| |
OpenMP 4.1 allows to use variables with reference types in private clauses and, therefore, in init expressions of the cannonical loop forms.
llvm-svn: 244209
|
|
|
|
|
|
|
|
|
| |
This brings ASTContext closer to LLVM's Allocator concept. Ideally we
would just derive ASTContext from llvm::AllocatorBase, but that does
not work because ASTContext's allocator is mutable and we allocate using
const ASTContext& everywhere.
llvm-svn: 243972
|
|
|
|
| |
llvm-svn: 243966
|
|
|
|
|
|
|
|
| |
rather than forcing the bump pointer allocator to produce a viable
pointer. This also fixes UB when we would try to memcpy from the null
incoming StringRef.
llvm-svn: 243947
|
|
|
|
|
|
|
| |
for OpenMP 4 target data directive parsing and sema.
This commit is on behalf of Kelvin Li.
llvm-svn: 242785
|
|
|
|
|
|
|
|
|
| |
StmtRange was just a convenient wrapper for two StmtIterators before
we had real range support. This removes some of the implicit conversions
StmtRange had leading to slightly more verbose code but also should make
more obvious what's going on. No functional change intended.
llvm-svn: 242615
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Some const-correctness changes snuck in here too, since they were in the
area of code I was modifying.
This seems to make Clang actually work without Bus Error on
32bit-sparc.
Follow-up patches will factor out a trailing-object helper class, to
make classes using the idiom of appending objects to other objects
easier to understand, and to ensure (with static_assert) that required
alignment guarantees continue to hold.
Differential Revision: http://reviews.llvm.org/D10272
llvm-svn: 242554
|
|
|
|
|
|
| |
Implemented parsing/sema analysis + (de)serialization.
llvm-svn: 241253
|
|
|
|
|
|
| |
Add parsing and sema analysis for 'omp cancellation point' directive.
llvm-svn: 241145
|
|
|
|
|
|
| |
Parsing and sema analysis (without support for array sections in arguments) for 'depend' clause (used in 'task' directive, OpenMP 4.0).
llvm-svn: 240409
|
|
|
|
| |
llvm-svn: 240353
|
|
|
|
|
|
|
|
|
|
|
|
| |
The patch is generated using this command:
$ tools/extra/clang-tidy/tool/run-clang-tidy.py -fix \
-checks=-*,llvm-namespace-comment -header-filter='llvm/.*|clang/.*' \
work/llvm/tools/clang
To reduce churn, not touching namespaces spanning less than 10 lines.
llvm-svn: 240270
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Added parsing, sema analysis and codegen for '#pragma omp taskgroup' directive (OpenMP 4.0).
The code for directive is generated the following way:
#pragma omp taskgroup
<body>
void __kmpc_taskgroup(<loc>, thread_id);
<body>
void __kmpc_end_taskgroup(<loc>, thread_id);
llvm-svn: 240011
|
|
|
|
|
|
| |
Previously the last iteration for simd loop-based OpenMP constructs were generated as a separate code. This feature is not required and codegen is simplified.
llvm-svn: 239810
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If the type isn't trivially moveable emplace can skip a potentially
expensive move. It also saves a couple of characters.
Call sites were found with the ASTMatcher + some semi-automated cleanup.
memberCallExpr(
argumentCountIs(1), callee(methodDecl(hasName("push_back"))),
on(hasType(recordDecl(has(namedDecl(hasName("emplace_back")))))),
hasArgument(0, bindTemporaryExpr(
hasType(recordDecl(hasNonTrivialDestructor())),
has(constructExpr()))),
unless(isInTemplateInstantiation()))
No functional change intended.
llvm-svn: 238601
|
|
|
|
| |
llvm-svn: 235838
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Emits the following code for the clause at the beginning of the outlined function for implicit threads:
if (<not a master thread>) {
...
<thread local copy of var> = <master thread local copy of var>;
...
}
<sync point>;
Checking for a non-master thread is performed by comparing of the address of the thread local variable with the address of the master's variable. Master thread always uses original variables, so you always know the address of the variable in the master thread.
Differential Revision: http://reviews.llvm.org/D9026
llvm-svn: 235075
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
#pragma omp for lastprivate(<var>)
for (i = a; i < b; ++b)
<BODY>;
This construct is translated into something like:
<last_iter> = alloca i32
<lastprivate_var> = alloca <type>
<last_iter> = 0
; No initializer for simple variables or a default constructor is called for objects.
; For arrays perform element by element initialization by the call of the default constructor.
...
OMP_FOR_START(...,<last_iter>, ..); sets <last_iter> to 1 if this is the last iteration.
<BODY>
...
OMP_FOR_END
if (<last_iter> != 0) {
<var> = <lastprivate_var> ; Update original variable with the lastprivate value.
}
call __kmpc_cancel_barrier() ; an implicit barrier to avoid possible data race.
Differential Revision: http://reviews.llvm.org/D8658
llvm-svn: 235074
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Emit a code for reduction clause. Next code should be emitted for reductions:
static kmp_critical_name lock = { 0 };
void reduce_func(void *lhs[<n>], void *rhs[<n>]) {
...
*(Type<i> *)lhs[i] = RedOp<i>(*(Type<i> *)lhs[i], *(Type<i> *)rhs[i]);
...
}
... void *RedList[<n>] = {&<RHSExprs>[0], ..., &<RHSExprs>[<n> - 1]};
switch (__kmpc_reduce{_nowait}(<loc>, <gtid>, <n>, sizeof(RedList), RedList, reduce_func, &<lock>)) {
case 1:
...
<LHSExprs>[i] = RedOp<i>(*<LHSExprs>[i], *<RHSExprs>[i]);
...
__kmpc_end_reduce{_nowait}(<loc>, <gtid>, &<lock>);
break;
case 2:
...
Atomic(<LHSExprs>[i] = RedOp<i>(*<LHSExprs>[i], *<RHSExprs>[i]));
...
break;
default:
;
}
Reduction variables are a kind of a private variables, they have private copies, but initial values are chosen in accordance with the reduction operation.
Differential Revision: http://reviews.llvm.org/D8915
llvm-svn: 234583
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously we would waste 32 bits on alignment, use LLVM_ALIGNAS to
free that space for derived classes an place. Sadly still have to #ifdef
out MSVC 2013 because it can't align based on a sizeof expr.
No intended functionality change. New byte counts:
sizeof(before) | sizeof(after)
LabelStmt: 32 | LabelStmt: 24
SwitchStmt: 48 | SwitchStmt: 40
WhileStmt: 40 | WhileStmt: 32
DoStmt: 40 | DoStmt: 32
ForStmt: 64 | ForStmt: 56
ContinueStmt: 16 | ContinueStmt: 8
BreakStmt: 16 | BreakStmt: 8
ReturnStmt: 32 | ReturnStmt: 24
AsmStmt: 40 | AsmStmt: 32
GCCAsmStmt: 80 | GCCAsmStmt: 72
MSAsmStmt: 96 | MSAsmStmt: 88
SEHExceptStmt: 32 | SEHExceptStmt: 24
SEHFinallyStmt: 24 | SEHFinallyStmt: 16
SEHLeaveStmt: 16 | SEHLeaveStmt: 8
CapturedStmt: 32 | CapturedStmt: 24
CXXCatchStmt: 32 | CXXCatchStmt: 24
CXXForRangeStmt: 72 | CXXForRangeStmt: 64
ObjCAtFinallyStmt: 24 | ObjCAtFinallyStmt: 16
ObjCAtSynchronizedStmt: 32 | ObjCAtSynchronizedStmt: 24
ObjCAtThrowStmt: 24 | ObjCAtThrowStmt: 16
ObjCAutoreleasePoolStmt: 24 | ObjCAutoreleasePoolStmt: 16
llvm-svn: 233921
|
|
|
|
|
|
| |
Added sema checks for forms of expressions/statements allowed under control of 'atomic capture' directive + generation of helper objects for future codegen.
llvm-svn: 233785
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Adds atomic update codegen for the following forms of expressions:
x binop= expr;
x++;
++x;
x--;
--x;
x = x binop expr;
x = expr binop x;
If x and expr are integer and binop is associative or x is a LHS in a RHS of the assignment expression, and atomics are allowed for type of x on the target platform atomicrmw instruction is emitted.
Otherwise compare-and-swap sequence is emitted:
bb:
...
atomic load <x>
cont:
<expected> = phi [ <x>, label %bb ], [ <new_failed>, %cont ]
<desired> = <expected> binop <expr>
<res> = cmpxchg atomic &<x>, desired, expected
<new_failed> = <res>.field1;
br <res>field2, label %exit, label %cont
exit:
...
Differential Revision: http://reviews.llvm.org/D8536
llvm-svn: 233513
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If there is at least one 'copyprivate' clause is associated with the single directive, the following code is generated:
```
i32 did_it = 0; \\ for 'copyprivate' clause
if(__kmpc_single(ident_t *, gtid)) {
SingleOpGen();
__kmpc_end_single(ident_t *, gtid);
did_it = 1; \\ for 'copyprivate' clause
}
<copyprivate_list>[0] = &var0;
...
<copyprivate_list>[n] = &varn;
call __kmpc_copyprivate(ident_t *, gtid, <copyprivate_list_size>,
<copyprivate_list>, <copy_func>, did_it);
...
void<copy_func>(void *LHSArg, void *RHSArg) {
Dst = (void * [n])(LHSArg);
Src = (void * [n])(RHSArg);
Dst[0] = Src[0];
... Dst[n] = Src[n];
}
```
All list items from all 'copyprivate' clauses are gathered into single <copyprivate list> (<copyprivate_list_size> is a size in bytes of this list) and <copy_func> is used to propagate values of private or threadprivate variables from the 'single' region to other implicit threads from outer 'parallel' region.
Differential Revision: http://reviews.llvm.org/D8410
llvm-svn: 232932
|
|
|
|
|
|
|
|
|
| |
The linear variable is privatized (similar to 'private') and its
value on current iteration is calculated, similar to the loop
counter variables.
Differential revision: http://reviews.llvm.org/D8375
llvm-svn: 232890
|
|
|
|
|
|
| |
Adds additional semantic analysis + generation of helper expressions for proper codegen.
llvm-svn: 232164
|
|
|
|
| |
llvm-svn: 228274
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
the simplest case, which is used when no chunk_size is specified in
the schedule(static) or no 'schedule' clause is specified - the
iteration space is divided by the library into chunks that are
approximately equal in size, and at most one chunk is distributed
to each thread. In this case, we do not need an outer loop in each
thread - each thread requests once which iterations range it should
handle (using __kmpc_for_static_init runtime call) and then runs the
inner loop on this range.
Differential Revision: http://reviews.llvm.org/D5865
llvm-svn: 224233
|
|
|
|
|
|
| |
According to OpenMP standard, Section 2.12.6, atomic Construct, '#pragma omp atomic read' is allowed to be used only for expression statements of form 'v = x;', where x and v (as applicable) are both l-value expressions with scalar type. Patch adds checks for it.
llvm-svn: 222231
|
|
|
|
|
|
| |
their only use was with the AST reader, and friendship can be used to handle that. Drive-by rename of "Brac" to "Brace" for the private data members. NFC.
llvm-svn: 220428
|
|
|
|
|
|
|
| |
This patch generates some helper variables which used as a private copies of the corresponding original variables inside an OpenMP 'parallel' directive. These generated variables are initialized by default (with the default constructor, if any). In outlined function references to original variables are replaced by the references to these private helper variables. At the end of the initialization of the private variables and implicit barier is set by calling __kmpc_barrier(...) runtime function to be sure that all threads were initialized using original values of the variables.
Differential Revision: http://reviews.llvm.org/D4752
llvm-svn: 220262
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Adds codegen for 'if' clause. Currently only for 'if' clause used with the 'parallel' directive.
If condition evaluates to true, the code executes parallel version of the code by calling __kmpc_fork_call(loc, 1, microtask, captured_struct/*context*/), where loc - debug location, 1 - number of additional parameters after "microtask" argument, microtask - is outlined finction for the code associated with the 'parallel' directive, captured_struct - list of variables captured in this outlined function.
If condition evaluates to false, the code executes serial version of the code by executing the following code:
global_thread_id.addr = alloca i32
store i32 global_thread_id, global_thread_id.addr
zero.addr = alloca i32
store i32 0, zero.addr
kmpc_serialized_parallel(loc, global_thread_id);
microtask(global_thread_id.addr, zero.addr, captured_struct/*context*/);
kmpc_end_serialized_parallel(loc, global_thread_id);
Where loc - debug location, global_thread_id - global thread id, returned by __kmpc_global_thread_num() call or passed as a first parameter in microtask() call, global_thread_id.addr - address of the variable, where stored global_thread_id value, zero.addr - implicit bound thread id (should be set to 0 for serial call), microtask() and captured_struct are the same as in parallel call.
Also this patch checks if the condition is constant and if it is constant it evaluates its value and then generates either parallel version of the code (if the condition evaluates to true), or the serial version of the code (if the condition evaluates to false).
Differential Revision: http://reviews.llvm.org/D4716
llvm-svn: 219597
|
|
|
|
|
|
| |
Includes parsing and semantic analysis for 'omp teams' directive support from OpenMP 4.0. Adds additional analysis to 'omp target' directive with 'omp teams' directive.
llvm-svn: 219385
|
|
|
|
|
|
|
|
| |
This patch generates some helper variables that used as private copies of the corresponding original variables inside an OpenMP 'parallel' directive. These generated variables are initialized by copy using values of the original variables (with the copy constructor, if any). For arrays, initializator is generated for single element and in the codegen procedure this initial value is automatically propagated between all elements of the private copy.
In outlined function, references to original variables are replaced by the references to these private helper variables. At the end of the initialization of the private variables an implicit barier is generated by calling __kmpc_barrier(...) runtime function to be sure that all threads were initialized using original values of the variables.
Differential Revision: http://reviews.llvm.org/D5140
llvm-svn: 219306
|
|
|
|
|
|
| |
Still troubles with OpenMP/parallel_firstprivate_codegen.cpp (now in ARM buildbots).
llvm-svn: 219298
|
|
|
|
|
|
|
|
| |
This patch generates some helper variables that used as private copies of the corresponding original variables inside an OpenMP 'parallel' directive. These generated variables are initialized by copy using values of the original variables (with the copy constructor, if any). For arrays, initializator is generated for single element and in the codegen procedure this initial value is automatically propagated between all elements of the private copy.
In outlined function, references to original variables are replaced by the references to these private helper variables. At the end of the initialization of the private variables an implicit barier is generated by calling __kmpc_barrier(...) runtime function to be sure that all threads were initialized using original values of the variables.
Differential Revision: http://reviews.llvm.org/D5140
llvm-svn: 219297
|
|
|
|
|
|
| |
To fix issues with test OpenMP/parallel_firstprivate_codegen.cpp
llvm-svn: 219296
|