| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
| |
This patch implements the teams directive for the NVPTX backend. It is different from the host code generation path as it:
Does not call kmpc_fork_teams. All necessary teams and threads are started upon touching the target region, when launching a CUDA kernel, and their execution is coordinated through sequential and parallel regions within the target region.
Does not call kmpc_push_num_teams even if a num_teams of thread_limit clause is present. Setting the number of teams and the thread limit is implemented by the nvptx-related runtime.
Please note that I am now passing a Clang Expr * to emitPushNumTeams instead of the originally chosen llvm::Value * type. The reason for that is that I want to avoid emitting expressions for num_teams and thread_limit if they are not needed in the target region.
http://reviews.llvm.org/D17963
llvm-svn: 265304
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This patch is adding detection of common string literal patterns
that should not trigger warnings.
[*] Add a limit on the number of concatenated token,
[*] Add support for parenthese sequence of tokens,
[*] Add detection of valid indentation.
As an example, this code will no longer trigger a warning:
```
const char* Array[] = {
"first literal"
"indented literal"
"indented literal",
"second literal",
[...]
```
Reviewers: alexfh
Subscribers: cfe-commits
Differential Revision: http://reviews.llvm.org/D18695
llvm-svn: 265303
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
At this point we should be able to enable IAS by default for O32 without
breaking check-all, or recursion.
Reviewers: vkalintiris
Subscribers: dsanders, llvm-commits
Differential Revision: http://reviews.llvm.org/D18439
llvm-svn: 265302
|
|
|
|
|
|
|
| |
- Externalize the registry.
- Update libdeps.
llvm-svn: 265301
|
|
|
|
|
|
| |
ignore_interceptors_accesses=1 for GCD tests and printf instead of NSLog).
llvm-svn: 265300
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Even though FileSpec attempted to handle both kinds of path syntaxes (posix and windows) on both
platforms, it relied on the llvm path library to do its work, whose behavior differed on
different platforms. This led to subtle differences in FileSpec behavior between platforms. This
replaces the pieces of the llvm library with our own implementations. The functions are simply
copied from llvm, with #ifdefs replaced by runtime checks for ePathSyntaxWindows.
Reviewers: zturner
Subscribers: lldb-commits
Differential Revision: http://reviews.llvm.org/D18689
llvm-svn: 265299
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fix for __declspec attributes and const=0 without space
This patch is to address 2 problems I found with Clang-tidy:modernize-use-override.
1: missing spaces on pure function decls.
Orig:
void pure() const=0
Problem:
void pure() constoverride =0
Fixed:
void pure() const override =0
2: This is ms-extension specific, but possibly applies to other attribute types. The override is placed before the attribute which doesn’t work well with declspec as this attribute can be inherited or placed before the method identifier.
Orig:
class __declspec(dllexport) X : public Y
{
void p();
};
Problem:
class override __declspec(dllexport) class X : public Y
{
void p();
};
Fixed:
class __declspec(dllexport) class X : public Y
{
void p() override;
};
Patch by Robert Bolter!
Differential Revision: http://reviews.llvm.org/D18396
llvm-svn: 265298
|
|
|
|
|
|
| |
MSVC doesn't want StringRef in an union.
llvm-svn: 265297
|
|
|
|
|
|
|
|
|
|
|
|
| |
This adds MC support for fused compare + indirect branch instructions,
ie. CRB, CGRB, CLRB, CLGRB, CIB, CGIB, CLIB, CLGIB. They aren't actually
generated yet -- this is preparation for their use for conditional
returns in the next iteration of D17339.
Author: koriakin
Differential Revision: http://reviews.llvm.org/D18742
llvm-svn: 265296
|
|
|
|
|
|
|
|
|
|
| |
This allows plugins which add AST passes to also define pragmas to do things
like only enable certain behaviour of the AST pass in files where a certain
pragma is used.
Differential Revision: http://reviews.llvm.org/D18319
llvm-svn: 265295
|
|
|
|
|
|
|
| |
This addresses the same problem as r264846 (the test not expecting the situation when two thread
hit the watchpoint simultaneously), but for a different test.
llvm-svn: 265294
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Our symbol representation was redundant, and some times would get out of
sync. It had an Elf_Sym, but some fields were copied to SymbolBody.
Different parts of the code were checking the bits in SymbolBody and
others were checking Elf_Sym.
There are two general approaches to fix this:
* Copy the required information and don't store and Elf_Sym.
* Don't copy the information and always use the Elf_Smy.
The second way sounds tempting, but has a big problem: we would have to
template SymbolBody. I started doing it, but it requires templeting
*everything* and creates a bit chicken and egg problem at the driver
where we have to find ELFT before we can create an ArchiveFile for
example.
As much as possible I compared the test differences with what gold and
bfd produce to make sure they are still valid. In most cases we are just
adding hidden visibility to a local symbol, which is harmless.
In most tests this is a small speedup. The only slowdown was scylla
(1.006X). The largest speedup was clang with no --build-id, -O3 or
--gc-sections (i.e.: focus on the relocations): 1.019X.
llvm-svn: 265293
|
|
|
|
|
|
|
|
|
|
|
| |
A cross-thread sequentially consistent fence should be lowered into
z/Architecture's BCR serialization instruction, instead of causing a
fatal error in the back-end.
Author: bryanpkc
Differential Revision: http://reviews.llvm.org/D18644
llvm-svn: 265292
|
|
|
|
|
|
|
|
|
|
|
| |
Enable the SystemZ back-end to lower FRAMEADDR and RETURNADDR, which
previously would cause the back-end to crash. Currently, only a
frame count of zero is supported.
Author: bryanpkc
Differential Revision: http://reviews.llvm.org/D18514
llvm-svn: 265291
|
|
|
|
| |
llvm-svn: 265290
|
|
|
|
|
|
|
|
|
|
|
| |
We've reset thr->ignore_reads_and_writes, but forget to do
thr->fast_state.ClearIgnoreBit(). So ignores were not effective
reset and fast_state.ignore_bit was corrupted if signal handler
itself uses ignores.
Properly reset/restore fast_state.ignore_bit around signal handlers.
llvm-svn: 265288
|
|
|
|
|
|
|
|
|
| |
simd'.
Added parsing/semantic analysis for 'inbranch|notinbranch' clauses of
'#pragma omp declare simd' construct.
llvm-svn: 265287
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Even before we build the domain the branch condition can become very
complex, especially if we have to build the complement of a lot of
equality constraints. With this patch we bail if the branch condition
has a lot of basic sets and parameters.
After this patch we now successfully compile
External/SPEC/CINT2000/186_crafty/186_crafty
with "-polly-process-unprofitable -polly-position=before-vectorizer".
llvm-svn: 265286
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
As a CFG is often structured we can simplify the steps performed during
domain generation. When we push domain information we can utilize the
information from a block A to build the domain of a block B, if A dominates B
and there is no loop backede on a path from A to B. When we pull domain
information we can use information from a block A to build the domain of a
block B if B post-dominates A. This patch implements both ideas and thereby
simplifies domains that were not simplified by isl. For the FINAL basic block
in test/ScopInfo/complex-successor-structure-3.ll we used to build a universe
set with 81 basic sets. Now it actually is represented as universe set.
While the initial idea to utilize the graph structure depended on the
dominator and post-dominator tree we can use the available region
information as a coarse grained replacement. To this end we push the
region entry domain to the region exit and pull it from the region
entry for the region exit if applicable.
With this patch we now successfully compile
External/SPEC/CINT2006/400_perlbench/400_perlbench
and
SingleSource/Benchmarks/Adobe-C++/loop_unroll.
Differential Revision: http://reviews.llvm.org/D18450
llvm-svn: 265285
|
|
|
|
| |
llvm-svn: 265284
|
|
|
|
|
|
|
|
|
| |
Implemented truncstore for KNL and skylake-avx512.
Covered vectors from v2i1 to v64i1. We save the value in bits (not in bytes) - v32i1 is saved in 4 bytes.
Differential Revision: http://reviews.llvm.org/D18740
llvm-svn: 265283
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Remove a few old FIXMEs from the original commit of the Metadata/Value
split in r223802. These are commented out assertions to the effect that
calls between mapValue and mapMetadata never return nullptr.
(The only behaviour change is that Mapper::mapSimpleMetadata memoizes
the nullptr return.)
When I originally rewrote the mapping code, I thought we could be
stricter in the new metadata hierarchy and never return nullptr when
RF_NullMapMissingGlobalValues was off. It's still not entirely clear to
me why these assertions failed (a few months ago, I had a theory that I
forgot to write down, but that's helping no one).
Understood or not, I no longer see how these commented-out assertions
would be useful. I'm relegating them to the annals of source control
before making significant changes to ValueMapper.cpp.
llvm-svn: 265282
|
|
|
|
|
|
|
| |
Each DISubprogram with isDefinition : true must
belong to a compile unit.
llvm-svn: 265281
|
|
|
|
|
|
|
|
| |
If a loop has no exiting blocks the region covering we use during
schedule genertion might not cover that loop properly. For now we bail
out as we would not optimize these loops anyway.
llvm-svn: 265280
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
RAUW support on MDNode usually requires an extra allocation for
ReplaceableMetadataImpl. This is only strictly necessary if there are
tracking references to the MDNode. Make the construction of
ReplaceableMetadataImpl lazy, so that we don't get allocations if we
don't need them.
Since MDNode::isResolved now checks MDNode::isTemporary and
MDNode::NumUnresolved instead of whether a ReplaceableMetadataImpl is
allocated, the internal changes are intrusive (at various internal
checkpoints, isResolved now has a different answer).
However, there should be no real functionality change here; just
slightly lazier allocation behaviour. The external semantics should be
identical.
llvm-svn: 265279
|
|
|
|
| |
llvm-svn: 265278
|
|
|
|
| |
llvm-svn: 265277
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This adds an assertion to maintain the property from r265273. When
Mapper::mapSimpleMetadata calls Mapper::mapValue, it should not find its
way back to mapMetadataImpl. This guarantees that mapSimpleMetadata is
not involved in any recursion.
Since Mapper::mapValue calls out to arbitrary materializers, we need to
save a bit on the ValueMap to make this assertion effective.
There should be no functionality change here. This co-recursion should
already have been impossible.
llvm-svn: 265276
|
|
|
|
|
|
| |
http://lab.llvm.org:8011/builders/sanitizer-windows/builds/19726
llvm-svn: 265275
|
|
|
|
| |
llvm-svn: 265274
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The main change is to delay materializing GlobalValue initializers from
Mapper::mapValue until Mapper::~Mapper. This effectively removes all
recursion from mapSimplifiedMetadata, as promised in r265270.
mapSimplifiedMetadata calls mapValue for ConstantAsMetadata nodes to
find the mapped constant, and now it shouldn't be possible for mapValue
to indirectly re-invoke mapMetadata. I'll add an assertion to that
effect in a follow-up (separated so that the assertion can easily be
reverted independently, if it comes to that).
This a step toward a broader goal: converting Mapper::mapMetadataImpl
from a recursive to an iterative algorithm.
When a BlockAddress points at a BasicBlock inside an unmaterialized
function body, we need to delay it until the function body is
materialized in Mapper::~Mapper. This commit creates a temporary
BasicBlock and returns a new BlockAddress, then RAUWs the BasicBlock
once it is known. This situation should be extremely rare since a
BlockAddress is usually used from within the function it's referencing
(and BlockAddress itself is rare).
There should be no observable functionality change.
llvm-svn: 265273
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This reverts commit r265260, as it caused the following 'make check-polly'
failures:
Polly :: ScopDetect/index_from_unpredictable_loop.ll
Polly :: ScopInfo/multiple_exiting_blocks.ll
Polly :: ScopInfo/multiple_exiting_blocks_two_loop.ll
Polly :: ScopInfo/schedule-const-post-dominator-walk-2.ll
Polly :: ScopInfo/schedule-const-post-dominator-walk.ll
Polly :: ScopInfo/switch-5.ll
llvm-svn: 265272
|
|
|
|
|
|
|
| |
Don't require TLI for SinkCmpExpression, like it wasn't before
r265264.
llvm-svn: 265271
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Split out a helper for mapping metadata without operands. This is any
metadata that is not an MDNode, and any MDNode where the answer is known
without looking at operands.
Through some weird twists, this function is co-recursive:
mapSimpleMetadata
=> MapValue
=> materializeInitFor
=> linkFunctionBody
=> RemapInstructions
=> MapMetadata
=> mapSimpleMetadata
I plan to break the recursion in a follow-up.
llvm-svn: 265270
|
|
|
|
| |
llvm-svn: 265269
|
|
|
|
|
|
|
| |
Remove a bunch of boilerplate from ValueMapper.cpp by using a new
file-local class called Mapper.
llvm-svn: 265268
|
|
|
|
| |
llvm-svn: 265267
|
|
|
|
|
|
|
|
|
|
| |
Add support for lowering with the MOVMSK instruction to extract vector element signbits to a GPR.
This is an early step towards more optimal handling of vector comparison results.
Differential Revision: http://reviews.llvm.org/D18741
llvm-svn: 265266
|
|
|
|
|
|
|
| |
The case where there was no TargetLowering was not handled,
leading to null pointer dereferences.
llvm-svn: 265265
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Sinking comparisons in CGP can undo the job of hoisting them done
earlier by LICM, and soft-FP makes this an expensive mistake.
A common pattern that produces floating point comparisons uniform
over a loop is an explicit check for division by zero. If the divisor
is hoisted out of the loop, the comparison can also be, but hoisting
the function that unwinds is never legal, since it may cause side
effects in the loop body prior to the unwinding to not be executed.
Differential Revision: http://reviews.llvm.org/D18744
llvm-svn: 265264
|
|
|
|
|
|
|
|
| |
Tidied up comments, stripped trailing whitespace, split apart nodes that aren't related.
No change in ordering although there is definitely some scope for it.
llvm-svn: 265263
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Floating point intrinsics in LLVM are generally not speculatively
executed, since most of them are defined to behave the same as libm
functions, which set errno.
However, the only error that can happen when executing ceil, floor,
nearbyint, rint and round libm functions per POSIX.1-2001 is -ERANGE,
and that requires the maximum value of the exponent to be smaller
than the number of mantissa bits, which is not the case with any of
the floating point types supported by LLVM.
The trunc and copysign functions never set errno per per POSIX.1-2001.
Differential Revision: http://reviews.llvm.org/D18643
llvm-svn: 265262
|
|
|
|
|
|
|
|
|
| |
If an exit PHI is written and also read in the SCoP we should not create two
SAI objects but only one. As the read is only modeled to ensure OpenMP code
generation knows about it we can simply use the EXIT_PHI MemoryKind for both
accesses.
llvm-svn: 265261
|
|
|
|
|
|
|
|
| |
If a loop has no exiting blocks the region covering we use during
schedule genertion might not cover that loop properly. For now we bail
out as we would not optimize these loops anyway.
llvm-svn: 265260
|
|
|
|
|
|
|
|
|
|
| |
Implemented load+{sign|zero}_extend for i1 vectors
Fixed failures in i1 vector load.
Covered loading of v2i1, v4i1, v8i1, v16i1, v32i1, v64i1 vectors for KNL and SKX.
Differential Revision: http://reviews.llvm.org/D18737
llvm-svn: 265259
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
So, there are some cases when the IR Linker produces a broken
module (which doesn't pass the verifier) and we end up asserting
inside the verifier. I think it's always a bug producing a module
which does not pass the verifier but there are some cases in which
people can live with the broken module (e.g. if only DebugInfo
metadata are broken). The gold plugin has something similar.
This commit is motivated by a situation I found in the
wild. It seems that somebody else discovered it independently
and reported in PR24923.
llvm-svn: 265258
|
|
|
|
|
|
|
| |
The test was failing on some release build because the basic block names
I was expecting weren't printed.
llvm-svn: 265257
|
|
|
|
|
|
|
|
| |
We already got this right, but it never hurts adding another
test, in case we'll change the handling in the future, to ensure
we don't break it.
llvm-svn: 265256
|
|
|
|
| |
llvm-svn: 265255
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently we create a file called .lto.bc. In UNIX,
ls(1) by default doesn't show up files starting with
a dot, as they're (only by convention) hidden.
This makes the output of -save-temps a little bit
hard to find. Use "a.out.lto.bc" instead if the
output file is not specified.
While here, change an existing -save-temps test to
exercise this more interesting behaviour.
llvm-svn: 265254
|