| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
In order to offloading work properly two things need to be in place:
- a descriptor with all the offloading information (device entry functions, and global variable) has to be created by the host and registered in the OpenMP offloading runtime library.
- all the device functions need to be emitted for the device and a convention has to be in place so that the runtime library can easily map the host ID of an entry point with the actual function in the device.
This patch adds support for these two things. However, only entry functions are being registered given that 'declare target' directive is not yet implemented.
About offloading descriptor:
The details of the descriptor are explained with more detail in http://goo.gl/L1rnKJ. Basically the descriptor will have fields that specify the number of devices, the pointers to where the device images begin and end (that will be defined by the linker), and also pointers to a the begin and end of table whose entries contain information about a specific entry point. Each entry has the type:
```
struct __tgt_offload_entry{
void *addr;
char *name;
int64_t size;
};
```
and will be implemented in a pre determined (ELF) section `.omp_offloading.entries` with 1-byte alignment, so that when all the objects are linked, the table is in that section with no padding in between entries (will be like a C array). The code generation ensures that all `__tgt_offload_entry` entries are emitted in the same order for both host and device so that the runtime can have the corresponding entries in both host and device in same index of the table, and efficiently implement the mapping.
The resulting descriptor is registered/unregistered with the runtime library using the calls `__tgt_register_lib` and `__tgt_unregister_lib`. The registration is implemented in a high priority global initializer so that the registration happens always before any initializer (that can potentially include target regions) is run.
The driver flag -omptargets= was created to specify a comma separated list of devices the user wants to support so that the new functionality can be exercised. Each device is specified with its triple.
About target codegen:
The target codegen is pretty much straightforward as it reuses completely the logic of the host version for the same target region. The tricky part is to identify the meaningful target regions in the device side. Unlike other programming models, like CUDA, there are no already outlined functions with attributes that mark what should be emitted or not. So, the information on what to emit is passed in the form of metadata in host bc file. This requires a new option to pass the host bc to the device frontend. Then everything is similar to what happens in CUDA: the global declarations emission is intercepted to check to see if it is an "interesting" declaration. The difference is that instead of checking an attribute, the metadata information in checked. Right now, there is only a form of metadata to pass information about the device entry points (target regions). A class `OffloadEntriesInfoManagerTy` was created to manage all the information and queries related with the metadata. The metadata looks like this:
```
!omp_offload.info = !{!0, !1, !2, !3, !4, !5, !6}
!0 = !{i32 0, i32 52, i32 77426347, !"_ZN2S12r1Ei", i32 479, i32 13, i32 4}
!1 = !{i32 0, i32 52, i32 77426347, !"_ZL7fstatici", i32 461, i32 11, i32 5}
!2 = !{i32 0, i32 52, i32 77426347, !"_Z9ftemplateIiET_i", i32 444, i32 11, i32 6}
!3 = !{i32 0, i32 52, i32 77426347, !"_Z3fooi", i32 99, i32 11, i32 0}
!4 = !{i32 0, i32 52, i32 77426347, !"_Z3fooi", i32 272, i32 11, i32 3}
!5 = !{i32 0, i32 52, i32 77426347, !"_Z3fooi", i32 127, i32 11, i32 1}
!6 = !{i32 0, i32 52, i32 77426347, !"_Z3fooi", i32 159, i32 11, i32 2}
```
The fields in each metadata entry are (in sequence):
Entry 1) an ID of the type of metadata - right now only zero is used meaning "OpenMP target region".
Entry 2) a unique ID of the device where the input source file that contain the target region lives.
Entry 3) a unique ID of the file where the input source file that contain the target region lives.
Entry 4) a mangled name of the function that encloses the target region.
Entries 5) and 6) line and column number where the target region was found.
Entry 7) is the order the entry was emitted.
Entry 2) and 3) are required to distinguish files that have the same function name.
Entry 4) is required to distinguish different instances of the same declaration (usually templated ones)
Entries 5) and 6) are required to distinguish the particular target region in body of the function (it is possible that a given target region is not an entry point - if clause can evaluate always to zero - and therefore we need to identify the "interesting" target regions. )
This patch replaces http://reviews.llvm.org/D12306.
Reviewers: ABataev, hfinkel, tra, rjmccall, sfantao
Subscribers: FBrygidyn, piotr.rak, Hahnfeld, cfe-commits
Differential Revision: http://reviews.llvm.org/D12614
llvm-svn: 256842
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
for types which are used as pointees in PointerUnions, PointerIntPairs,
and DenseMap pointer keys.
This is part of a series of patches to allow LLVM to check for complete
pointee types when computing its pointer traits. This is absolutely
necessary to get correct (or reproducible) results for things like how
many low bits are guaranteed to be zero.
I think this is the last patch for getting Clang clean here!!!
llvm-svn: 256615
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: Use clang-tidy to simplify boolean conditional return statements
Reviewers: alexfh
Subscribers: alexfh, cfe-commits
Patch by Richard Thomson!
Differential Revision: http://reviews.llvm.org/D10016
llvm-svn: 256496
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
[ Copied from https://llvm.org/bugs/show_bug.cgi?id=25597 ]
Clang support for DragonFly BSD is lagging a bit, resulting in poor
support for c++.
DragonFlyBSD is unique in that it has two base compilers. At the time
of the last Clang update for DragonFly, these compilers were GCC 4.4 and
GCC 4.7 (default).
With DragonFly Release 4.2, GCC 4.4 was replaced with GCC 5.0, partially
because the C++11 support of GCC 4.7 was incomplete. The DragonFly
project will Release version 4.4 soon.
This patch updates the Clang driver to use libstdc++ from GCC 5.2 The
support for falling back to the alternate compiler was removed for two
reasons:
1) The last release to use GCC 4.7 is DF 4.0 which has already reached EOL
2) GCC 4.7 libstdc++ is insufficient for many "ports"
Therefore, I think it is reasonable that the development version of
clang expects GCC 5.2 to be in place and not try to fall back to another
compiler.
The attached patch will do this. The Tools.cpp file was signficantly
modified to fix the linking which had been changed somewhere along the
line. The rest of the changes should be self-explanatory.
Reviewers: joerg, rsmith, davide
Subscribers: jrmarino, davide, cfe-commits
Differential Revision: http://reviews.llvm.org/D15166
llvm-svn: 256467
|
|
|
|
|
|
| |
r255281.
llvm-svn: 256396
|
|
|
|
|
|
|
|
|
|
|
|
| |
The /Brepro flag controls whether or not the compiler should embed
timestamps into the object file. Object files which do not embed
timestamps are not suitable for incremental linking but are suitable for
hermetic build systems and staged self-hosts of clang.
A normal clang spelling of this flag has been added,
-mincremental-linker-compatible.
llvm-svn: 256204
|
|
|
|
|
|
|
|
|
| |
Reapplies r256063, except instead of frugally re-using an LLVM enum,
we define a Clang enum, to avoid exposing too much LLVM interface.
Differential Revision: http://reviews.llvm.org/D15650
llvm-svn: 256078
|
|
|
|
| |
llvm-svn: 256066
|
|
|
|
|
|
| |
Differential Revision: http://reviews.llvm.org/D15650
llvm-svn: 256063
|
|
|
|
| |
llvm-svn: 255838
|
|
|
|
|
|
|
|
| |
Exclusion of /usr/include and /usr/local/include headers paths for MCU target.
Differential Revision: http://reviews.llvm.org/D14954
llvm-svn: 255766
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Clang-side cross-DSO CFI.
* Adds a command line flag -f[no-]sanitize-cfi-cross-dso.
* Links a runtime library when enabled.
* Emits __cfi_slowpath calls is bitset test fails.
* Emits extra hash-based bitsets for external CFI checks.
* Sets a module flag to enable __cfi_check generation during LTO.
This mode does not yet support diagnostics.
llvm-svn: 255694
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
The current default is to create the preamble on the first reparse, aka
second parse. This is useful for clients that do not want to block when
opening a file because serializing the preamble takes a bit of time.
However, this makes the reparse much more expensive and that may be on the
critical path as it's the first interaction a user has with the source code.
YouCompleteMe currently optimizes for the first code interaction by parsing
the file twice when loaded. That's just unnecessarily slow and this flag
helps to avoid that.
Reviewers: doug.gregor, klimek
Subscribers: cfe-commits
Differential Revision: http://reviews.llvm.org/D15490
llvm-svn: 255635
|
|
|
|
|
|
|
|
|
|
| |
nested) for all the platforms except PS4.
For PS4, generate explicit import for anonymous namespaces and mark it by DW_AT_artificial attribute.
Differential Revision: http://reviews.llvm.org/D12624
llvm-svn: 255281
|
|
|
|
|
|
|
|
| |
Module file extensions are likely to need access to
Sema/Preprocessor/ASTContext, and cannot get it through other
sources.
llvm-svn: 255065
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When attempting to map a source into a given level of macro expansion,
this code was ignoring the possibility that the start and end of the
range might take wildly different paths through the tree of macro
expansions. It was assuming that the begin spelling location would
always precede the end spelling location, which is false. A macro can
easily transpose its arguments.
This also fixes a related issue where there are extra macro arguments
between the begin location and the end location. In this situation, we
now highlight the entire macro invocation.
Pair programmed with Richard Smith.
Fixes PR12818.
llvm-svn: 254981
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Adds new option -fthinlto-index=<file> to invoke the LTO pipeline
along with function importing via clang using the supplied function
summary index file. This supports invoking the parallel ThinLTO
backend processes in a distributed build environment via clang.
Additionally, this causes the module linker to be invoked on the bitcode
file being compiled to perform any necessary promotion and renaming of
locals that are exported via the function summary index file.
Add a couple tests that confirm we get expected errors when we try to
use the new option on a file that isn't bitcode, or specify an invalid
index file. The tests also confirm that we trigger the expected function
import pass.
Depends on D15024
Reviewers: joker.eph, dexonsmith
Subscribers: joker.eph, davidxl, cfe-commits
Differential Revision: http://reviews.llvm.org/D15025
llvm-svn: 254927
|
|
|
|
|
|
|
| |
than reusing the "overridden buffer" mechanism. This will allow us to make
embedded files and overridden files behave differently in future.
llvm-svn: 254121
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This flag causes all files that were read by the compilation to be embedded
into a produced module file. This is useful for distributed build systems that
use an include scanning system to determine which files are "needed" by a
compilation, and only provide those files to remote compilation workers. Since
using a module can require any file that is part of that module (or anything it
transitively includes), files that are not found by an include scanner can be
required in a regular build using explicit modules. With this flag, only files
that are actually referenced by transitively-#included files are required to be
present on the build machine.
llvm-svn: 253950
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
(Re-apply patch after bug fixing)
This diff makes sure that the driver does not pass
-fomit-frame-pointer or -momit-leaf-frame-pointer to
the frontend when -pg is used. Currently, clang gives
an error if -fomit-frame-pointer is used in combination
with -pg, but -momit-leaf-frame-pointer was forgotten.
Also, disable frame pointer elimination in the frontend
when -pg is set.
Patch by Stefan Kempf.
llvm-svn: 253886
|
|
|
|
| |
llvm-svn: 253851
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This diff makes sure that the driver does not pass
-fomit-frame-pointer or -momit-leaf-frame-pointer to
the frontend when -pg is used. Currently, clang gives
an error if -fomit-frame-pointer is used in combination
with -pg, but -momit-leaf-frame-pointer was forgotten.
Also, disable frame pointer elimination in the frontend
when -pg is set.
Patch by Stefan Kempf.
llvm-svn: 253846
|
|
|
|
|
|
|
| |
Fixes crash when passing '-gmodules' in the compiler options.
rdar://23588717
llvm-svn: 253645
|
|
|
|
|
|
|
|
| |
FIXME: Attributes.inc may be an independent target.
Differential Revision: http://reviews.llvm.org/D14760
llvm-svn: 253554
|
|
|
|
|
|
|
|
|
|
|
|
| |
This provides both a more uniform interface and makes libclang behave like
clang tooling wrt relative paths against argv[0]. This is necessary for
finding paths to a c++ standard library relative to a clang binary given
in a compilation database. It can also be used to find paths relative to
libclang.so if the full path to it is passed in.
Differential Revision: http://reviews.llvm.org/D14695
llvm-svn: 253466
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently clang requires several additional command
line options in order to enable new features needed
during CUDA compilation. This patch makes these
options default.
* Automatically include cuda_runtime.h if we've found
a valid CUDA installation.
* Disable automatic CUDA header inclusion during unit tests.
* Added test case for command line construction.
* Enabled target overloads and relaxed call checks that are
needed in order to include CUDA headers.
* Added CUDA-7.5 installation path to the CUDA installation search list.
* Define __CUDA__ macro to indicate CUDA compilation.
llvm-svn: 253389
|
|
|
|
|
|
|
|
|
| |
This reverts commit r253269.
This leads to assert / segfault triggering on the following reduced example:
float foo(float U, float base, float cell) { return (U = 2 * base) - cell; }
llvm-svn: 253337
|
|
|
|
|
|
| |
Differential Revision: D14200
llvm-svn: 253269
|
|
|
|
| |
llvm-svn: 253178
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This failed to solve the problem it was aimed at, and introduced just as many
issues as it resolved. Realistically, we need to deal with the possibility that
multiple modules might define different internal linkage symbols with the same
name, and this isn't a problem unless two such symbols are simultaneously
visible.
The case where two modules define equivalent internal linkage symbols is
handled by r252063: if lookup finds multiple sufficiently-similar entities from
different modules, we just pick one of them as an extension (but we keep them
separate).
llvm-svn: 252957
|
|
|
|
|
|
| |
This was an accidental regression from the MRC __weak patch.
llvm-svn: 252668
|
|
|
|
|
|
| |
Differential Revision: http://reviews.llvm.org/D14394
llvm-svn: 252501
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The -meabi flag to control LLVM EABI version.
Without '-meabi' or with '-meabi default' imply LLVM triple default.
With '-meabi gnu' sets EABI GNU.
With '-meabi 4' or '-meabi 5' set EABI version 4 and 5 respectively.
A similar patch was introduced in LLVM.
Patch by Vinicius Tinti.
llvm-svn: 252463
|
|
|
|
|
|
| |
rdar://problem/23415863
llvm-svn: 252187
|
|
|
|
| |
llvm-svn: 252128
|
|
|
|
|
|
|
|
|
| |
we can't load that file due to a configuration mismatch, and implicit module
building is disabled, and the user turns off the error-by-default warning for
that situation, then fall back to textual inclusion for the module rather than
giving an error if any of its headers are included.
llvm-svn: 252114
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Introduce the notion of a module file extension, which introduces
additional information into a module file at the time it is built that
can then be queried when the module file is read. Module file
extensions are identified by a block name (which must be unique to the
extension) and can write any bitstream records into their own
extension block within the module file. When a module file is loaded,
any extension blocks are matched up with module file extension
readers, that are per-module-file and are given access to the input
bitstream.
Note that module file extensions can only be introduced by
programmatic clients that have access to the CompilerInvocation. There
is only one such extension at the moment, which is used for testing
the module file extension harness. As a future direction, one could
imagine allowing the plugin mechanism to introduce new module file
extensions.
llvm-svn: 251955
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
A 'readonly' Objective-C property declared in the primary class can
effectively be shadowed by a 'readwrite' property declared within an
extension of that class, so long as the types and attributes of the
two property declarations are compatible.
Previously, this functionality was implemented by back-patching the
original 'readonly' property to make it 'readwrite', destroying source
information and causing some hideously redundant, incorrect
code. Simplify the implementation to express how this should actually
be modeled: as a separate property declaration in the extension that
shadows (via the name lookup rules) the declaration in the primary
class. While here, correct some broken Fix-Its, eliminate a pile of
redundant code, clean up the ARC migrator's handling of properties
declared in extensions, and fix debug info's naming of methods that
come from categories.
A wonderous side effect of doing this write is that it eliminates the
"AddedObjCPropertyInClassExtension" method from the AST mutation
listener, which in turn eliminates the last place where we rewrite
entire declarations in a chained PCH file or a module file. This
change (which fixes rdar://problem/18475765) will allow us to
eliminate the rewritten-decls logic from the serialization library,
and fixes a crash (rdar://problem/23247794) illustrated by the
test/PCH/chain-categories.m example.
llvm-svn: 251874
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This came up in a boost build, which apparently uses PTH. This was
broken in r187619 when we migrated it to uses llvm::fs instead of raw
stat calls.
Constructing a test case with a hash table collision in-tree is tough.
Instead, I have a pending change to OnDiskChainedHashTable that asserts
that the reported length of the data agrees with the data actually
written. All of the existing in-tree tests find the bug with this
assert.
llvm-svn: 251828
|
|
|
|
|
|
|
| |
This reduces the number of .cpp files needed to be rebuilt after
touching OnDiskHashTable from 120 to 21 for me.
llvm-svn: 251810
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Linking options for particular file depend on the option that specifies the file.
Currently there are two:
* -mlink-bitcode-file links in complete content of the specified file.
* -mlink-cuda-bitcode links in only the symbols needed by current TU.
Linked symbols are internalized. This bitcode linking mode is used to
link device-specific bitcode provided by CUDA.
Files are linked in order they are specified on command line.
-mlink-cuda-bitcode replaces -fcuda-uses-libdevice flag.
Differential Revision: http://reviews.llvm.org/D13913
llvm-svn: 251427
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously, __weak was silently accepted and ignored in MRC mode.
That makes this a potentially source-breaking change that we have to
roll out cautiously. Accordingly, for the time being, actual support
for __weak references in MRC is experimental, and the compiler will
reject attempts to actually form such references. The intent is to
eventually enable the feature by default in all non-GC modes.
(It is, of course, incompatible with ObjC GC's interpretation of
__weak.)
If you like, you can enable this feature with
-Xclang -fobjc-weak
but like any -Xclang option, this option may be removed at any point,
e.g. if/when it is eventually enabled by default.
This patch also enables the use of the ARC __unsafe_unretained qualifier
in MRC. Unlike __weak, this is being enabled immediately. Since
variables are essentially __unsafe_unretained by default in MRC,
the only practical uses are (1) communication and (2) changing the
default behavior of by-value block capture.
As an implementation matter, this means that the ObjC ownership
qualifiers may appear in any ObjC language mode, and so this patch
removes a number of checks for getLangOpts().ObjCAutoRefCount
that were guarding the processing of these qualifiers. I don't
expect this to be a significant drain on performance; it may even
be faster to just check for these qualifiers directly on a type
(since it's probably in a register anyway) than to do N dependent
loads to grab the LangOptions.
rdar://9674298
llvm-svn: 251041
|
|
|
|
|
|
| |
the implementation is incomplete.
llvm-svn: 250982
|
|
|
|
|
|
|
| |
Add -fcoroutines flag (just for -cc1 for now) to enable the feature. Early
indications are that this will be part of -std=c++1z.
llvm-svn: 250980
|
|
|
|
|
|
|
|
| |
size when it really expects a StringRef and a normally optional bool argument.
The pointer was being implicitly converted to a StringRef and the size was being passed into the bool. Since the bool has a default value normally, no one noticed that the wrong number of arguments was given.
llvm-svn: 250977
|
|
|
|
| |
llvm-svn: 250976
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The ELF symbol visibilities are:
- internal: Not visibile across DSOs, cannot pass address across DSOs
- hidden: Not visibile across DSOs, can be called indirectly
- default: Usually visible across DSOs, possibly interposable
- protected: Visible across DSOs, not interposable
LLVM only supports the latter 3 visibilities. Internal visibility is in
theory useful, as it allows you to assume that the caller is maintaining
a PIC register for you in %ebx, or in some other pre-arranged location.
As far as LLVM is concerned, this isn't worth the trouble. Using hidden
visibility is always correct, so we can just do that.
Resolves PR9183.
llvm-svn: 250954
|
|
|
|
|
|
|
|
|
|
| |
Summary: It breaks the build for the ASTMatchers
Subscribers: klimek, cfe-commits
Differential Revision: http://reviews.llvm.org/D13893
llvm-svn: 250827
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: Replace empty bodies of default constructors and destructors with '= default'.
Reviewers: bkramer, klimek
Subscribers: klimek, alexfh, cfe-commits
Differential Revision: http://reviews.llvm.org/D13890
llvm-svn: 250822
|
|
|
|
|
|
| |
Reported by: Kim Grasman!
llvm-svn: 250605
|