summaryrefslogtreecommitdiffstats
path: root/clang/lib/CodeGen/CodeGenModule.cpp
Commit message (Collapse)AuthorAgeFilesLines
...
* [CUDA] Add -fcuda-flush-denormals-to-zero.Justin Lebar2016-04-051-0/+8
| | | | | | | | | | | | | | | | | | Summary: Setting this flag causes all functions are annotated with the "nvvm-f32ftz" = "true" attribute. In addition, we annotate the module with "nvvm-reflect-ftz" set to 0 or 1, depending on whether -cuda-flush-denormals-to-zero is set. This is read by the NVVMReflect pass. Reviewers: tra, rnk Subscribers: cfe-commits Differential Revision: http://reviews.llvm.org/D18671 llvm-svn: 265435
* Fix serialization/deserialization for __uuidofDavid Majnemer2016-03-281-1/+1
| | | | | | | | I broke this back in r264529 because I forgot to serialize the UuidAttr member. Fix this by replacing the UuidAttr with a StringRef which is properly serialized and deserialized. llvm-svn: 264562
* Use the correct alignment for uuid descriptorsDavid Majnemer2016-03-271-2/+2
| | | | | | | The _GUID_ descriptors emitted by MSVC have alignment 8 for 64-bit builds: we should do the same if the linker picks the "wrong" COMDAT. llvm-svn: 264530
* Improve the representation of CXXUuidofExprDavid Majnemer2016-03-271-1/+1
| | | | | | | | Keep a pointer to the UuidAttr that the CXXUuidofExpr corresponds to. This makes translating from __uuidof to the underlying constant a lot more straightforward. llvm-svn: 264529
* Attach profile summary information to Module.Easwaran Raman2016-03-241-0/+1
| | | | | | Differential Revision: http://reviews.llvm.org/D18289 llvm-svn: 264342
* CodeGen: Mark functions used in vtables as unnamed_addr.Peter Collingbourne2016-03-141-6/+6
| | | | | | | | | | | This marks virtual function declarations, as well as runtime library functions __cxa_pure_virtual, __cxa_deleted_virtual and _purecall, as unnamed_addr. This will allow us to correctly form relative references to them from vtables in the relative vtable ABI. Differential Revision: http://reviews.llvm.org/D18071 llvm-svn: 263464
* [OPENMP 4.0] Codegen for 'declare reduction' construct.Alexey Bataev2016-03-041-4/+11
| | | | | | | Emit function for 'combiner' part of 'declare reduction' construct and 'initialilzer' part, if any. llvm-svn: 262699
* [OPENMP 4.0] Initial support for 'omp declare reduction' construct.Alexey Bataev2016-03-031-0/+4
| | | | | | | | | | | | | | | | | Add parsing, sema analysis and serialization/deserialization for 'declare reduction' construct. User-defined reductions are defined as #pragma omp declare reduction( reduction-identifier : typename-list : combiner ) [initializer ( initializer-expr )] These custom reductions may be used in 'reduction' clauses of OpenMP constructs. The combiner specifies how partial results can be combined into a single value. The combiner can use the special variable identifiers omp_in and omp_out that are of the type of the variables being reduced with this reduction-identifier. Each of them will denote one of the values to be combined before executing the combiner. It is assumed that the special omp_out identifier will refer to the storage that holds the resulting combined value after executing the combiner. As the initializer-expr value of a user-defined reduction is not known a priori the initializer-clause can be used to specify one. Then the contents of the initializer-clause will be used as the initializer for private copies of reduction list items where the omp_priv identifier will refer to the storage to be initialized. The special identifier omp_orig can also appear in the initializer-clause and it will refer to the storage of the original variable to be reduced. Differential Revision: http://reviews.llvm.org/D11182 llvm-svn: 262582
* [PGO] Change profile use cc1 option to handle IR level profilesRong Xu2016-03-021-4/+4
| | | | | | | | | | | | | | | | | | This patch changes cc1 option for PGO profile use from -fprofile-instr-use=<path> to -fprofile-instrument-use-path=<path>. -fprofile-instr-use=<path> is now a driver only option. In addition to decouple the cc1 option from the driver level option, this patch also enables IR level profile use. cc1 option handling now reads the profile header and sets CodeGenOpt ProfileUse (valid values are {None, Clang, LLVM} -- this is a common enum for -fprofile-instrument={}, for the profile instrumentation), and invoke the pipeline to enable the respective PGO use pass. Reviewers: silvas, davidxl Differential Revision: http://reviews.llvm.org/D17737 llvm-svn: 262515
* Serialize `#pragma detect_mismatch`.Nico Weber2016-03-021-0/+6
| | | | | | | This is like r262493, but for pragma detect_mismatch instead of pragma comment. The two pragmas have similar behavior, so use the same approach for both. llvm-svn: 262506
* [CUDA] Emit host-side 'shadows' for device-side global variablesArtem Belevich2016-03-021-12/+52
| | | | | | | | | | | | | ... and register them with CUDA runtime. This is needed for commonly used cudaMemcpy*() APIs that use address of host-side shadow to access their counterparts on device side. Fixes PR26340 Differential Revision: http://reviews.llvm.org/D17779 llvm-svn: 262498
* Serialize `#pragma comment`.Nico Weber2016-03-021-0/+19
| | | | | | | | | | | | | | `#pragma comment` was handled by Sema calling a function on ASTConsumer, and CodeGen then implementing this function and writing things to its output. Instead, introduce a PragmaCommentDecl AST node and hang one off the TranslationUnitDecl for every `#pragma comment` line, and then use the regular serialization machinery. (Since PragmaCommentDecl has codegen relevance, it's eagerly deserialized.) http://reviews.llvm.org/D17799 llvm-svn: 262493
* Add whole-program vtable optimization feature to Clang.Peter Collingbourne2016-02-241-1/+3
| | | | | | | | | This patch introduces the -fwhole-program-vtables flag, which enables the whole-program vtable optimization feature (D16795) in Clang. Differential Revision: http://reviews.llvm.org/D16821 llvm-svn: 261767
* Re-apply for the 2nd-time r259977 - [OpenMP] Reorganize code to allow ↵Samuel Antao2016-02-081-1/+15
| | | | | | | | specialized code generation for different devices. This was reverted by r260036, but was not the cause of the problem in the buildbot. llvm-svn: 260106
* Revert "Re-apply r259977 - [OpenMP] Reorganize code to allow specialized ↵Renato Golin2016-02-071-15/+1
| | | | | | | | code generation for different devices." This reverts commit r259985, as it still fails one buildbot. llvm-svn: 260036
* Re-apply r259977 - [OpenMP] Reorganize code to allow specialized code ↵Samuel Antao2016-02-061-1/+15
| | | | | | | | generation for different devices. This was reverted due to a failure in a buildbot, but it turned out the failure was unrelated. llvm-svn: 259985
* Revert r259977 - [OpenMP] Reorganize code to allow specialized code ↵Samuel Antao2016-02-061-15/+1
| | | | | | | | generation for different devices. It triggered some problem in the configuration related with zlib and exposed in the driver. llvm-svn: 259984
* [OpenMP] Reorganize code to allow specialized code generation for different ↵Samuel Antao2016-02-061-1/+15
| | | | | | | | | | | | | | | | | | | | | | | devices. Summary: Different devices may in some cases require different code generation schemes in order to implement OpenMP. This is required not only for performance reasons, but also because it may not be possible to have the current (default) implementation working for these devices. E.g. GPU's cannot implement the same scheme a target such as powerpc or x86b would use, in the sense that it does not have the ability to fork threads, instead all the threads are always executing and need to be managed by the implementation. This patch proposes a reorganization of the code in the OpenMP code generation to pave the way to have specialized implementation of OpenMP support. More than a "real" patch this is more a request for comments in order to understand if what is proposed is acceptable or if there are better/easier ways to do it. In this patch part of the common OpenMP codegen infrastructure is moved to a new file under a new namespace (CGOpenMPCommon) so it can be shared between the default implementation and the specialized one. When CGOpenMPRuntime is created, an attempt to select a specialized implementation is done. In the patch a specialization for nvptx targets is done which currently checks if the target is an OpenMP device and trap if it is not. Let me know comments suggestions you may have. Reviewers: hfinkel, carlo.bertolli, arpith-jacob, kkwli0, ABataev Subscribers: Hahnfeld, cfe-commits, fraggamuffin, caomhin, jholewinski Differential Revision: http://reviews.llvm.org/D16784 llvm-svn: 259977
* [cfi] Safe handling of unaddressable vtable pointers (clang).Evgeniy Stepanov2016-02-031-0/+25
| | | | | | | | | | | Avoid crashing when printing diagnostics for vtable-related CFI errors. In diagnostic mode, the frontend does an additional check of the vtable pointer against the set of all known vtable addresses and lets the runtime handler know if it is safe to inspect the vtable. http://reviews.llvm.org/D16823 llvm-svn: 259716
* [CUDA] Do not allow dynamic initialization of global device side variables.Artem Belevich2016-02-021-11/+6
| | | | | | | | | | | | | | In general CUDA does not allow dynamic initialization of global device-side variables. One exception is that CUDA allows records with empty constructors as described in section E2.2.1 of CUDA 7.5 Programming guide. This patch applies initializer checks for all device-side variables. Empty constructors are accepted, but no code is generated for them. Differential Revision: http://reviews.llvm.org/D15305 llvm-svn: 259592
* Move DebugInfoKind into its own header to cut the cyclic dependency edge ↵Benjamin Kramer2016-02-021-6/+5
| | | | | | from Driver to Frontend. llvm-svn: 259489
* Use a consistent spelling for vtables.Eric Christopher2016-01-291-3/+3
| | | | llvm-svn: 259137
* [MS ABI] Allow a member pointers' converted type to changeDavid Majnemer2016-01-261-0/+5
| | | | | | | | | | | | | | | | | | | | | | | | Member pointers in the MS ABI are tricky for a variety of reasons. The size of a member pointer is indeterminate until the program reaches a point where the representation is required to be known. However, *pointers* to member pointers may exist without knowing the pointee type's representation. In these cases, we synthesize an opaque LLVM type for the pointee type. However, we can be in a situation where the underlying member pointer's representation became known mid-way through the program. To account for this, we attempted to manicure CodeGen's type-cache so that we can replace the opaque member pointer type with the real deal while leaving the pointer types unperturbed. This, unfortunately, is a problematic approach to take as we will violate CodeGen's invariants. These violations are mostly harmless but let's do the right thing instead: invalidate the type-cache if a member pointer's LLVM representation changes. This fixes PR26313. llvm-svn: 258839
* [cfi] Cross-DSO CFI diagnostic mode (clang part)Evgeniy Stepanov2016-01-251-0/+2
| | | | | | | | | | | | | | * Runtime diagnostic data for cfi-icall changed to match the rest of cfi checks * Layout of all CFI diagnostic data changed to put Kind at the beginning. There is no ABI stability promise yet. * Call cfi_slowpath_diag instead of cfi_slowpath when needed. * Emit __cfi_check_fail function, which dispatches a CFI check faliure according to trap/recover settings of the current module. * A tiny driver change to match the way the new handlers are done in compiler-rt. llvm-svn: 258745
* [CUDA] Don't generate aliases for static extern "C" functions.Justin Lebar2016-01-251-0/+4
| | | | | | | | | | | | | | Summary: These aliases are done to support inline asm, but there's nothing we can do: NVPTX doesn't support aliases. Reviewers: tra Subscribers: cfe-commits, jhen, echristo Differential Revision: http://reviews.llvm.org/D16501 llvm-svn: 258734
* Introduce -fsanitize-stats flag.Peter Collingbourne2016-01-161-0/+9
| | | | | | | | | This is part of a new statistics gathering feature for the sanitizers. See clang/docs/SanitizerStats.rst for further info and docs. Differential Revision: http://reviews.llvm.org/D16175 llvm-svn: 257971
* PR25910: clang allows two var definitions with the same mangled nameAndrey Bokhanko2016-01-141-27/+86
| | | | | | | | | | | Proper diagnostic and resolution of mangled names' conflicts in variables. When there is a declaration and a definition using the same name but different types, we emit what is in the definition. When there are two conflicting definitions, we issue an error. Differential Revision: http://reviews.llvm.org/D15686 llvm-svn: 257754
* [MS ABI] Complete and base constructor GlobalDecls must have the same nameDavid Majnemer2016-01-081-1/+14
| | | | | | | | | | | | | | | | | Clang got itself into the situation where we mangled the same constructor twice with two different constructor types. After one of the constructors were utilized, the tag used for one of the types changed from class to struct because a class template became complete. This resulted in one of the constructor types varying from the other constructor. Instead, force "base" constructor types to "complete" if the ABI doesn't have constructor variants. This will ensure that GlobalDecls for both variants will get the same mangled name. This fixes PR26029. llvm-svn: 257205
* [Driver] Add support for -fno-builtin-foo options.Chad Rosier2016-01-061-1/+2
| | | | | | | Addresses PR4941 and rdar://6756912. http://reviews.llvm.org/D15195 llvm-svn: 256937
* [OpenMP] Reapply rL256842: [OpenMP] Offloading descriptor registration and ↵Samuel Antao2016-01-061-0/+12
| | | | | | | | | | | | device codegen. This patch attempts to fix the regressions identified when the patch was committed initially. Thanks to Michael Liao for identifying the fix in the offloading metadata generation related with side effects in evaluation of function arguments. llvm-svn: 256933
* [OpenMP] Revert rL256842: [OpenMP] Offloading descriptor registration and ↵Samuel Antao2016-01-051-12/+0
| | | | | | | | device codegen. It was causing two regression, so I'm reverting until the cause is found. llvm-svn: 256858
* [OpenMP] Offloading descriptor registration and device codegen.Samuel Antao2016-01-051-0/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: In order to offloading work properly two things need to be in place: - a descriptor with all the offloading information (device entry functions, and global variable) has to be created by the host and registered in the OpenMP offloading runtime library. - all the device functions need to be emitted for the device and a convention has to be in place so that the runtime library can easily map the host ID of an entry point with the actual function in the device. This patch adds support for these two things. However, only entry functions are being registered given that 'declare target' directive is not yet implemented. About offloading descriptor: The details of the descriptor are explained with more detail in http://goo.gl/L1rnKJ. Basically the descriptor will have fields that specify the number of devices, the pointers to where the device images begin and end (that will be defined by the linker), and also pointers to a the begin and end of table whose entries contain information about a specific entry point. Each entry has the type: ``` struct __tgt_offload_entry{ void *addr; char *name; int64_t size; }; ``` and will be implemented in a pre determined (ELF) section `.omp_offloading.entries` with 1-byte alignment, so that when all the objects are linked, the table is in that section with no padding in between entries (will be like a C array). The code generation ensures that all `__tgt_offload_entry` entries are emitted in the same order for both host and device so that the runtime can have the corresponding entries in both host and device in same index of the table, and efficiently implement the mapping. The resulting descriptor is registered/unregistered with the runtime library using the calls `__tgt_register_lib` and `__tgt_unregister_lib`. The registration is implemented in a high priority global initializer so that the registration happens always before any initializer (that can potentially include target regions) is run. The driver flag -omptargets= was created to specify a comma separated list of devices the user wants to support so that the new functionality can be exercised. Each device is specified with its triple. About target codegen: The target codegen is pretty much straightforward as it reuses completely the logic of the host version for the same target region. The tricky part is to identify the meaningful target regions in the device side. Unlike other programming models, like CUDA, there are no already outlined functions with attributes that mark what should be emitted or not. So, the information on what to emit is passed in the form of metadata in host bc file. This requires a new option to pass the host bc to the device frontend. Then everything is similar to what happens in CUDA: the global declarations emission is intercepted to check to see if it is an "interesting" declaration. The difference is that instead of checking an attribute, the metadata information in checked. Right now, there is only a form of metadata to pass information about the device entry points (target regions). A class `OffloadEntriesInfoManagerTy` was created to manage all the information and queries related with the metadata. The metadata looks like this: ``` !omp_offload.info = !{!0, !1, !2, !3, !4, !5, !6} !0 = !{i32 0, i32 52, i32 77426347, !"_ZN2S12r1Ei", i32 479, i32 13, i32 4} !1 = !{i32 0, i32 52, i32 77426347, !"_ZL7fstatici", i32 461, i32 11, i32 5} !2 = !{i32 0, i32 52, i32 77426347, !"_Z9ftemplateIiET_i", i32 444, i32 11, i32 6} !3 = !{i32 0, i32 52, i32 77426347, !"_Z3fooi", i32 99, i32 11, i32 0} !4 = !{i32 0, i32 52, i32 77426347, !"_Z3fooi", i32 272, i32 11, i32 3} !5 = !{i32 0, i32 52, i32 77426347, !"_Z3fooi", i32 127, i32 11, i32 1} !6 = !{i32 0, i32 52, i32 77426347, !"_Z3fooi", i32 159, i32 11, i32 2} ``` The fields in each metadata entry are (in sequence): Entry 1) an ID of the type of metadata - right now only zero is used meaning "OpenMP target region". Entry 2) a unique ID of the device where the input source file that contain the target region lives. Entry 3) a unique ID of the file where the input source file that contain the target region lives. Entry 4) a mangled name of the function that encloses the target region. Entries 5) and 6) line and column number where the target region was found. Entry 7) is the order the entry was emitted. Entry 2) and 3) are required to distinguish files that have the same function name. Entry 4) is required to distinguish different instances of the same declaration (usually templated ones) Entries 5) and 6) are required to distinguish the particular target region in body of the function (it is possible that a given target region is not an entry point - if clause can evaluate always to zero - and therefore we need to identify the "interesting" target regions. ) This patch replaces http://reviews.llvm.org/D12306. Reviewers: ABataev, hfinkel, tra, rjmccall, sfantao Subscribers: FBrygidyn, piotr.rak, Hahnfeld, cfe-commits Differential Revision: http://reviews.llvm.org/D12614 llvm-svn: 256842
* Attach maximum function count to Module when using PGO mode.Easwaran Raman2015-12-171-2/+5
| | | | | | | | This sets the maximum entry count among all functions in the program to the module using module flags. This allows the optimizer to use this information. Differential Revision: http://reviews.llvm.org/D15163 llvm-svn: 255918
* Cross-DSO control flow integrity (Clang part).Evgeniy Stepanov2015-12-151-18/+84
| | | | | | | | | | | | | | Clang-side cross-DSO CFI. * Adds a command line flag -f[no-]sanitize-cfi-cross-dso. * Links a runtime library when enabled. * Emits __cfi_slowpath calls is bitset test fails. * Emits extra hash-based bitsets for external CFI checks. * Sets a module flag to enable __cfi_check generation during LTO. This mode does not yet support diagnostics. llvm-svn: 255694
* [WinEH] Update clang to use operand bundles on call sitesDavid Majnemer2015-12-151-2/+7
| | | | | | | | | | | This updates clang to use bundle operands to associate an invoke with the funclet which it is contained within. Depends on D15517. Differential Revision: http://reviews.llvm.org/D15518 llvm-svn: 255675
* Revert r254647.Easwaran Raman2015-12-121-3/+0
| | | | | | | | Reason: The testcase fails in many architectures. Differential Revision: http://reviews.llvm.org/D15163 llvm-svn: 255416
* Attach maximum function count to Module when using PGO modeEaswaran Raman2015-12-121-0/+3
| | | | | | | | | This sets the maximum entry count among all functions in the program to the module using module flags. This allows the optimizer to use this information. Differential Revision: http://reviews.llvm.org/D15163 llvm-svn: 255397
* Revert "[x86] Exclusion of incorrect include headers paths for MCU target"Reid Kleckner2015-12-051-52/+17
| | | | | | | | | | This reverts commit r254195. From the description, I suspect that the wrong patch was committed here, and this is causing assertion failures in EmitDeferred() when the global value ends up being a bitcast of a global. llvm-svn: 254823
* Add the `pass_object_size` attribute to clang.George Burgess IV2015-12-021-2/+5
| | | | | | | | | | | | | `pass_object_size` is our way of enabling `__builtin_object_size` to produce high quality results without requiring inlining to happen everywhere. A link to the design doc for this attribute is available at the Differential review link below. Differential Revision: http://reviews.llvm.org/D13263 llvm-svn: 254554
* Fix use-after-free when a C++ thread_local variable gets replaced (because itsRichard Smith2015-12-011-2/+2
| | | | | | | type changes when the initializer is attached). Don't hold onto the GlobalVariable*; recompute it from the VarDecl* instead. llvm-svn: 254359
* [x86] Exclusion of incorrect include headers paths for MCU targetAndrey Bokhanko2015-11-271-17/+52
| | | | | | | | Exclusion of /usr/include and /usr/local/include headers paths for MCU target. Differential Revision: http://reviews.llvm.org/D14954 llvm-svn: 254195
* [TLS on Darwin] treat all Darwin platforms in the same way.Manman Ren2015-11-111-1/+1
| | | | | | rdar://problem/9001553 llvm-svn: 252820
* Extract out a function onto CodeGenModule for getting the map ofEric Christopher2015-11-111-0/+29
| | | | | | | features for a particular function, then use it to clean up some code. llvm-svn: 252819
* [TLS on Darwin] change how we handle globals with linkonce or weak linkage.Manman Ren2015-11-111-4/+9
| | | | | | | | | | | | This is about how we handle static member of a template. Before this commit, we use internal linkage for the IR thread-local variable, which is inefficient. With this commit, we will start to follow Itanium C++ ABI. rdar://problem/23415206 Reviewed by John McCall. llvm-svn: 252814
* CodeGen: Remove implicit ilist iterator conversions, NFCDuncan P. N. Exon Smith2015-11-061-1/+2
| | | | | | | Make ilist iterator conversions explicit in clangCodeGen. Eventually I'll remove them everywhere. llvm-svn: 252358
* Fix crash in EmitDeclMetadata modeKeno Fischer2015-11-051-2/+4
| | | | | | | | | | | | | | | | | Summary: This fixes a bug that's easily encountered in LLDB (https://llvm.org/bugs/show_bug.cgi?id=22875). The problem here is that we mangle a name during debug info emission, but never actually emit the actual Decl, so we run into problems in EmitDeclMetadata (which assumes such a Decl exists). Fix that by just skipping metadata emissions for mangled names that don't have associated Decls. Reviewers: rjmccall Subscribers: labath, cfe-commits Differential Revision: http://reviews.llvm.org/D13959 llvm-svn: 252229
* Watch and TV OS: wire up basic ABI choicesTim Northover2015-10-301-0/+2
| | | | | | | This sets the mostly expected Darwin default ABI options for these two platforms. Active changes from these defaults for watchOS are in a later patch. llvm-svn: 251708
* Unify the ObjC entrypoint caches.John McCall2015-10-211-7/+5
| | | | llvm-svn: 250918
* [CodeGen] Remove dead code. NFC.Benjamin Kramer2015-10-151-6/+0
| | | | llvm-svn: 250418
* [CodeGen] [CodeGen] Attach function attributes to functions created inAkira Hatanaka2015-10-081-4/+5
| | | | | | | | | | | | | | | | | | CGBlocks.cpp. This commit fixes a bug in clang's code-gen where it creates the following functions but doesn't attach function attributes to them: __copy_helper_block_ __destroy_helper_block_ __Block_byref_object_copy_ __Block_byref_object_dispose_ rdar://problem/20828324 Differential Revision: http://reviews.llvm.org/D13525 llvm-svn: 249735
OpenPOWER on IntegriCloud