path: root/clang/lib/CodeGen
Commit message | Author | Age | Files | Lines
...
* [cfi] Cross-DSO CFI diagnostic mode (clang part)Evgeniy Stepanov2016-01-254-46/+154
| | | | | | | | | | | | | | * Runtime diagnostic data for cfi-icall changed to match the rest of cfi checks * Layout of all CFI diagnostic data changed to put Kind at the beginning. There is no ABI stability promise yet. * Call cfi_slowpath_diag instead of cfi_slowpath when needed. * Emit __cfi_check_fail function, which dispatches a CFI check failure according to trap/recover settings of the current module. * A tiny driver change to match the way the new handlers are done in compiler-rt. llvm-svn: 258745
* Update comments to match the implementation.Manman Ren2016-01-251-0/+1
| | | | llvm-svn: 258735
* [CUDA] Don't generate aliases for static extern "C" functions.Justin Lebar2016-01-251-0/+4
| | | | | | | | | | | | | | Summary: These aliases are done to support inline asm, but there's nothing we can do: NVPTX doesn't support aliases. Reviewers: tra Subscribers: cfe-commits, jhen, echristo Differential Revision: http://reviews.llvm.org/D16501 llvm-svn: 258734
* [PGO] Windows buildbot failure fix. [NFC]Betul Buyukkurt2016-01-241-2/+3
| | | | llvm-svn: 258652
* Clang changes for value profilingBetul Buyukkurt2016-01-233-6/+94
| | | | | | Differential Revision: http://reviews.llvm.org/D8940 llvm-svn: 258650
* [CUDA] Make printf work.Justin Lebar2016-01-234-0/+137
| | | | | | | | | | | | | | | | | | Summary: The code in CGCUDACall is largely based on a patch written by Eli Bendersky: http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20140324/210218.html That patch implemented an LLVM pass lowering printf to vprintf; this one does something similar, but in Clang codegen. Reviewers: echristo Subscribers: cfe-commits, jhen, tra, majnemer Differential Revision: http://reviews.llvm.org/D16372 llvm-svn: 258642
* [cfi] Do not emit bit set entry for available_externally vtables.Evgeniy Stepanov2016-01-231-1/+2
| | | | | | | | In the Itanium ABI, a vtable may be emitted speculatively as an available_externally global. Such a vtable may not be present at link time and should not have a corresponding CFI bit set entry. llvm-svn: 258596
* Module Debugging: Canonicalize the file names used as PCH module namesAdrian Prantl2016-01-221-4/+8
| | | | | | | | | | | by stripping the path. Follow-up to r258555. This is safe because only one PCH per CU is currently supported for module debugging. rdar://problem/24301262 llvm-svn: 258582
* AMDGPU: Rename builtins to use amdgcn prefixMatt Arsenault2016-01-221-25/+39
| | | | | | | | | Keep the ones still used by libclc around for now. Emit the new amdgcn intrinsic name if not targeting r600, in which case the old AMDGPU name is still used. llvm-svn: 258560
* Module debugging: Create a parent DIModule with the PCH name for typesAdrian Prantl2016-01-223-1/+21
| | | | | | | | | emitted into a precompiled header to mirror the debug info emitted for object files importing the PCH. rdar://problem/24290667 llvm-svn: 258555
* Fix the build by using the correct suffix for 64 bit literalsAdrian Prantl2016-01-222-2/+2
| | | | llvm-svn: 258531
* Fix a typo in r258507 and change the PCH dwoid constant to ~1UL.Adrian Prantl2016-01-222-2/+2
| | | | | | rdar://problem/24290667 llvm-svn: 258519
* Module Debugging: Use a nonzero DWO id for precompiled headers.Adrian Prantl2016-01-222-2/+9
| | | | | | | | | | | | | PCH files don't have a module signature and LLVM uses a nonzero DWO id as an indicator for skeleton / module CUs. This change pins the DWO id for PCH files to a known constant value. The correct long-term solution here is to implement a module signature that is an actual deterministic hash (at the moment module signatures are just random nonzero numbers) and then enable this for PCH files as well. <rdar://problem/24290667> llvm-svn: 258507
* [MSVC Compat] Don't provide /volatile:ms semantics to types > pointerDavid Majnemer2016-01-221-3/+16
| | | | | | | | | | | | Volatile loads of types wider than a pointer get split by MSVC because the base x86 ISA doesn't provide loads which are wider than pointer width. LLVM assumes that it can emit a cmpxchg8b but this is problematic if the memory is in a CONST memory segment. Instead, provide behavior compatible with MSVC: split loads wider than a pointer. llvm-svn: 258506
* [OPENMP] Generalize codegen for 'sections'-based directive.Alexey Bataev2016-01-222-143/+105
| | | | | | If 'sections' directive has only one sub-section, the code for 'single'-based directive was emitted. Removed this codegen, because it causes crashes in different cases. llvm-svn: 258495
* [Coverage] Reduce complexity of adding function mapping recordsVedant Kumar2016-01-212-4/+7
| | | | | | | | | | Replace a string append operation in addFunctionMappingRecord with a vector append. The existing behavior is quadratic in the worst case: this patch makes it linear. Differential Revision: http://reviews.llvm.org/D16395 llvm-svn: 258424
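The complexity difference this commit describes can be pictured with a small sketch. This is not the clang code (which is C++); `append_by_concat` and `append_by_list` are hypothetical names, and the sketch assumes that each concatenation rebuilds the whole buffer, counting the bytes written to make the cost visible.

```python
# Illustrative sketch only; not clang's C++ implementation.
# All function names here are hypothetical.

def append_by_concat(records):
    """Grow one immutable buffer by concatenation, counting bytes written."""
    buf = b""
    copied = 0
    for rec in records:
        buf = buf + rec      # builds a new buffer: old contents plus rec
        copied += len(buf)   # assume every byte of the new buffer is written
    return buf, copied


def append_by_list(records):
    """Collect records in a list and join them once at the end."""
    parts = []
    for rec in records:
        parts.append(rec)    # stores a reference, no byte copy
    buf = b"".join(parts)    # single pass over all bytes
    return buf, len(buf)


records = [b"x" * 8 for _ in range(100)]
concat_buf, concat_cost = append_by_concat(records)
list_buf, list_cost = append_by_list(records)
```

Both functions produce the same bytes; under the stated cost model the first grows with the square of the number of records and the second grows linearly.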
* [OPENMP] Fix crash on reduction for complex variables.Alexey Bataev2016-01-213-17/+21
| | | | | | reworked codegen for reduction operation for complex types to avoid crash llvm-svn: 258394
* [OPENMP 4.0] Fix for codegen of 'cancel' directive within 'sections' directive.Alexey Bataev2016-01-202-3/+6
| | | | | | Allow emitting code for the 'cancel' directive within a 'sections' directive with a single sub-section. llvm-svn: 258307
* Module Debugging: Fine-tune the condition that determines whether a typeAdrian Prantl2016-01-201-1/+1
| | | | | | | | | | | | | can be found in a module. There are externally visible anonymous types that can be found: typedef struct { } s; // I can be found via the typedef. There are anonymous internal types that can be found: namespace { struct s {}; } // I can be found by name. rdar://problem/24199640 llvm-svn: 258272
* Reference the updated function name /NFCXinliang David Li2016-01-201-1/+1
| | | | llvm-svn: 258261
* Module Debugging: Don't emit external type references to anonymous types.Adrian Prantl2016-01-191-2/+3
| | | | | | | | Even if they exist in the module, they can't be matched with the forward declaration in the object file. <rdar://problem/24199640> llvm-svn: 258251
* Module Debugging: Make sure that anonymous tag decls that define globalAdrian Prantl2016-01-191-7/+6
| | | | | | | | | variables are visited. This shouldn't encourage anyone to put global variables into clang modules. rdar://problem/24199640 llvm-svn: 258250
* [OpenMP] Parsing + sema for "target exit data" directive.Samuel Antao2016-01-193-0/+9
| | | | | | Patch by Arpith Jacob. Thanks! llvm-svn: 258177
* [OpenMP] Parsing + sema for "target enter data" directive.Samuel Antao2016-01-193-0/+9
| | | | | | Patch by Arpith Jacob. Thanks! llvm-svn: 258165
* Module Debugging: Defer the emission of anonymous tag declsAdrian Prantl2016-01-191-0/+4
| | | | | | | | | | | | | until we are visiting their declcontext. This fixes a regression introduced in r256962: When building debug info for a typedef'd anonymous tag type, we would be visiting the inner anonymous type first thus creating a "typedef changes linkage of anonymous type, but linkage was already computed" error. rdar://problem/24199640 llvm-svn: 258152
* Fix local variable name /NFCXinliang David Li2016-01-192-3/+3
| | | | llvm-svn: 258106
* fix formatting; NFCSanjay Patel2016-01-181-7/+5
| | | | llvm-svn: 258097
* Introduce -fsanitize-stats flag.Peter Collingbourne2016-01-168-5/+51
| | | | | | | | | This is part of a new statistics gathering feature for the sanitizers. See clang/docs/SanitizerStats.rst for further info and docs. Differential Revision: http://reviews.llvm.org/D16175 llvm-svn: 257971
* Add OpenMP dist_schedule clause to distribute directive and related ↵Carlo Bertolli2016-01-151-0/+1
| | | | | | regression tests. llvm-svn: 257917
* [X86] Support 'interrupt' attribute for x86Alexey Bataev2016-01-151-0/+21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This attribute may be attached to a function definition and instructs the backend to generate appropriate function entry/exit code so that it can be used directly as an interrupt handler. The IRET instruction, instead of the RET instruction, is used to return from interrupt or exception handlers. All registers, except for the EFLAGS register which is restored by the IRET instruction, are preserved by the compiler. Any interruptible-without-stack-switch code must be compiled with -mno-red-zone since interrupt handlers can and will, because of the hardware design, touch the red zone. An interrupt handler must be declared with a mandatory pointer argument: struct interrupt_frame; __attribute__ ((interrupt)) void f (struct interrupt_frame *frame) { ... } and the user must properly define the structure the pointer points to. Exception handler: The exception handler is very similar to the interrupt handler, with a different mandatory function signature: #ifdef __x86_64__ typedef unsigned long long int uword_t; #else typedef unsigned int uword_t; #endif struct interrupt_frame; __attribute__ ((interrupt)) void f (struct interrupt_frame *frame, uword_t error_code) { ... } and the compiler pops the error code off the stack before the IRET instruction. The exception handler should only be used for exceptions which push an error code; all other exceptions must use the interrupt handler. The system will crash if the wrong handler is used. Differential Revision: http://reviews.llvm.org/D15709 llvm-svn: 257867
* [CodeGen] Attach attributes to thread local wrapper function.Akira Hatanaka2016-01-151-4/+20
| | | | | | | | | | This commit is a follow-up to r251734, r251476, and r249735, which fixes a bug where function attributes were not attached to thread local wrapper functions. rdar://problem/20828324 llvm-svn: 257865
* [CUDA] Invoke ptxas and fatbinary during compilation.Justin Lebar2016-01-141-0/+2
| | | | | | | | | | | | | | | | | | | | Summary: Previously we compiled CUDA device code to PTX assembly and embedded that asm as text in our host binary. Now we compile to PTX assembly and then invoke ptxas to assemble the PTX into a cubin file. We gather the ptx and cubin files for each of our --cuda-gpu-archs and combine them using fatbinary, and then embed that into the host binary. Adds two new command-line flags, -Xcuda_ptxas and -Xcuda_fatbinary, which pass args down to the external tools. Reviewers: tra, echristo Subscribers: cfe-commits, jhen Differential Revision: http://reviews.llvm.org/D16082 llvm-svn: 257809
* Update for LLVM function name change.Rui Ueyama2016-01-1410-46/+43
| | | | llvm-svn: 257802
* PR25910: clang allows two var definitions with the same mangled nameAndrey Bokhanko2016-01-142-32/+95
| | | | | | | | | | | Proper diagnostic and resolution of mangled names' conflicts in variables. When there is a declaration and a definition using the same name but different types, we emit what is in the definition. When there are two conflicting definitions, we issue an error. Differential Revision: http://reviews.llvm.org/D15686 llvm-svn: 257754
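The resolution rule stated in this message can be modelled in a few lines. The sketch below is hypothetical and is not clang code: `resolve` and `ConflictingDefinitions` are invented names, and the handling of two mismatched declarations (keep the earlier one) is an assumption, since the message does not cover that case.

```python
# Hypothetical model of the rule in the commit message; not clang code.

class ConflictingDefinitions(Exception):
    """Raised when two definitions share a mangled name but differ in type."""


def resolve(symbols):
    """symbols: iterable of (mangled_name, type_name, is_definition).

    Returns a dict mapping each mangled name to the type that gets emitted.
    """
    chosen = {}
    for name, ty, is_def in symbols:
        prev = chosen.get(name)
        if prev is None:
            chosen[name] = (ty, is_def)
            continue
        prev_ty, prev_is_def = prev
        if prev_ty == ty:
            # Same type: just remember whether a definition has been seen.
            chosen[name] = (ty, prev_is_def or is_def)
        elif prev_is_def and is_def:
            # Two conflicting definitions: report an error.
            raise ConflictingDefinitions(name)
        elif is_def:
            # Declaration followed by a definition: the definition wins.
            chosen[name] = (ty, True)
        # Otherwise keep the earlier entry (assumption: it is either the
        # definition, or both entries are mere declarations).
    return {name: ty for name, (ty, _) in chosen.items()}


emitted = resolve([("x", "int", False), ("x", "float", True)])
```

Here `emitted` maps `x` to `float`, the type from the definition; passing two definitions of `x` with different types raises `ConflictingDefinitions` instead.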
* CodeGen: Only emit CFI unrelated cast checks for bit casts.Peter Collingbourne2016-01-141-1/+2
| | | | | | | We were previously emitting them for no-op casts (e.g. implicit casts to const). llvm-svn: 257738
* [Bugfix] Fix ICE on constexpr vector splat.George Burgess IV2016-01-135-14/+15
| | | | | | | | | | | | | In {CG,}ExprConstant.cpp, we weren't treating vector splats properly. This patch makes us treat splats more properly. Additionally, this patch adds a new cast kind which allows a bool->int cast to result in -1 or 0, instead of 1 or 0 (for true and false, respectively), so we can sanely model OpenCL bool->int casts in the AST. Differential Revision: http://reviews.llvm.org/D14877 llvm-svn: 257559
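The two bool->int conventions this message contrasts can be shown with a short sketch. It only models the values involved, not clang's AST or its cast kinds; `c_bool_to_int` and `opencl_bool_splat` are hypothetical helper names, and the 32-bit lane width in the last line is an assumption for illustration.

```python
# Value-level sketch of the two conventions; not clang code.
# Helper names are hypothetical.

def c_bool_to_int(value):
    """Ordinary bool->int conversion: true becomes 1, false becomes 0."""
    return 1 if value else 0


def opencl_bool_splat(value, lanes):
    """Splat a bool across a vector where true becomes -1 (all bits set)."""
    return [-1 if value else 0] * lanes


true_lanes = opencl_bool_splat(True, 4)
false_lanes = opencl_bool_splat(False, 4)
# Viewed as an unsigned 32-bit lane (assumed width), -1 has every bit set.
all_bits = true_lanes[0] & 0xFFFFFFFF
```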
* Don't store CGOpenMPRegionInfo::CodeGen as a reference (PR26078)Hans Wennborg2016-01-121-1/+1
| | | | | | | The referenced llvm::function_ref<void(CodeGenFunction &)> object can go away before CodeGen is used, resulting in a crash. llvm-svn: 257516
* function names start with a lower case letter ; NFCSanjay Patel2016-01-121-1/+1
| | | | llvm-svn: 257497
* Fix -Wmicrosoft-enum-value warningReid Kleckner2016-01-111-1/+1
| | | | llvm-svn: 257383
* [OpenCL] Pipe type supportXiuli Pan2016-01-097-6/+58
| | | | | | | | | | | | | | | Summary: Support for OpenCL 2.0 pipe type. This is a bug-fix version for bader's patch reviews.llvm.org/D14441 Reviewers: pekka.jaaskelainen, Anastasia Subscribers: bader, Anastasia, cfe-commits Differential Revision: http://reviews.llvm.org/D15603 llvm-svn: 257254
* [MS ABI] Complete and base constructor GlobalDecls must have the same nameDavid Majnemer2016-01-081-1/+14
| | | | | | | | | | | | | | | Clang got itself into the situation where we mangled the same constructor twice with two different constructor types. After one of the constructors was utilized, the tag used for one of the types changed from class to struct because a class template became complete. This resulted in one of the constructor types varying from the other constructor. Instead, force "base" constructor types to "complete" if the ABI doesn't have constructor variants. This will ensure that GlobalDecls for both variants will get the same mangled name. This fixes PR26029. llvm-svn: 257205
* [ThinLTO] Leverage new in-place renaming supportTeresa Johnson2016-01-082-51/+34
| | | | | | | | | | | | Due to the new in-place renaming support added in r257174, we no longer need to invoke ThinLTO global renaming from clang. It will be invoked on the module in the FunctionImport pass (by an immediately following llvm commit). As a result, we don't need to load the FunctionInfoIndex as early, so that is moved down into EmitAssemblyHelper::EmitAssembly. llvm-svn: 257179
* [PGO] Simplify coverage mapping loweringXinliang David Li2016-01-073-5/+21
| | | | | | | | | | | | | | Coverage mapping data may reference names of functions that are skipped by the FE (e.g., unused inline functions). Since those functions are skipped, the normal instr-prof function lowering pass won't put those names in the right section, so special handling is needed to walk through the coverage mapping structure and recollect the references. With this patch, only names that are skipped are processed. This simplifies the lowering code and it no longer needs to make assumptions about the coverage mapping data layout. It should also be more efficient. llvm-svn: 257092
* Module debugging: Defer emitting tag types until their definitionAdrian Prantl2016-01-061-3/+16
| | | | | | | | | was visited and all decls have been merged. We only get a single chance to emit the types for virtual classes because CGDebugInfo::completeRequiredType() categorically doesn't complete them. llvm-svn: 256962
* Fix -Wdocumentation warning after r256933Nico Weber2016-01-061-1/+1
| | | | llvm-svn: 256960
* [Driver] Add support for -fno-builtin-foo options.Chad Rosier2016-01-065-12/+21
| | | | | | | Addresses PR4941 and rdar://6756912. http://reviews.llvm.org/D15195 llvm-svn: 256937
* [OpenMP] Reapply rL256842: [OpenMP] Offloading descriptor registration and ↵Samuel Antao2016-01-064-41/+986
| | | | | | | | | | | | device codegen. This patch attempts to fix the regressions identified when the patch was committed initially. Thanks to Michael Liao for identifying the fix in the offloading metadata generation related with side effects in evaluation of function arguments. llvm-svn: 256933
* [OpenMP] Revert rL256842: [OpenMP] Offloading descriptor registration and ↵Samuel Antao2016-01-054-987/+41
| | | | | | device codegen. It was causing two regressions, so I'm reverting until the cause is found. llvm-svn: 256858
* [OpenMP] Offloading descriptor registration and device codegen.Samuel Antao2016-01-054-41/+987
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: In order for offloading to work properly, two things need to be in place: - a descriptor with all the offloading information (device entry functions and global variables) has to be created by the host and registered in the OpenMP offloading runtime library. - all the device functions need to be emitted for the device and a convention has to be in place so that the runtime library can easily map the host ID of an entry point to the actual function in the device. This patch adds support for these two things. However, only entry functions are being registered given that the 'declare target' directive is not yet implemented. About offloading descriptor: The details of the descriptor are explained in more detail in http://goo.gl/L1rnKJ. Basically the descriptor will have fields that specify the number of devices, the pointers to where the device images begin and end (that will be defined by the linker), and also pointers to the beginning and end of a table whose entries contain information about a specific entry point. Each entry has the type: ``` struct __tgt_offload_entry{ void *addr; char *name; int64_t size; }; ``` and will be implemented in a predetermined (ELF) section `.omp_offloading.entries` with 1-byte alignment, so that when all the objects are linked, the table is in that section with no padding in between entries (will be like a C array). The code generation ensures that all `__tgt_offload_entry` entries are emitted in the same order for both host and device so that the runtime can have the corresponding entries in both host and device in the same index of the table, and efficiently implement the mapping. The resulting descriptor is registered/unregistered with the runtime library using the calls `__tgt_register_lib` and `__tgt_unregister_lib`. 
The registration is implemented in a high priority global initializer so that the registration always happens before any initializer (that can potentially include target regions) is run. The driver flag -omptargets= was created to specify a comma separated list of devices the user wants to support so that the new functionality can be exercised. Each device is specified with its triple. About target codegen: The target codegen is pretty much straightforward as it completely reuses the logic of the host version for the same target region. The tricky part is to identify the meaningful target regions in the device side. Unlike other programming models, like CUDA, there are no already outlined functions with attributes that mark what should be emitted or not. So, the information on what to emit is passed in the form of metadata in the host bc file. This requires a new option to pass the host bc to the device frontend. Then everything is similar to what happens in CUDA: the global declarations emission is intercepted to check whether it is an "interesting" declaration. The difference is that instead of checking an attribute, the metadata information is checked. Right now, there is only a form of metadata to pass information about the device entry points (target regions). A class `OffloadEntriesInfoManagerTy` was created to manage all the information and queries related to the metadata. 
The metadata looks like this: ``` !omp_offload.info = !{!0, !1, !2, !3, !4, !5, !6} !0 = !{i32 0, i32 52, i32 77426347, !"_ZN2S12r1Ei", i32 479, i32 13, i32 4} !1 = !{i32 0, i32 52, i32 77426347, !"_ZL7fstatici", i32 461, i32 11, i32 5} !2 = !{i32 0, i32 52, i32 77426347, !"_Z9ftemplateIiET_i", i32 444, i32 11, i32 6} !3 = !{i32 0, i32 52, i32 77426347, !"_Z3fooi", i32 99, i32 11, i32 0} !4 = !{i32 0, i32 52, i32 77426347, !"_Z3fooi", i32 272, i32 11, i32 3} !5 = !{i32 0, i32 52, i32 77426347, !"_Z3fooi", i32 127, i32 11, i32 1} !6 = !{i32 0, i32 52, i32 77426347, !"_Z3fooi", i32 159, i32 11, i32 2} ``` The fields in each metadata entry are (in sequence): Entry 1) an ID of the type of metadata - right now only zero is used, meaning "OpenMP target region". Entry 2) a unique ID of the device where the input source file that contains the target region lives. Entry 3) a unique ID of the file where the input source file that contains the target region lives. Entry 4) a mangled name of the function that encloses the target region. Entries 5) and 6) line and column number where the target region was found. Entry 7) is the order in which the entry was emitted. Entries 2) and 3) are required to distinguish files that have the same function name. Entry 4) is required to distinguish different instances of the same declaration (usually templated ones). Entries 5) and 6) are required to distinguish the particular target region in the body of the function (it is possible that a given target region is not an entry point - an if clause can always evaluate to zero - and therefore we need to identify the "interesting" target regions). This patch replaces http://reviews.llvm.org/D12306. Reviewers: ABataev, hfinkel, tra, rjmccall, sfantao Subscribers: FBrygidyn, piotr.rak, Hahnfeld, cfe-commits Differential Revision: http://reviews.llvm.org/D12614 llvm-svn: 256842
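To make the seven-field layout of an `!omp_offload.info` entry concrete, here is a minimal reading sketch. It is not part of the patch: `OffloadEntry`, `parse_offload_info` and the field names are hypothetical, and the regular expression assumes entries are printed exactly as in the sample quoted in the commit message.

```python
# Hypothetical reader for the textual metadata shown in the commit message.
# Field names are invented labels for the seven positions it describes.
import re
from collections import namedtuple

OffloadEntry = namedtuple(
    "OffloadEntry",
    "kind device_id file_id parent_name line column order",
)

# Matches one entry such as:
#   !{i32 0, i32 52, i32 77426347, !"_Z3fooi", i32 99, i32 11, i32 0}
_ENTRY_RE = re.compile(
    r'!\{i32 (\d+), i32 (\d+), i32 (\d+), !"([^"]*)", '
    r'i32 (\d+), i32 (\d+), i32 (\d+)\}'
)


def parse_offload_info(text):
    """Return the entries found in text, sorted by emission order."""
    entries = []
    for m in _ENTRY_RE.finditer(text):
        kind, dev, file_id, name, line, col, order = m.groups()
        entries.append(
            OffloadEntry(int(kind), int(dev), int(file_id), name,
                         int(line), int(col), int(order))
        )
    return sorted(entries, key=lambda e: e.order)


sample = '''
!0 = !{i32 0, i32 52, i32 77426347, !"_ZN2S12r1Ei", i32 479, i32 13, i32 4}
!3 = !{i32 0, i32 52, i32 77426347, !"_Z3fooi", i32 99, i32 11, i32 0}
!5 = !{i32 0, i32 52, i32 77426347, !"_Z3fooi", i32 127, i32 11, i32 1}
'''
entries = parse_offload_info(sample)
```

Sorting by the last field mirrors the message's point that host and device emit entries in the same order, so an entry's position in the sorted list can stand in for its index in the table.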
* Remove setting of inlinehint and cold attributes based on profile dataEaswaran Raman2016-01-041-10/+0
| | | | | | | | | | | NFC. These hints are only used for inlining and the inliner now uses the same criteria to identify hot and cold callees and set appropriate thresholds without relying on these hints. Hence this removed code is superfluous. Differential Revision: http://reviews.llvm.org/D15726 llvm-svn: 256793