summaryrefslogtreecommitdiffstats
path: root/clang/lib/CodeGen
Commit message (Collapse)AuthorAgeFilesLines
...
* CodeGen: Avoid dereferencing end() in ↵Duncan P. N. Exon Smith2016-08-171-3/+2
| | | | | | | | | | ScalarExprEmitter::EmitOverflowCheckedBinOp Use BB.getNextNode(), which returns nullptr on end(), instead of &*BB.getIterator(), which is UB on end(). CodeGenFunction::createBasicBlock expects nullptr in this case already. llvm-svn: 278898
* [PM] Update Clang for LLVM's r278896 which re-organized a header.Chandler Carruth2016-08-171-2/+3
| | | | | | (sorry this didn't get landed closer in time...) llvm-svn: 278897
* Emit debug info for dynamic classes if they are imported from a DLL.Adrian McCarthy2016-08-161-1/+6
| | | | | | | | | | With -debug-info-kind=limited, we omit debug info for dynamic classes that live in other TUs. This reduces duplicate type information. When statically linked, the type information comes together. But if your binary has a class derived from a base in a DLL, the base class info is not available to the debugger. The decision is made in shouldOmitDefinition (CGDebugInfo.cpp). Per a suggestion from rnk, I've tweaked the decision so that we do include definitions for classes marked as DLL imports. This should be a relatively small number of classes, so we don't pay a large price for duplication of the type info, yet it should cover most cases on Windows. Essentially this makes debug info for DLLs independent, but we still assume that all TUs within the same DLL will be consistently built with (or without) debug info and the debugger will be able to search across the debug info within that scope to resolve any declarations into definitions, etc. llvm-svn: 278861
* Revert "[X86] Add xgetbv/x[X86] Add xgetbv xsetbv intrinsics to non-windows ↵Reid Kleckner2016-08-161-5/+1
| | | | | | | | platforms" This reverts commit r278783. It breaks usage of _xgetbv on Windows. llvm-svn: 278814
* Left shifts of negative values are defined if -fwrapv is setJames Molloy2016-08-161-1/+2
| | | | | | | This means we shouldn't emit ubsan detection code or warn. Fixes PR25552. llvm-svn: 278786
* [X86] Add xgetbv/x[X86] Add xgetbv xsetbv intrinsics to non-windows platformsMarina Yatsina2016-08-161-1/+5
| | | | | | | | commit on behalf of guyblank Differential Revision: https://reviews.llvm.org/D21959 llvm-svn: 278783
* Add the notion of deferred diagnostics.Justin Lebar2016-08-152-0/+41
| | | | | | | | | | | | | | | | | | | Summary: This patch lets you create diagnostics that are emitted if and only if a particular FunctionDecl is codegen'ed. This is necessary for CUDA, where some constructs -- e.g. calls from host+device functions to host functions when compiling for device -- are allowed to appear in semantically-correct programs, but only if they're never codegen'ed. Reviewers: rnk Subscribers: cfe-commits, tra Differential Revision: https://reviews.llvm.org/D23241 llvm-svn: 278735
* [CodeGen] Ignore unnamed bitfields before handling vector fieldsDavid Majnemer2016-08-151-4/+5
| | | | | | | | | | | | We processed unnamed bitfields after our logic for non-vector field elements in records larger than 128 bits. The vector logic would determine that the bit-field disqualifies the record from occupying a register despite the unnamed bit-field not participating in the record size nor its alignment. N.B. This behavior matches GCC and ICC. llvm-svn: 278656
* [CodeGen] Correctly implement the AVX512 psABI rulesDavid Majnemer2016-08-151-7/+10
| | | | | | | | | | | | | | An __m512 vector type wrapped in a structure should be passed in a vector register. Our prior implementation was based on a draft version of the psABI. This fixes PR28975. N.B. The update to the ABI was made here: https://github.com/hjl-tools/x86-psABI/commit/30f9c9 llvm-svn: 278655
* P0217R3: code generation support for decomposition declarations.Richard Smith2016-08-153-3/+17
| | | | llvm-svn: 278642
* [CUDA] Place GPU binary into .nv_fatbin section and align it by 8.Artem Belevich2016-08-121-1/+10
| | | | | | | | | This matches the way nvcc encapsulates GPU binaries into host object file. Now cuobjdump can deal with clang-compiled object files. Differential Revision: https://reviews.llvm.org/D23429 llvm-svn: 278549
* CodeGen: Replace ThinLTO backend implementation with a client of LTO/Resolution.Teresa Johnson2016-08-122-33/+65
| | | | | | | | | | | | | | | | Summary: This changes clang to use the llvm::lto::thinBackend function instead of its own less comprehensive ThinLTO backend implementation. Patch by Peter Collingbourne Reviewers: tejohnson, mehdi_amini Subscribers: cfe-commits, mehdi_amini Differential Revision: https://reviews.llvm.org/D21545 llvm-svn: 278541
* [OpenCL] Change block descriptor address space to constant.Joey Gouly2016-08-101-2/+10
| | | | | | | The block descriptor is a GlobalVariable in the LLVM IR, so it shouldn't be in the private address space. llvm-svn: 278234
* [x86] Fix a really nasty bug introduced in r276417 where alignmentChandler Carruth2016-08-101-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | constraints were added to _mm256_broadcast_{pd,ps} intel intrinsics. The spec for these intrinics is ... pretty much silent on alignment. This is especially frustrating considering the amount of discussion of alignment in the load and store instrinsics. So I was forced to rely on the specification for the VBROADCASTF128 instruction. That instruction's spec is *also* completely silent on alignment. Fortunately, when it comes to the instruction's spec, silence is enough. There is no #GP fault option for an underaligned address so this instruction, and by inference the intrinsic, can read any alignment. As it happens, the old code worked exactly this way and in fact we have plenty of code that hands pointers with less than 16-byte alignment to these intrinsics. This code broke pretty spectacularly with this commit. Fortunately, the fix is super simple! Change a 16 to a 1, and ta da! Anyways, a lot of debugging for a really boring fix. =] llvm-svn: 278202
* [OpenCL] Handle -cl-fp32-correctly-rounded-divide-sqrtYaxun Liu2016-08-092-2/+10
| | | | | | | | Let the driver pass the option to frontend. Do not set precision metadata for division instructions when this option is set. Set function attribute "correctly-rounded-divide-sqrt-fp-math" based on this option. Differential Revision: https://reviews.llvm.org/D22940 llvm-svn: 278155
* Revert "[Attr] Add support for the `ms_hook_prologue` attribute."Charles Davis2016-08-081-8/+0
| | | | | | | This reverts commit r278050. It depends on r278048, which will be reverted. llvm-svn: 278052
* [Attr] Add support for the `ms_hook_prologue` attribute.Charles Davis2016-08-081-0/+8
| | | | | | | | | | | | | | | | | | | | | | | Summary: Based on a patch by Michael Mueller. This attribute specifies that a function can be hooked or patched. This mechanism was originally devised by Microsoft for hotpatching their binaries (which they're constantly updating to stay ahead of crackers, script kiddies, and other ne'er-do-wells on the Internet), but it's now commonly abused by Windows programs that want to hook API functions. It is for this reason that this attribute was added to GCC--hence the name, `ms_hook_prologue`. Depends on D19908. Reviewers: rnk, aaron.ballman Subscribers: cfe-commits Differential Revision: https://reviews.llvm.org/D19909 llvm-svn: 278050
* [ARM] Command-line options for embedded position-independent codeOliver Stannard2016-08-081-0/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch (with the corresponding ARM backend patch) adds support for some new relocation models: * Read-only position independence (ROPI): Code and read-only data is accessed PC-relative. The offsets between all code and RO data sections are known at static link time. * Read-write position independence (RWPI): Read-write data is accessed relative to a static base register. The offsets between all writeable data sections are known at static link time. These two modes are independent (they specify how different objects should be addressed), so they can be used individually or together. These modes are intended for bare-metal systems or systems with small real-time operating systems. They are designed to avoid the need for a dynamic linker, the only initialisation required is setting the static base register to an appropriate value for RWPI code. There is one C construct not currently supported by these modes: global variables initialised to the address of another global variable or function, where that address is not known at static-link time. There are a few possible ways to solve this: * Disallow this, and require the user to write their own initialisation function if they need variables like this. * Emit dynamic initialisers for these variables in the compiler, called from the .init_array section (as is currently done for C++ dynamic initialisers). We have a patch to do this, described in my original RFC email (http://lists.llvm.org/pipermail/llvm-dev/2015-December/093022.html), but the feedback from that RFC thread was that this is not something that belongs in clang. * Use a small dynamic loader to fix up these variables, by adding the difference between the load and execution address of the relevant section. This would require linker co-operation to generate a table of addresses that need fixing up. Differential Revision: https://reviews.llvm.org/D23196 llvm-svn: 278016
* PR26423: Assert on valid use of using declaration of a function with an ↵David Blaikie2016-08-051-0/+10
| | | | | | | | | | | | undeduced auto return type For now just disregard the using declaration in this case. Suboptimal, but wiring up the ability to have declarations of functions that are separate from their definition (we currently only do that for member functions) and have differing return types (we don't have any support for that) is more work than seems reasonable to at least fix this crash. llvm-svn: 277852
* AMDGPU : Add Clang builtin intrinsics for compare with the fullWei Ding2016-08-051-0/+8
| | | | | | | | wavefront result. Differential Revision: http://reviews.llvm.org/D22934 llvm-svn: 277824
* [OpenMP] Sema and parsing for 'teams distribute' pragmaKelvin Li2016-08-053-0/+16
| | | | | | | | This patch is to implement sema and parsing for 'teams distribute' pragma. Differential Revision: https://reviews.llvm.org/D23189 llvm-svn: 277818
* [OpenCL] Added underscores to the names of 'to_addr' OpenCL built-ins.Alexey Bader2016-08-041-2/+3
| | | | | | | | | | | | | | | Summary: In order to re-define OpenCL built-in functions 'to_{private,local,global}' in OpenCL run-time library LLVM names must be different from the clang built-in function names. Reviewers: yaxunl, Anastasia Subscribers: cfe-commits Differential Revision: https://reviews.llvm.org/D23120 llvm-svn: 277743
* [OpenCL] Fix size of image typeYaxun Liu2016-08-033-14/+2
| | | | | | | | | | The size of image type is reported incorrectly as size of a pointer to address space 0, which causes error when casting image type to pointers by __builtin_astype. The fix is to get image address space from TargetInfo then report the size accordingly. Differential Revision: https://reviews.llvm.org/D22927 llvm-svn: 277647
* Add FIXMEs for MSVC 2013 hacks in r277211. NFC.Paul Robinson2016-08-011-0/+5
| | | | llvm-svn: 277396
* CodeGen: simplify the CC handling for TLS wrappersSaleem Abdulrasool2016-08-011-2/+1
| | | | | | | | | Use the calling convention of the wrapper directly to set the calling convention to ensure that the calling convention matches. Incorrectly setting the calling convention results in the code path being entirely nullified as InstCombine + SimplifyCFG will prune the mismatched CC calls. llvm-svn: 277390
* [codeview] Skip injected class names in nested record emissionReid Kleckner2016-08-011-0/+3
| | | | | | | | We were already trying to do this, but our check wasn't quite right. Fixes PR28790 llvm-svn: 277367
* Fix VS2013 build of CGOpenMPRuntime.cppHans Wennborg2016-07-301-3/+7
| | | | | | | | It seems the compiler was getting confused by the in-class initializers in local struct MapInfo, so moving those to a default constructor instead. llvm-svn: 277256
* Fix CGOpenMPRuntime.cpp for VS2013. NFC.Paul Robinson2016-07-291-10/+11
| | | | | | I don't know why these changes work but they do. llvm-svn: 277211
* CodeGen: try harder to make the CFString structure RWSaleem Abdulrasool2016-07-291-1/+1
| | | | | | | | The previous change was insufficient to mark the content as read-write as the structure itself was marked constant. Adjust this and add tests to ensure that the section is marked appropriately as being read-write. llvm-svn: 277200
* Initial vectorization support for svml calls (short vector math library).Matt Masten2016-07-291-0/+3
| | | | | | Differential Revision: https://reviews.llvm.org/D19544 llvm-svn: 277167
* [OpenCL] Generate opaque type for sampler_t and function call for the ↵Yaxun Liu2016-07-2811-7/+47
| | | | | | | | | | | | | | | | initializer Currently Clang use int32 to represent sampler_t, which have been a source of issue for some backends, because in some backends sampler_t cannot be represented by int32. They have to depend on kernel argument metadata and use IPA to find the sampler arguments and global variables and transform them to target specific sampler type. This patch uses opaque pointer type opencl.sampler_t* for sampler_t. For each use of file-scope sampler variable, it generates a function call of __translate_sampler_initializer. For each initialization of function-scope sampler variable, it generates a function call of __translate_sampler_initializer. Each builtin library can implement its own __translate_sampler_initializer(). Since the real sampler type tends to be architecture dependent, allowing it to be initialized by a library function simplifies backend design. A typical implementation of __translate_sampler_initializer could be a table lookup of real sampler literal values. Since its argument is always a literal, the returned pointer is known at compile time and easily optimized to finally become some literal values directly put into image read instructions. This patch is partially based on Alexey Sotkin's work in Khronos Clang (https://github.com/KhronosGroup/SPIR/commit/3d4eec61623502fc306e8c67c9868be2b136e42b). Differential Revision: https://reviews.llvm.org/D21567 llvm-svn: 277024
* [OpenMP] Change name of variable in mappble expression.Samuel Antao2016-07-281-7/+7
| | | | | | This attempts to fix a failure in Windows bots pottentially related with a reserved keyword. llvm-svn: 276988
* [OpenMP] Do not use default argument in lambda from mappable expressions ↵Samuel Antao2016-07-281-4/+7
| | | | | | | | handlers. Windows bots were complaining about that. llvm-svn: 276981
* [OpenMP] Code generation for the is_device_ptr clauseSamuel Antao2016-07-281-4/+40
| | | | | | | | | | | | Summary: This patch adds support for the is_device_ptr clause. It expands SEMA to use the mappable expression logic that can only be tested with code generation in place and check conflicts with other data sharing related clauses using the mappable expressions infrastructure. Reviewers: hfinkel, carlo.bertolli, arpith-jacob, kkwli0, ABataev Subscribers: caomhin, cfe-commits Differential Revision: https://reviews.llvm.org/D22788 llvm-svn: 276978
* [OpenMP] Codegen for use_device_ptr clause.Samuel Antao2016-07-284-144/+425
| | | | | | | | | | | | Summary: This patch adds support for the use_device_ptr clause. It includes changes in SEMA that could not be tested without codegen, namely, the use of the first private logic and mappable expressions support. Reviewers: hfinkel, carlo.bertolli, arpith-jacob, kkwli0, ABataev Subscribers: caomhin, cfe-commits Differential Revision: https://reviews.llvm.org/D22691 llvm-svn: 276977
* [OpenMP] Add support to map member expressions with references to pointers.Samuel Antao2016-07-271-3/+23
| | | | | | | | | | | | Summary: This patch add support to map pointers through references in class members. Although a reference does not have storage that a user can access, it still has to be mapped in order to get the deep copy right and the dereferencing code work properly. Reviewers: hfinkel, carlo.bertolli, arpith-jacob, kkwli0, ABataev Subscribers: caomhin, cfe-commits Differential Revision: https://reviews.llvm.org/D22787 llvm-svn: 276934
* [OpenMP] Add support for mapping array sections through pointer references.Samuel Antao2016-07-272-9/+17
| | | | | | | | | | | | | | | Summary: This patch fixes a bug in the map of array sections whose base is a reference to a pointer. The existing mapping support was not prepared to deal with it, causing the compiler to crash. Mapping a reference to a pointer enjoys the same characteristics of a regular pointer, i.e., it is passed by value. Therefore, the reference has to be materialized in the target region. Reviewers: hfinkel, carlo.bertolli, kkwli0, ABataev Subscribers: caomhin, cfe-commits Differential Revision: https://reviews.llvm.org/D22690 llvm-svn: 276933
* [CUDA] Align kernel launch args correctly when the LLVM type's alignment is ↵Justin Lebar2016-07-272-25/+22
| | | | | | | | | | | | | | | | | | | | | | | different from the clang type's alignment. Summary: Before this patch, we computed the offsets in memory of args passed to GPU kernel functions by throwing all of the args into an LLVM struct. clang emits packed llvm structs basically whenever it feels like it, and packed structs have alignment 1. So we cannot rely on the llvm type's alignment matching the C++ type's alignment. This patch fixes our codegen so we always respect the clang types' alignments. Reviewers: rnk Subscribers: cfe-commits, tra Differential Revision: https://reviews.llvm.org/D22879 llvm-svn: 276927
* Don't crash when generating code for __attribute__((naked)) member functions.Justin Lebar2016-07-272-0/+8
| | | | | | | | | | | | | | Summary: Previously this crashed inside EmitThisParam(). There should be no prelude for naked functions, so just skip the whole thing. Reviewers: majnemer Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D22715 llvm-svn: 276925
* Add flags to toggle preservation of assembly commentsNirav Dave2016-07-271-0/+1
| | | | | | | | | | | | Summary: Add -fpreserve-as-comments and -fno-preserve-as-comments. Reviewers: echristo, rnk Subscribers: mehdi_amini, llvm-commits Differential Revision: https://reviews.llvm.org/D22883 llvm-svn: 276907
* Adjust coercion of aggregates on RenderScriptPirama Arumuga Nainar2016-07-271-0/+46
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: In RenderScript, the size of the argument or return value emitted in the IR is expected to be the same as the size of corresponding qualified type. For ARM and AArch64, the coercion performed by Clang can change the parameter or return value to a type whose size is different (usually larger) than the original aggregate type. Specifically, this can happen in the following cases: - Aggregate parameters of size <= 64 bytes and return values smaller than 4 bytes on ARM - Aggregate parameters and return values smaller than bytes on AArch64 This patch coerces the cases above to an integer array that is the same size and alignment as the original aggregate. A new field is added to TargetInfo to detect a RenderScript target and limit this coercion just to that case. Tests added to test/CodeGen/renderscript.c Reviewers: rsmith Subscribers: aemerson, srhines, llvm-commits Differential Revision: https://reviews.llvm.org/D22822 llvm-svn: 276904
* [Coverage] Do not write out coverage mappings with zero entriesVedant Kumar2016-07-261-0/+11
| | | | | | | | | | | | | After r275121, we stopped mapping regions from system headers. Lambdas declared in regions belonging to system headers started producing empty coverage mappings, since the files corresponding to their spelling locs were being ignored. The coverage reader doesn't know what to do with these empty mappings. This commit makes sure that we don't produce them and adds a test. I'll make the reader stricter in a follow-up commit. llvm-svn: 276716
* [Profile] Use a flag to enable PGO rather than the profraw filenameXinliang David Li2016-07-231-0/+1
| | | | | | | | Patch by Jake VanAdrighem Differential Revision: http://reviews.llvm.org/D22608 llvm-svn: 276517
* P0217R3: Parsing support and framework for AST representation of C++1zRichard Smith2016-07-222-1/+5
| | | | | | | | | | | decomposition declarations. There are a couple of things in the wording that seem strange here: decomposition declarations are permitted at namespace scope (which we partially support here) and they are permitted as the declaration in a template (which we reject). llvm-svn: 276492
* [Profile] Enable profile merging with -fprofile-generat[=<dir>]Xinliang David Li2016-07-221-1/+1
| | | | | | | This patch enables raw profile merging for this option which is the new intended behavior. llvm-svn: 276484
* Clang changes for overloading invariant.start and end intrinsicsAnna Thomas2016-07-221-1/+3
| | | | | | | | | | | | | | | | | | | | This change depends on the corresponding LLVM change at: https://reviews.llvm.org/D22519 The llvm.invariant.start and llvm.invariant.end intrinsics currently support specifying invariant memory objects only in the default address space. With this LLVM change, these intrinsics are overloaded for any adddress space for memory objects and we can use these llvm invariant intrinsics in non-default address spaces. Example: llvm.invariant.start.p1i8(i64 4, i8 addrspace(1)* %ptr) This overloaded intrinsic is needed for representing final or invariant memory in managed languages. llvm-svn: 276448
* test commit. update comment grammatically. NFCAnna Thomas2016-07-221-1/+1
| | | | llvm-svn: 276425
* [X86][AVX] Added support for lowering to VBROADCASTF128/VBROADCASTI128 with ↵Simon Pilgrim2016-07-221-0/+27
| | | | | | | | | | generic IR As discussed on D22460, I've updated the vbroadcastf128 pd256/ps256 builtins to map directly to generic IR - load+splat a 128-bit vector to both lanes of a 256-bit vector. Fix for PR28657. llvm-svn: 276417
* Reverting r275115 which caused PR28634.Wolfgang Pieb2016-07-211-13/+1
| | | | | | | When empty (forwarding) basic blocks that are referenced by user labels are removed, incorrect code may be generated. llvm-svn: 276361
* [CodeGen] Fix a crash when constant folding switch statementErik Pilkington2016-07-211-0/+8
| | | | | | Differential revision: https://reviews.llvm.org/D22542 llvm-svn: 276350
OpenPOWER on IntegriCloud