summaryrefslogtreecommitdiffstats
path: root/clang/lib/CodeGen
Commit message (Collapse)AuthorAgeFilesLines
...
* Remove now redundant code that ensured debug info for class definitions was ↵David Blaikie2017-01-182-18/+0
| | | | | | | | | | | emitted under certain circumstances Introduced in r181561 - it may've been subsumed by work done to allow emission of declarations for vtable types while still emitting some of their member functions correctly for those declarations. Whatever the reason, the tests pass without this code now. llvm-svn: 292439
* [OpenMP] Support for the if-clause on the combined directive 'target parallel'.Arpith Chacko Jacob2017-01-181-11/+30
| | | | | | | | | | | | | | | | | | | | | | | The if-clause on the combined directive potentially applies to both the 'target' and the 'parallel' regions. Codegen'ing the if-clause on the combined directive requires additional support because the expression in the clause must be captured by the 'target' capture statement but not the 'parallel' capture statement. Note that this situation arises for other clauses such as num_threads. The OMPIfClause class inherits OMPClauseWithPreInit to support capturing of expressions in the clause. A member CaptureRegion is added to OMPClauseWithPreInit to indicate which captured statement (in this case 'target' but not 'parallel') captures these expressions. To ensure correct codegen of captured expressions in the presence of combined 'target' directives, OMPParallelScope was added to 'parallel' codegen. Reviewers: ABataev Differential Revision: https://reviews.llvm.org/D28781 llvm-svn: 292437
* [OpenMP] Codegen for the 'target parallel' directive on the NVPTX device.Arpith Chacko Jacob2017-01-182-9/+258
| | | | | | | | | | | | | | | | | | This patch adds codegen for the 'target parallel' directive on the NVPTX device. We term offload OpenMP directives such as 'target parallel' and 'target teams distribute parallel for' as SPMD constructs. SPMD constructs, in contrast to Generic ones like the plain 'target', can never contain a serial region. SPMD constructs can be handled more efficiently on the GPU and do not require the Warp Loop of the Generic codegen scheme. This patch adds SPMD codegen support for 'target parallel' on the NVPTX device and can be reused for other SPMD constructs. Reviewers: ABataev Differential Revision: https://reviews.llvm.org/D28755 llvm-svn: 292428
* [OpenMP] Codegen support for 'target parallel' on the host.Arpith Chacko Jacob2017-01-186-35/+112
| | | | | | | | | | | | | | | | | | | | | | | | | This patch adds support for codegen of 'target parallel' on the host. It is also the first combined directive that requires two or more captured statements. Support for this functionality is included in the patch. A combined directive such as 'target parallel' has two captured statements, one for the 'target' and the other for the 'parallel' region. Two captured statements are required because each has different implicit parameters (see SemaOpenMP.cpp). For example, the 'parallel' has 'global_tid' and 'bound_tid' while the 'target' does not. The patch adds support for handling multiple captured statements based on the combined directive. When codegen'ing the 'target parallel' directive, the 'target' outlined function is created using the outer captured statement and the 'parallel' outlined function is created using the inner captured statement. Reviewers: ABataev Differential Revision: https://reviews.llvm.org/D28753 llvm-svn: 292419
* Revert r292374 to debug Windows buildbot failure.Arpith Chacko Jacob2017-01-186-112/+35
| | | | llvm-svn: 292400
* [OpenMP] Codegen support for 'target parallel' on the host.Arpith Chacko Jacob2017-01-186-35/+112
| | | | | | | | | | | | | | | | | | | | | | | | | This patch adds support for codegen of 'target parallel' on the host. It is also the first combined directive that requires two or more captured statements. Support for this functionality is included in the patch. A combined directive such as 'target parallel' has two captured statements, one for the 'target' and the other for the 'parallel' region. Two captured statements are required because each has different implicit parameters (see SemaOpenMP.cpp). For example, the 'parallel' has 'global_tid' and 'bound_tid' while the 'target' does not. The patch adds support for handling multiple captured statements based on the combined directive. When codegen'ing the 'target parallel' directive, the 'target' outlined function is created using the outer captured statement and the 'parallel' outlined function is created using the inner captured statement. Reviewers: ABataev Differential Revision: https://reviews.llvm.org/D28753 llvm-svn: 292374
* [WebAssembly] Add minimal support for the new wasm object format triple.Dan Gohman2017-01-173-4/+9
| | | | llvm-svn: 292269
* [OpenMP] Refactor code that calls codegen for target regions on the device.Arpith Chacko Jacob2017-01-163-54/+69
| | | | | | | | | | | | This patch refactors code that calls codegen for target regions. Currently the codebase only supports the 'target' directive. The patch pulls out common target processing code into a static function that can be called by codegen for any target directive. Reviewers: ABataev Differential Revision: https://reviews.llvm.org/D28752 llvm-svn: 292134
* Remove unused lambda captures. NFCMalcolm Parsons2017-01-133-22/+19
| | | | llvm-svn: 291939
* Use less byval on 32-bit Windows x86 for classes with basesReid Kleckner2017-01-131-22/+38
| | | | | | | | | | | | This comes up in V8, which has a Handle template class that wraps a typed pointer, and is frequently passed by value. The pointer is stored in the base, HandleBase. This change allows us to pass the struct as a pointer instead of using byval. This avoids creating tons of temporary allocas that we copy from during call lowering. Eventually, it would be good to use FCAs here instead. llvm-svn: 291917
* Pass -fprofile-sample-use to lto backends.Dehao Chen2017-01-131-2/+5
| | | | | | | | | | | | Summary: LTO backend will not invoke SampleProfileLoader pass even if -fprofile-sample-use is specified. This patch passes the flag down so that pass manager can add the SampleProfileLoader pass correctly. Reviewers: mehdi_amini, tejohnson Subscribers: cfe-commits Differential Revision: https://reviews.llvm.org/D28588 llvm-svn: 291870
* [tsan] Do not report errors in __destroy_helper_block_Anna Zaks2017-01-131-3/+11
| | | | | | | | | | There is a synchronization point between the reference count of a block dropping to zero and it's destruction, which TSan does not observe. Do not report errors in the compiler-emitted block destroy method and everything called from it. This is similar to https://reviews.llvm.org/D25857 Differential Revision: https://reviews.llvm.org/D28387 llvm-svn: 291868
* Improve handling of instantiated thread_local variables in Itanium C++ ABI.Richard Smith2017-01-132-14/+31
| | | | | | | | | | | | | | | | | * Do not initialize these variables when initializing the rest of the thread_locals in the TU; they have unordered initialization so they can be initialized by themselves. This fixes a rejects-valid bug: we would make the per-variable initializer function internal, but put it in a comdat keyed off the variable, resulting in link errors when the comdat is selected from a different TU (as the per TU TLS init function tries to call an init function that does not exist). * On Darwin, when we decide that we're not going to emit a thread wrapper function at all, demote its linkage to External. Fixes a verifier failure on explicit instantiation of a thread_local variable on Darwin. llvm-svn: 291865
* Revert r291774 which caused buildbot failure.Dehao Chen2017-01-121-5/+2
| | | | llvm-svn: 291775
* Pass -fprofile-sample-use to lto backends.Dehao Chen2017-01-121-2/+5
| | | | | | | | | | | | Summary: LTO backend will not invoke SampleProfileLoader pass even if -fprofile-sample-use is specified. This patch passes the flag down so that pass manager can add the SampleProfileLoader pass correctly. Reviewers: mehdi_amini, tejohnson Subscribers: cfe-commits Differential Revision: https://reviews.llvm.org/D28588 llvm-svn: 291774
* Module: Do not add any link flags when an implementation TU of a module importsManman Ren2017-01-111-1/+7
| | | | | | | | | | | a header of that same module. This fixes a regression caused by r280409. rdar://problem/29930553 This is an updated version for r291628 (which was reverted in r291688). llvm-svn: 291689
* [ARM] Use generic bitreverse intrinsic, rather than ARM specific rbit.Chad Rosier2017-01-101-3/+3
| | | | | | The backend already supports lowering this intrinsic to a rbit instruction. llvm-svn: 291582
* [OpenMP] Sema and parsing for 'target teams distribute simd’ pragmaKelvin Li2017-01-103-0/+16
| | | | | | | | This patch is to implement sema and parsing for 'target teams distribute simd’ pragma. Differential Revision: https://reviews.llvm.org/D28252 llvm-svn: 291579
* CGDecl: Skip static variable initializers in unreachable codeMatthias Braun2017-01-101-2/+2
| | | | | | | | This fixes http://llvm.org/PR31054 Differential Revision: https://reviews.llvm.org/D28505 llvm-svn: 291576
* [AArch64] Use generic bitreverse intrinsic, rather than AArch64 specific.Chad Rosier2017-01-101-2/+2
| | | | | | Differential Revision: https://reviews.llvm.org/D28400 llvm-svn: 291574
* [OpenMP] Basic support for a parallel directive in a target region on an ↵Arpith Chacko Jacob2017-01-104-27/+294
| | | | | | | | | | | | | | | | | | | | | NVPTX device Summary: This patch introduces support for the execution of parallel constructs in a target region on the NVPTX device. Parallel regions must be in the lexical scope of the target directive. The master thread in the master warp signals parallel work for worker threads in worker warps on encountering a parallel region. Note: The patch does not yet support capture of arguments in a parallel region so the test cases are simple. Reviewers: ABataev Differential Revision: https://reviews.llvm.org/D28145 llvm-svn: 291565
* Use the correct ObjC EH personalityBenjamin Kramer2017-01-082-0/+10
| | | | | | | | This fixes ObjC exceptions on Win64 (which uses SEH), among others. Patch by Jonathan Schleifer! llvm-svn: 291408
* [ThinLTO] Optionally ignore empty index fileTeresa Johnson2017-01-061-16/+21
| | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: In order to simplify distributed build system integration, where actions may be scheduled before the Thin Link which determines the list of objects selected by the linker. The gold plugin currently will emit 0-sized index files for objects not selected by the link, to enable checking for expected output files by the build system. If the build system then schedules a backend action for these bitcode files, we want to be able to fall back to normal compilation instead of failing. Fallback is enabled under an option in LLVM (D28410), in which case a nullptr is returned from llvm::getModuleSummaryIndexForFile. Clang can just proceed with non-ThinLTO compilation in that case. I am investigating whether this can be addressed in our build system, but that is a longer term fix and so this enables a workaround in the meantime. Reviewers: mehdi_amini Subscribers: cfe-commits Differential Revision: https://reviews.llvm.org/D28362 llvm-svn: 291303
* Add a cc1 option to force disabling lifetime-markers emission from clangMehdi Amini2017-01-062-1/+5
| | | | | | | | Summary: This intended as a debugging/development flag only. Differential Revision: https://reviews.llvm.org/D28385 llvm-svn: 291300
* Use CodegenOpts::less when creating a TargetMachine for clang `-O1`Mehdi Amini2017-01-061-4/+15
| | | | | | | | | | | | | | | | | | | Summary: Clang was initializing the TargetMachine with CodeGenOpt::Default for O1. This change is aligning it on llc: -O0: OptLevel = CodeGenOpt::None -O1: OptLevel = CodeGenOpt::Less -O2 -Os -Oz: OptLevel = CodeGenOpt::Default -O3: OptLevel = CodeGenOpt::Aggressive Reviewers: echristo, chandlerc Subscribers: cfe-commits Differential Revision: https://reviews.llvm.org/D28409 llvm-svn: 291276
* Clean up redundant isa<T> before getAs<T>. NFC.George Burgess IV2017-01-061-4/+2
| | | | llvm-svn: 291264
* [ubsan] Minimize size of data for type_mismatch (Redo of D19667)Filipe Cabecinhas2017-01-062-6/+7
| | | | | | | | | | | | | | | | | | Summary: This patch makes the type_mismatch static data 7 bytes smaller (and it ends up being 16 bytes smaller due to alignment restrictions, at least on some x86-64 environments). It revs up the type_mismatch handler version since we're breaking binary compatibility. I will soon post a patch for the compiler-rt side. Reviewers: rsmith, kcc, vitalybuka, pgousseau, gbedwell Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D28242 llvm-svn: 291236
* Add vec_insert4b and vec_extract4b functions to altivec.hSean Fertile2017-01-051-0/+84
| | | | | | | | | Add builtins for the functions and custom codegen mapping the builtins to their corresponding intrinsics and handling the endian related swapping. https://reviews.llvm.org/D26546 llvm-svn: 291179
* [OpenMP] Add fields for flags in the offload entry descriptor.Samuel Antao2017-01-054-22/+43
| | | | | | | | | | | | | | | | | Summary: This patch adds two fields to the offload entry descriptor. One field is meant to signal Ctors/Dtors and `link` global variables, and the other is reserved for runtime library use. Currently, these fields are only filled with zeros in the current code generation, but that will change when `declare target` is added. The reason, we are adding these fields now is to make the code generation consistent with the runtime library proposal under review in https://reviews.llvm.org/D14031. Reviewers: ABataev, hfinkel, carlo.bertolli, kkwli0, arpith-jacob, Hahnfeld Subscribers: cfe-commits, caomhin, jholewinski Differential Revision: https://reviews.llvm.org/D28298 llvm-svn: 291124
* CodeGen: plumb header search down to the IASSaleem Abdulrasool2017-01-053-19/+36
| | | | | | | | | | | inline assembly may use the `.include` directive to include other content into the file. Without the integrated assembler, the `-I` group gets passed to the assembler. Emulate this by collecting the header search paths and passing them to the IAS. Resolves PR24811! llvm-svn: 291123
* [OpenMP] Update target codegen for NVPTX device.Arpith Chacko Jacob2017-01-052-162/+153
| | | | | | | | | | | | | | | | | This patch includes updates for codegen of the target region for the NVPTX device. It moves initializers from the compiler to the runtime and updates the worker loop to assume parallel work is retrieved from the runtime. A subsequent patch will update the codegen to retrieve the parallel work using calls to the runtime. It includes the removal of the inline attribute for the worker loop and disabling debug info in it. This allows codegen for a target directive and serial execution on the NVPTX device. Reviewers: ABataev Differential Revision: https://reviews.llvm.org/D28125 llvm-svn: 291121
* Correct Vectorcall Register passing and HVA BehaviorErich Keane2017-01-051-26/+180
| | | | | | | | | | | | | Front end component (back end changes are D27392). The vectorcall calling convention was broken subtly in two cases. First, it didn't properly handle homogeneous vector aggregates (HVAs). Second, the vectorcall specification requires that only the first 6 parameters be eligible for register assignment. This patch fixes both issues. Differential Revision: https://reviews.llvm.org/D27529 llvm-svn: 291041
* Reverting commit r290983 while debugging test failure on windows.Arpith Chacko Jacob2017-01-042-153/+162
| | | | llvm-svn: 290989
* [OpenMP] Update target codegen for NVPTX device.Arpith Chacko Jacob2017-01-042-162/+153
| | | | | | | | | | | | | | | | | This patch includes updates for codegen of the target region for the NVPTX device. It moves initializers from the compiler to the runtime and updates the worker loop to assume parallel work is retrieved from the runtime. A subsequent patch will update the codegen to retrieve the parallel work using calls to the runtime. It includes the removal of the inline attribute for the worker loop and disabling debug info in it. This allows codegen for a target directive and serial execution on the NVPTX device. Reviewers: ABataev Differential Revision: https://reviews.llvm.org/D28125 llvm-svn: 290983
* Add -f[no-]strict-return flag that can be used to avoid undefined behaviourAlex Lorenz2017-01-041-4/+23
| | | | | | | | | | | | | | | | | | | | | | | in non-void functions that fall off at the end without returning a value when compiling C++. Clang uses the new compiler flag to determine when it should treat control flow paths that fall off the end of a non-void function as unreachable. If -fno-strict-return is on, the code generator emits the ureachable and trap IR only when the function returns either a record type with a non-trivial destructor or another non-trivially copyable type. The primary goal of this flag is to avoid treating falling off the end of a non-void function as undefined behaviour. The burden of undefined behaviour is placed on the caller instead: if the caller ignores the returned value then the undefined behaviour is avoided. This kind of behaviour is useful in several cases, e.g. when compiling C code in C++ mode. rdar://13102603 Differential Revision: https://reviews.llvm.org/D27163 llvm-svn: 290960
* [Win64] Don't widen integer literal zero arguments to unprototyped function ↵Reid Kleckner2017-01-031-1/+1
| | | | | | | | | | | | | | | | | | | | | calls The special case to widen the integer literal zero when passed to variadic function calls should only apply to variadic functions, not unprototyped functions. This is consistent with what MSVC does. In this test case, MSVC uses a 4-byte store to pass the 5th argument to 'kr' and an 8-byte store to pass the zero to 'v': void v(int, ...); void kr(); void f(void) { v(1, 2, 3, 4, 0); kr(1, 2, 3, 4, 0); } Aaron Ballman discovered this issue in https://reviews.llvm.org/D28166 llvm-svn: 290906
* [OpenMP] Code cleanup for NVPTX OpenMP codegenArpith Chacko Jacob2017-01-032-65/+31
| | | | | | | | | | | This patch cleans up private methods for NVPTX OpenMP codegen. It converts private members to static functions to follow the coding style of CGOpenMPRuntime.cpp and declutter the header file. Reviewers: ABataev Differential Revision: https://reviews.llvm.org/D28124 llvm-svn: 290904
* [OPENMP] Private, firstprivate, and lastprivate clauses for distribute, host ↵Carlo Bertolli2017-01-031-0/+18
| | | | | | | | | | | | code generation https://reviews.llvm.org/D17840 This patch enables private, firstprivate, and lastprivate clauses for the OpenMP distribute directive. Regression tests differ from the similar case of the same clauses on the for directive, by removing a reference to two global variables g and g1. This is necessary because: 1. a distribute pragma is only allowed inside a target region; 2. referring a global variable (e.g. g and g1) in a target region requires the program to enclose the variable in a "declare target" region; 3. declare target pragmas, which are used to define a declare target region, are currently unavailable in clang (patch being prepared). For this reason, I moved the global declarations into local variables. llvm-svn: 290898
* [OpenMP] Sema and parsing for 'target teams distribute parallel for simd’ ↵Kelvin Li2017-01-033-0/+16
| | | | | | | | | | pragma This patch is to implement sema and parsing for 'target teams distribute parallel for simd’ pragma. Differential Revision: https://reviews.llvm.org/D28202 llvm-svn: 290862
* CodeGen: update comment about RTTI fieldSaleem Abdulrasool2017-01-011-1/+1
| | | | | | | | | | The MS ABI RTTI has a reserved field which is used as a cache for the demangled name. It must be zero-initialized, which is used as a hint by the runtime to say that the cache has not been populated. Since this field is populated at runtime, the RTTI structures must be placed in the .data section rather than .rdata. NFC llvm-svn: 290799
* CodeGen: use a StringSwitch instead of cascasding ifsSaleem Abdulrasool2016-12-301-15/+8
| | | | | | | Change the cascading ifs to a StringSwitch to simplify the conversion of the relocation model. NFC llvm-svn: 290762
* [OpenMP] Sema and parsing for 'target teams distribute parallel for’ pragmaKelvin Li2016-12-293-0/+16
| | | | | | | | This patch is to implement sema and parsing for 'target teams distribute parallel for’ pragma. Differential Revision: https://reviews.llvm.org/D28160 llvm-svn: 290725
* [ItaniumABI] NFC changesPiotr Padlewski2016-12-281-2/+3
| | | | llvm-svn: 290677
* [ThinLTO] No need to rediscover imports in distributed backendTeresa Johnson2016-12-281-4/+16
| | | | | | | | | | | | | | | | | | | | | | | | Summary: We can simply import all external values with summaries included in the individual index file created for the distributed backend job, as only those are added to the individual index file created by the WriteIndexesThinBackend (in addition to summaries for the original module, which are skipped here). While computing the cross module imports on this index would come to the same conclusion as the original thin link import logic, it is unnecessary work. And when tuning, it avoids the need to pass the same function importing parameters (e.g. -import-instr-limit) to both the thin link and the backends (otherwise they won't make the same decisions). Reviewers: mehdi_amini, pcc Subscribers: cfe-commits Differential Revision: https://reviews.llvm.org/D28139 llvm-svn: 290674
* Fix format. NFCKelvin Li2016-12-281-5/+8
| | | | llvm-svn: 290673
* [CodeGen] Unique constant CompoundLiterals.George Burgess IV2016-12-282-4/+32
| | | | | | | | | | | | | | | | | | | | Our newly aggressive constant folding logic makes it possible for CGExprConstant to see the same CompoundLiteralExpr more than once. So, emitting a new GlobalVariable every time we see a CompoundLiteral is no longer correct. We had a similar issue with BlockExprs that was caught while testing said aggressive folding, so I applied the same style of fix (see D26410) here. If we find yet another case where this needs to happen, we should probably refactor this so we don't have a third DenseMap+getter+setter. As a design note: getAddrOfConstantCompoundLiteralIfEmitted is really only intended to be called by ConstExprEmitter::EmitLValue. So, returning a GlobalVariable* instead of a ConstantAddress costs us effectively nothing, and saves us either a few bytes per entry in our map or a bit of code duplication. llvm-svn: 290661
* DebugInfo: Don't include size/alignment on class declarationsDavid Blaikie2016-12-271-6/+0
| | | | | | | | | | This seems like it must've been a leftover by accident - no tests were backing it up & it doesn't make much sense to include size/alignment on class declarations (it'd only be on those declarations for which the definition was available - otherwise the size/alignment would not be known). llvm-svn: 290631
* [PH] Teach the new PM code path to support -disable-llvm-passes.Chandler Carruth2016-12-271-10/+13
| | | | | | | | | | | This is kind of funny because I specifically did work to make this easy and then it didn't actually get implemented. I've also ported a set of tests that rely on this functionality to run with the new PM as well as the old PM so that we don't mess this up in the future. llvm-svn: 290558
* [DebugInfo] Added support for Checksum debug info feature.Amjad Aboud2016-12-252-5/+48
| | | | | | Differential Revision: https://reviews.llvm.org/D27641 llvm-svn: 290515
* [OpenMP] Sema and parsing for 'target teams distribute' pragmaKelvin Li2016-12-253-0/+14
| | | | | | | | This patch is to implement sema and parsing for 'target teams distribute' pragma. Differential Revision: https://reviews.llvm.org/D28015 llvm-svn: 290508
OpenPOWER on IntegriCloud