summaryrefslogtreecommitdiffstats
path: root/clang/lib/CodeGen
Commit message (Collapse)AuthorAgeFilesLines
...
* [OPENMP] Codegen for `distribute parallel for simd` directive.Alexey Bataev2017-11-271-7/+5
| | | | | | | Initial codegen for `#pragma omp distribute parallel for simd` directive and its clauses. llvm-svn: 319079
* [OPENMP] Improve handling of cancel directives in target-basedAlexey Bataev2017-11-272-6/+11
| | | | | | | | | constructs, NFC. Improved handling of cancel|cancellation point directives inside target-based for directives. llvm-svn: 319046
* [CodeGen] Collect information about sizes of accesses and access types for TBAAIvan A. Kosarev2017-11-275-55/+99
| | | | | | | | | | | | | | The information about access and type sizes is necessary for producing TBAA metadata in the new size-aware format. With this patch, D39955 and D39956 in place we should be able to change CodeGenTBAA::createScalarTypeNode() and CodeGenTBAA::getBaseTypeInfo() to generate metadata in the new format under the -new-struct-path-tbaa command-line option. For now, this new information remains unused. Differential Revision: https://reviews.llvm.org/D40176 llvm-svn: 319012
* [OPENMP] Add support for cancel constructs in `target teams distributeAlexey Bataev2017-11-221-1/+5
| | | | | | | | | parallel for`. Add support for cancel/cancellation point directives inside `target teams distribute parallel for` directives. llvm-svn: 318881
* [OPENMP] Add support for cancel constructs in [teams] distributeAlexey Bataev2017-11-221-7/+17
| | | | | | | | | parallel for directives. Added codegen/sema support for cancel constructs in [teams] distribute parallel for directives. llvm-svn: 318872
* Revert "[CodeGen] Fix vtable not receiving hidden visibility when using ↵Petr Hosek2017-11-226-21/+19
| | | | | | | | push(visibility)" This reverts commit r318853: tests are failing on Windows bots llvm-svn: 318866
* [CodeGen] Fix vtable not receiving hidden visibility when using push(visibility)Petr Hosek2017-11-226-19/+21
| | | | | | | | | | This change should resolve https://bugs.llvm.org/show_bug.cgi?id=35022 Patch by Jake Ehrlich Differential Revision: https://reviews.llvm.org/D39627 llvm-svn: 318853
* [OPENMP] Do not mark captured variables as artificial in debug info.Alexey Bataev2017-11-222-6/+32
| | | | | | | Captured variables should not be marked as artificial parameters in outlined functions in debug info. llvm-svn: 318843
* [OpenMP] Adjust arguments of nvptx runtime functionsJonas Hahnfeld2017-11-221-12/+20
| | | | | | | | | | | | | In the future the compiler will analyze whether the OpenMP runtime needs to be (fully) initialized and avoid that overhead if possible. The functions already take an argument to transfer that information to the runtime, so pass in the default value 1. (This is needed for binary compatibility with libomptarget-nvptx currently being upstreamed.) Differential Revision: https://reviews.llvm.org/D40354 llvm-svn: 318836
* [OPENMP] Codegen for `target teams` directive.Alexey Bataev2017-11-221-2/+11
| | | | | | Added codegen of the clauses for `target teams` directive. llvm-svn: 318834
* [X86] Update CPUSupports code to reuse LLVM .def file [NFC]Erich Keane2017-11-221-71/+5
| | | | llvm-svn: 318815
* [Clang][OpenMP] New clang/libomptarget map interface: new function ↵George Rokos2017-11-211-53/+57
| | | | | | | | | | | signatures, clang-side This clang patch changes the __tgt_* API function signatures in preparation for the new map interface. Changes are: Device IDs 32bits --> 64bits, Flags 32bits --> 64bits Differential revision: https://reviews.llvm.org/D40281 llvm-svn: 318789
* Add -finstrument-function-entry-bare flagHans Wennborg2017-11-211-9/+15
| | | | | | | | | | | | | | | | | This is an instrumentation flag that's similar to -finstrument-functions, but it only inserts calls on function entry, the calls are inserted post-inlining, and they don't take any arugments. This is intended for users who want to instrument function entry with minimal overhead. (-pg would be another alternative, but forces frame pointer emission and affects link flags, so is probably best left alone to be used for generating gcov data.) Differential revision: https://reviews.llvm.org/D40276 llvm-svn: 318785
* [OPENMP] Initial support for asynchronous data update, NFC.Alexey Bataev2017-11-211-3/+24
| | | | | | | | | OpenMP 5.0 introduces asynchronous data update/dependecies clauses on target data directives. Patch adds initial support for outer task regions to use task-based codegen for future async target data directives. llvm-svn: 318781
* [OpenMP] Add implicit data sharing support when offloading to NVIDIA GPUs ↵Gheorghe-Teodor Bercea2017-11-212-23/+182
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | using OpenMP device offloading Summary: This patch is part of the development effort to add support in the current OpenMP GPU offloading implementation for implicitly sharing variables between a target region executed by the team master thread and the worker threads within that team. This patch is the first of three required for successfully performing the implicit sharing of master thread variables with the worker threads within a team. The remaining two patches are: - Patch D38978 to the LLVM NVPTX backend which ensures the lowering of shared variables to an device memory which allows the sharing of references; - Patch (coming soon) is a patch to libomptarget runtime library which ensures that a list of references to shared variables is properly maintained. A simple code snippet which illustrates an implicit data sharing situation is as follows: ``` #pragma omp target { // master thread only int v; #pragma omp parallel { // worker threads // use v } } ``` Variable v is implicitly shared from the team master thread which executes the code in between the target and parallel directives. The worker threads must operate on the latest version of v, including any updates performed by the master. The code generated in this patch relies on the LLVM NVPTX patch (mentioned above) which prevents v from being lowered in the thread local memory of the master thread thus making the reference to this variable un-shareable with the workers. This ensures that the code generated by this patch is correct. Since the parallel region is outlined the passing of arguments to the outlined regions must preserve the original order of arguments. The runtime therefore maintains a list of references to shared variables thus ensuring their passing in the correct order. The passing of arguments to the outlined parallel function is performed in a separate function which the data sharing infrastructure constructs in this patch. The function is inlined when optimizations are enabled. Reviewers: hfinkel, carlo.bertolli, arpith-jacob, Hahnfeld, ABataev, caomhin Reviewed By: ABataev Subscribers: cfe-commits, jholewinski Differential Revision: https://reviews.llvm.org/D38976 llvm-svn: 318773
* [CodeGen] Generate TBAA type descriptors in a more reliable mannerIvan A. Kosarev2017-11-212-43/+64
| | | | | | | | | This patch introduces a couple of helper functions that make it possible to handle the caching logic in a single place. Differential Revision: https://reviews.llvm.org/D39953 llvm-svn: 318752
* [OpenMP] Initial implementation of code generation for pragma 'teams ↵Carlo Bertolli2017-11-201-12/+22
| | | | | | | | | | distribute parallel for' on host https://reviews.llvm.org/D40187 This patch implements code gen for 'teams distribute parallel for' on the host, including all its clauses and related regression tests. llvm-svn: 318692
* [CodeGen] Move Reciprocals option from TargetOptions to CodeGenOptionsCraig Topper2017-11-201-1/+1
| | | | | | Diffrential Revision: https://reviews.llvm.org/D40226 llvm-svn: 318662
* Fix some -Wunused-variable warningsHans Wennborg2017-11-183-3/+0
| | | | llvm-svn: 318578
* [CodeGen] Compute the objc EH vtable address point using inbounds GEP.Ahmed Bougacha2017-11-171-2/+3
| | | | | | | | | | | | | | | | The object is provided by the objc runtime and is never visible in the module itself, but even so, the address point we compute points into it, and "+16" is guaranteed not to overflow. This matches the c++ vtable IRGen. Note that I'm not entirely convinced the 'i8*' type is correct here: at the IR level, we're accessing memory that's outside the global object. But we don't control the allocation, so it's not obviously wrong either. But either way, this is only in a global initializer, so I don't think it's going to be mucked with. Filed PR35352 to discuss that. llvm-svn: 318545
* [OPENMP] Codegen for `target simd` construct.Alexey Bataev2017-11-173-84/+114
| | | | | | Added codegen support for `target simd` directive. llvm-svn: 318536
* Update for layering fix in LLVM CodeGen<>TargetDavid Blaikie2017-11-171-1/+1
| | | | llvm-svn: 318491
* [MS] Apply adjustments after storing 'this'Reid Kleckner2017-11-165-71/+52
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: The MS ABI convention is that the 'this' pointer on entry is the address of the vfptr that was used to make the virtual method call. In other words, the pointer on entry always points to the base subobject that introduced the virtual method. Consider this hierarchy: struct A { virtual void f() = 0; }; struct B { virtual void g() = 0; }; struct C : A, B { void f() override; void g() override; }; On entry to C::g, [ER]CX will contain the address of C's B subobject, and C::g will have to subtract sizeof(A) to recover a pointer to C. Before this change, we applied this adjustment in the prologue and stored the new value into the "this" local variable alloca used for debug info. However, MSVC does not do this, presumably because it is often profitable to fold the adjustment into later field accesses. This creates a problem, because the debugger expects the variable to be unadjusted. Unfortunately, CodeView doesn't have anything like DWARF expressions for computing variables that aren't in the program anymore, so we have to declare 'this' to be the unadjusted value if we want the debugger to see the right value. This has the side benefit that, in optimized builds, the 'this' pointer will usually be available on function entry because it doesn't require any adjustment. Reviewers: hans Subscribers: aprantl, cfe-commits Differential Revision: https://reviews.llvm.org/D40109 llvm-svn: 318440
* [OPENMP] Add support for cancelling inside target parallel forAlexey Bataev2017-11-163-14/+18
| | | | | | | | directive. Added missed support for cancelling of target parallel for construct. llvm-svn: 318434
* [OpenCL] Fix code generation of function-scope constant samplers.Alexey Bader2017-11-151-0/+4
| | | | | | | | | | | | | | | | | Summary: Constant samplers are handled as static variables and clang's code generation library, which leads to llvm::unreachable. We bypass emitting sampler variable as static since it's translated to a function call later. Reviewers: yaxunl, Anastasia Reviewed By: yaxunl, Anastasia Subscribers: cfe-commits Differential Revision: https://reviews.llvm.org/D34342 llvm-svn: 318290
* Simplify CpuIs code to use include from LLVMErich Keane2017-11-151-89/+18
| | | | | | | | | | | | LLVM exposes a file in the backend (X86TargetParser.def) that contains information about the correct list of CpuIs values. This patch removes 2 of the copied and pasted versions of this list from clang and instead includes the data from the .def file. Differential Revision: https://reviews.llvm.org/D40054 llvm-svn: 318234
* [PGO] Detect more structural changes with the stable hashVedant Kumar2017-11-141-13/+153
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Lifting from Bob Wilson's notes: The hash value that we compute and store in PGO profile data to detect out-of-date profiles does not include enough information. This means that many significant changes to the source will not cause compiler warnings about the profile being out of date, and worse, we may continue to use the outdated profile data to make bad optimization decisions. There is some tension here because some source changes won't affect PGO and we don't want to invalidate the profile unnecessarily. This patch adds a new hashing scheme which is more sensitive to loop nesting, conditions, and out-of-order control flow. Here are examples which show snippets which get the same hash under the current scheme, and different hashes under the new scheme: Loop Nesting Example -------------------- // Snippet 1 while (foo()) { while (bar()) {} } // Snippet 2 while (foo()) {} while (bar()) {} Condition Example ----------------- // Snippet 1 if (foo()) bar(); baz(); // Snippet 2 if (foo()) bar(); else baz(); Out-of-order Control Flow Example --------------------------------- // Snippet 1 while (foo()) { if (bar()) {} baz(); } // Snippet 2 while (foo()) { if (bar()) continue; baz(); } In each of these cases, it's useful to differentiate between the snippets because swapping their profiles gives bad optimization hints. The new hashing scheme considers some logical operators in an effort to detect more changes in conditions. This isn't a perfect scheme. E.g, it does not produce the same hash for these equivalent snippets: // Snippet 1 bool c = !a || b; if (d && e) {} // Snippet 2 bool f = d && e; bool c = !a || b; if (f) {} This would require an expensive data flow analysis. Short of that, the new hashing scheme looks reasonably complete, based on a scan over the statements we place counters on. Profiles which use the old version of the PGO hash remain valid and can be used without issue (there are tests in tree which check this). rdar://17068282 Differential Revision: https://reviews.llvm.org/D39446 llvm-svn: 318229
* Switch -mcount and -finstrument-functions to emit EnterExitInstrumenter ↵Hans Wennborg2017-11-142-37/+18
| | | | | | | | | | | | | | | | attributes This updates -mcount to use the new attribute names (LLVM r318195), and switches over -finstrument-functions to also use these attributes rather than inserting instrumentation in the frontend. It also adds a new flag, -finstrument-functions-after-inlining, which makes the cygprofile instrumentation get inserted after inlining rather than before. Differential Revision: https://reviews.llvm.org/D39331 llvm-svn: 318199
* [NewPassManager] Pass the -fdebug-pass-manager flag setting into the ↵Craig Topper2017-11-141-4/+4
| | | | | | | | | | | | | | | | Analysis managers to match what we do in opt Summary: Currently the -fdebug-pass-manager flag for clang doesn't enable the debug logging in the analysis managers. This is different than what the switch does when passed to opt. Reviewers: chandlerc Reviewed By: chandlerc Subscribers: cfe-commits Differential Revision: https://reviews.llvm.org/D40007 llvm-svn: 318140
* [PM] Wire up support for the bounds checking sanitizer with the new PM.Chandler Carruth2017-11-141-0/+14
| | | | | | | | | | | | | | | Not much interesting here. Mostly wiring things together. One thing worth noting is that the approach is substantially different from the old PM. Here, the -O0 case works fundamentally differently in that we just directly build the pipeline without any callbacks or other cruft. In some ways, this is nice and clean. However, I don't like that it causes the sanitizers to be enabled with different changes at different times. =/ Suggestions for a better way to do this are welcome. Differential Revision: https://reviews.llvm.org/D39085 llvm-svn: 318131
* [PM] Add a missing header that I had in the next commit but was neededChandler Carruth2017-11-141-0/+1
| | | | | | in r318128. Should fix the build. llvm-svn: 318130
* [PM] Port BoundsChecking to the new PM.Chandler Carruth2017-11-141-1/+1
| | | | | | | | | | | Registers it and everything, updates all the references, etc. Next patch will add support to Clang's `-fexperimental-new-pass-manager` path to actually enable BoundsChecking correctly. Differential Revision: https://reviews.llvm.org/D39084 llvm-svn: 318128
* OpenCL: Assume inline asm is convergentMatt Arsenault2017-11-131-4/+5
| | | | | | Already done for CUDA. llvm-svn: 318098
* [CodeGen] fix const-ness of cbrt and fmaSanjay Patel2017-11-131-9/+5
| | | | | | | | | | | cbrt() is always constant because it can't overflow or underflow. Therefore, it can't set errno. fma() is not always constant because it can overflow or underflow. Therefore, it can set errno. But we know that it never sets errno on GNU / MSVC, so make it constant in those environments. Differential Revision: https://reviews.llvm.org/D39641 llvm-svn: 318093
* [clang] Remove redundant return [NFC]Mandeep Singh Grang2017-11-131-1/+0
| | | | | | | | | | | | | | Reviewers: rsmith, sfantao, mcrosier Reviewed By: mcrosier Subscribers: jholewinski, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D39915 llvm-svn: 318074
* [ThinLTO] Handle -fdebug-pass-manager for backend invocations via clangTeresa Johnson2017-11-131-0/+1
| | | | | | | | | | | | | Recommit of r317951 and r317951 along with what I believe should fix the remaining buildbot failures - the target triple should be specified for both the ThinLTO pre-thinlink compile and backend (post-thinlink) compile to ensure it is consistent. Original description: The LTO Config field wasn't being set when invoking a ThinLTO backend via clang (i.e. for distributed builds). llvm-svn: 318042
* [coroutines] Promote cleanup.dest.slot allocas to registers to avoid storing ↵Gor Nishanov2017-11-112-17/+35
| | | | | | | | | | | | | | | | | | | | | | it in the coroutine frame Summary: We don't want to store cleanup dest slot saved into the coroutine frame (as some of the cleanup code may access them after coroutine frame destroyed). This is an alternative to https://reviews.llvm.org/D37093 It is possible to do this for all functions, but, cursory check showed that in -O0, we get slightly longer function (by 1-3 instructions), thus, we are only limiting cleanup.dest.slot elimination to coroutines. Reviewers: rjmccall, hfinkel, eric_niebler Reviewed By: eric_niebler Subscribers: EricWF, cfe-commits Differential Revision: https://reviews.llvm.org/D39768 llvm-svn: 317981
* Add CLANG_DEFAULT_OBJCOPY to allow Clang to use llvm-objcopy for dwarf fissionJake Ehrlich2017-11-112-41/+41
| | | | | | | | | | | | llvm-objcopy is getting to where it can be used in non-trivial ways (such as for dwarf fission in clang). It now supports dwarf fission but this feature hasn't been thoroughly tested yet. This change allows people to optionally build clang to use llvm-objcopy rather than GNU objcopy. By default GNU objcopy is still used so nothing should change. Differential Revision: https://reviews.llvm.org/D39029 llvm-svn: 317960
* Revert "[ThinLTO] Handle -fdebug-pass-manager for backend invocations via clang"Teresa Johnson2017-11-111-1/+0
| | | | | | | This reverts commit r317951 and r317952. The new test is aborting on some bots and I'll need to investigate later. llvm-svn: 317959
* [ThinLTO] Handle -fdebug-pass-manager for backend invocations via clangTeresa Johnson2017-11-101-0/+1
| | | | | | | | | | | | | | Summary: The LTO Config field wasn't being set when invoking a ThinLTO backend via clang (i.e. for distributed builds). Reviewers: danielcdh Subscribers: mehdi_amini, inglorion, eraman, cfe-commits Differential Revision: https://reviews.llvm.org/D39923 llvm-svn: 317951
* Remove declaration of EmitMCountInstrumentation(). NFCHans Wennborg2017-11-101-3/+0
| | | | | | The definition was removed in r280355. llvm-svn: 317944
* [OPENMP] Codegen for `#pragma omp target parallel for simd`.Alexey Bataev2017-11-094-12/+47
| | | | | | Added codegen for `#pragma omp target parallel for simd` and clauses. llvm-svn: 317813
* Fix a bug with the use of __builtin_bzero in a conditional expression.John McCall2017-11-091-1/+1
| | | | | | Patch by Bharathi Seshadri! llvm-svn: 317776
* [Coverage] Emit deferred regions in headersVedant Kumar2017-11-091-3/+5
| | | | | | | | | | | There are some limitations with emitting regions in macro expansions because we don't gather file IDs within the expansions. Fix the check that prevents us from emitting deferred regions in expansions to make an exception for headers, which is something we can handle. rdar://35373009 llvm-svn: 317760
* [Coverage] Complete top-level deferred regions before labelsVedant Kumar2017-11-091-3/+38
| | | | | | | | | | | | | | | | | | | The area immediately after a terminated region in the function top-level should have the same count as the label it precedes. This solves another problem with wrapped segments. Consider: 1| a: 2| return 0; 3| b: 4| return 1; Without a gap area starting after the first return, the wrapped segment from line 2 would make it look like line 3 is executed, when it's not. rdar://35373009 llvm-svn: 317759
* [Coverage] Emit a gap area after if conditionsVedant Kumar2017-11-091-1/+26
| | | | | | | | | | | | | | | | | | | The area immediately after the closing right-paren of an if condition should have a count equal to the 'then' block's count. Use a gap region to set this count, so that region highlighting for the 'then' block remains precise. This solves a problem we have with wrapped segments. Consider: 1| if (false) 2| foo(); Without a gap area starting after the condition, the wrapped segment from line 1 would make it look like line 2 is executed, when it's not. rdar://35373009 llvm-svn: 317758
* [OPENMP] Codegen for `#pragma omp target parallel for`.Alexey Bataev2017-11-084-9/+43
| | | | llvm-svn: 317719
* [NVPTX] Implement __nvvm_atom_add_gen_d builtin.Justin Lebar2017-11-071-0/+10
| | | | | | | | | | | | | | | Summary: This just seems to have been an oversight. We already supported the f64 atomic add with an explicit scope (e.g. "cta"), but not the scopeless version. Reviewers: tra Subscribers: jholewinski, sanjoy, cfe-commits, llvm-commits, hiraditya Differential Revision: https://reviews.llvm.org/D39638 llvm-svn: 317623
* New clang option -fno-plt which avoids the PLT and lazy binding while making ↵Sriraman Tallam2017-11-071-0/+10
| | | | | | | | external calls. Differential Revision: https://reviews.llvm.org/D39079 llvm-svn: 317605
* Clang/libomptarget map interface flag renaming - NFC patchGeorge Rokos2017-11-071-41/+38
| | | | | | | | | | This patch renames some of the flag names of the clang/libomptarget map interface. The old names are slightly misleading, whereas the new ones describe in a better way what each flag is about. Only the macros within the enumeration are renamed, there is no change in functionality therefore there are no updated regression tests. Differential Revision: https://reviews.llvm.org/D39745 llvm-svn: 317598
OpenPOWER on IntegriCloud