path: root/clang/lib/CodeGen
Commit message | Author | Age | Files | Lines
...
* [CodeGen] Generate TBAA type descriptors in a more reliable manner | Ivan A. Kosarev | 2017-11-21 | 2 | -43/+64
| | | | | | | | | This patch introduces a couple of helper functions that make it possible to handle the caching logic in a single place. Differential Revision: https://reviews.llvm.org/D39953 llvm-svn: 318752
* [OpenMP] Initial implementation of code generation for pragma 'teams distribute parallel for' on host | Carlo Bertolli | 2017-11-20 | 1 | -12/+22
| | | | | | | | | https://reviews.llvm.org/D40187 This patch implements code gen for 'teams distribute parallel for' on the host, including all its clauses and related regression tests. llvm-svn: 318692
* [CodeGen] Move Reciprocals option from TargetOptions to CodeGenOptions | Craig Topper | 2017-11-20 | 1 | -1/+1
| | | | | | Differential Revision: https://reviews.llvm.org/D40226 llvm-svn: 318662
* Fix some -Wunused-variable warnings | Hans Wennborg | 2017-11-18 | 3 | -3/+0
| | | | llvm-svn: 318578
* [CodeGen] Compute the objc EH vtable address point using inbounds GEP. | Ahmed Bougacha | 2017-11-17 | 1 | -2/+3
| | | | | | | | | | | | | | | | The object is provided by the objc runtime and is never visible in the module itself, but even so, the address point we compute points into it, and "+16" is guaranteed not to overflow. This matches the c++ vtable IRGen. Note that I'm not entirely convinced the 'i8*' type is correct here: at the IR level, we're accessing memory that's outside the global object. But we don't control the allocation, so it's not obviously wrong either. But either way, this is only in a global initializer, so I don't think it's going to be mucked with. Filed PR35352 to discuss that. llvm-svn: 318545
* [OPENMP] Codegen for `target simd` construct. | Alexey Bataev | 2017-11-17 | 3 | -84/+114
| | | | | | Added codegen support for `target simd` directive. llvm-svn: 318536
* Update for layering fix in LLVM CodeGen<>Target | David Blaikie | 2017-11-17 | 1 | -1/+1
| | | | llvm-svn: 318491
* [MS] Apply adjustments after storing 'this' | Reid Kleckner | 2017-11-16 | 5 | -71/+52
| Summary:
| The MS ABI convention is that the 'this' pointer on entry is the address of the vfptr that was used to make the virtual method call. In other words, the pointer on entry always points to the base subobject that introduced the virtual method. Consider this hierarchy:
|
|   struct A { virtual void f() = 0; };
|   struct B { virtual void g() = 0; };
|   struct C : A, B {
|     void f() override;
|     void g() override;
|   };
|
| On entry to C::g, [ER]CX will contain the address of C's B subobject, and C::g will have to subtract sizeof(A) to recover a pointer to C.
|
| Before this change, we applied this adjustment in the prologue and stored the new value into the "this" local variable alloca used for debug info. However, MSVC does not do this, presumably because it is often profitable to fold the adjustment into later field accesses. This creates a problem, because the debugger expects the variable to be unadjusted. Unfortunately, CodeView doesn't have anything like DWARF expressions for computing variables that aren't in the program anymore, so we have to declare 'this' to be the unadjusted value if we want the debugger to see the right value.
|
| This has the side benefit that, in optimized builds, the 'this' pointer will usually be available on function entry because it doesn't require any adjustment.
|
| Reviewers: hans
| Subscribers: aprantl, cfe-commits
| Differential Revision: https://reviews.llvm.org/D40109
| llvm-svn: 318440
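The pointer adjustment this message describes can be observed from ordinary C++, independent of any debugger. The following is a minimal sketch, assuming a mainstream ABI in which the A subobject is laid out before the B subobject; the helper names `offsetOfBInC` and `recoverC` are hypothetical and are not part of Clang or of the patch.

```cpp
#include <cassert>
#include <cstddef>

// Same hierarchy as in the commit message, with trivial bodies so it links.
struct A { virtual void f() = 0; };
struct B { virtual void g() = 0; };
struct C : A, B {
  void f() override {}
  void g() override {}
};

// Hypothetical helper: byte distance from the start of a C object to its
// B base subobject. This is the amount that has to be subtracted from a
// B-relative 'this' to get back to the complete C object.
std::ptrdiff_t offsetOfBInC(C &c) {
  const char *whole = reinterpret_cast<const char *>(&c);
  const char *base = reinterpret_cast<const char *>(static_cast<B *>(&c));
  return base - whole;
}

// Hypothetical helper: let the compiler apply the reverse adjustment,
// going from the B subobject back to the complete object.
C *recoverC(B *b) { return static_cast<C *>(b); }
```

On common 64-bit targets `offsetOfBInC` is expected to equal `sizeof(A)`, which matches the "subtract sizeof(A)" wording above, but the exact layout is ABI-defined, so the sketch only relies on the offset being non-zero.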
* [OPENMP] Add support for cancelling inside target parallel for directive. | Alexey Bataev | 2017-11-16 | 3 | -14/+18
| | | | | Added missed support for cancelling of target parallel for construct. llvm-svn: 318434
* [OpenCL] Fix code generation of function-scope constant samplers. | Alexey Bader | 2017-11-15 | 1 | -0/+4
| | | | | | | | | | | | | | | Summary: Constant samplers are handled as static variables by clang's code generation library, which leads to llvm::unreachable. We bypass emitting the sampler variable as a static since it's translated to a function call later. Reviewers: yaxunl, Anastasia Reviewed By: yaxunl, Anastasia Subscribers: cfe-commits Differential Revision: https://reviews.llvm.org/D34342 llvm-svn: 318290
* Simplify CpuIs code to use include from LLVM | Erich Keane | 2017-11-15 | 1 | -89/+18
| | | | | | | | | | | | LLVM exposes a file in the backend (X86TargetParser.def) that contains information about the correct list of CpuIs values. This patch removes 2 of the copied and pasted versions of this list from clang and instead includes the data from the .def file. Differential Revision: https://reviews.llvm.org/D40054 llvm-svn: 318234
* [PGO] Detect more structural changes with the stable hash | Vedant Kumar | 2017-11-14 | 1 | -13/+153
| Lifting from Bob Wilson's notes: The hash value that we compute and store in PGO profile data to detect out-of-date profiles does not include enough information. This means that many significant changes to the source will not cause compiler warnings about the profile being out of date, and worse, we may continue to use the outdated profile data to make bad optimization decisions. There is some tension here because some source changes won't affect PGO and we don't want to invalidate the profile unnecessarily.
|
| This patch adds a new hashing scheme which is more sensitive to loop nesting, conditions, and out-of-order control flow. Here are examples which show snippets which get the same hash under the current scheme, and different hashes under the new scheme:
|
| Loop Nesting Example
| --------------------
|
|   // Snippet 1
|   while (foo()) {
|     while (bar()) {}
|   }
|
|   // Snippet 2
|   while (foo()) {}
|   while (bar()) {}
|
| Condition Example
| -----------------
|
|   // Snippet 1
|   if (foo())
|     bar();
|   baz();
|
|   // Snippet 2
|   if (foo())
|     bar();
|   else
|     baz();
|
| Out-of-order Control Flow Example
| ---------------------------------
|
|   // Snippet 1
|   while (foo()) {
|     if (bar()) {}
|     baz();
|   }
|
|   // Snippet 2
|   while (foo()) {
|     if (bar())
|       continue;
|     baz();
|   }
|
| In each of these cases, it's useful to differentiate between the snippets because swapping their profiles gives bad optimization hints.
|
| The new hashing scheme considers some logical operators in an effort to detect more changes in conditions. This isn't a perfect scheme. E.g, it does not produce the same hash for these equivalent snippets:
|
|   // Snippet 1
|   bool c = !a || b;
|   if (d && e) {}
|
|   // Snippet 2
|   bool f = d && e;
|   bool c = !a || b;
|   if (f) {}
|
| This would require an expensive data flow analysis. Short of that, the new hashing scheme looks reasonably complete, based on a scan over the statements we place counters on.
|
| Profiles which use the old version of the PGO hash remain valid and can be used without issue (there are tests in tree which check this).
|
| rdar://17068282
| Differential Revision: https://reviews.llvm.org/D39446
| llvm-svn: 318229
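To see why a hash over statement kinds alone cannot tell the two loop-nesting snippets apart, a toy model helps. The sketch below is not Clang's PGO hash: the `Stmt` tree, the `Kind` enumeration, the FNV-1a style mixing, and the scope-end marker are all assumptions made for illustration. It only shows that feeding a marker into the hash when a statement's body ends is enough to separate nested loops from sequential ones.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical statement kinds for a toy AST (not Clang's real hash types).
enum class Kind : std::uint8_t { While = 1, If = 2, EndOfScope = 3 };

// Toy statement: a kind plus nested statements (needs C++17 for the
// recursive std::vector member).
struct Stmt {
  Kind kind;
  std::vector<Stmt> body;
};

// FNV-1a style mixing step over one statement kind.
std::uint64_t mix(std::uint64_t h, Kind k) {
  h ^= static_cast<std::uint64_t>(k);
  return h * 1099511628211ULL;
}

// Pre-order walk over the statements. With markScopeEnd == false only the
// sequence of kinds is hashed, so nesting is invisible. With
// markScopeEnd == true an extra marker is mixed in after each body.
std::uint64_t hashStmts(const std::vector<Stmt> &stmts, bool markScopeEnd,
                        std::uint64_t h = 14695981039346656037ULL) {
  for (const Stmt &s : stmts) {
    h = mix(h, s.kind);
    h = hashStmts(s.body, markScopeEnd, h);
    if (markScopeEnd)
      h = mix(h, Kind::EndOfScope);
  }
  return h;
}

// Snippet 1: while (foo()) { while (bar()) {} }
std::vector<Stmt> nestedLoops() {
  return {Stmt{Kind::While, {Stmt{Kind::While, {}}}}};
}

// Snippet 2: while (foo()) {} while (bar()) {}
std::vector<Stmt> sequentialLoops() {
  return {Stmt{Kind::While, {}}, Stmt{Kind::While, {}}};
}
```

Without the marker both snippets reduce to the sequence While, While and collide; with it, the sequences become While, While, End, End and While, End, While, End, which hash differently.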
* Switch -mcount and -finstrument-functions to emit EnterExitInstrumenter attributes | Hans Wennborg | 2017-11-14 | 2 | -37/+18
| | | | | | | | | | | | | This updates -mcount to use the new attribute names (LLVM r318195), and switches over -finstrument-functions to also use these attributes rather than inserting instrumentation in the frontend. It also adds a new flag, -finstrument-functions-after-inlining, which makes the cygprofile instrumentation get inserted after inlining rather than before. Differential Revision: https://reviews.llvm.org/D39331 llvm-svn: 318199
* [NewPassManager] Pass the -fdebug-pass-manager flag setting into the Analysis managers to match what we do in opt | Craig Topper | 2017-11-14 | 1 | -4/+4
| | | | | | | | | | | | | Summary: Currently the -fdebug-pass-manager flag for clang doesn't enable the debug logging in the analysis managers. This is different than what the switch does when passed to opt. Reviewers: chandlerc Reviewed By: chandlerc Subscribers: cfe-commits Differential Revision: https://reviews.llvm.org/D40007 llvm-svn: 318140
* [PM] Wire up support for the bounds checking sanitizer with the new PM. | Chandler Carruth | 2017-11-14 | 1 | -0/+14
| | | | | | | | | | | | | | | Not much interesting here. Mostly wiring things together. One thing worth noting is that the approach is substantially different from the old PM. Here, the -O0 case works fundamentally differently in that we just directly build the pipeline without any callbacks or other cruft. In some ways, this is nice and clean. However, I don't like that it causes the sanitizers to be enabled with different changes at different times. =/ Suggestions for a better way to do this are welcome. Differential Revision: https://reviews.llvm.org/D39085 llvm-svn: 318131
* [PM] Add a missing header that I had in the next commit but was needed in r318128. | Chandler Carruth | 2017-11-14 | 1 | -0/+1
| | | | | Should fix the build. llvm-svn: 318130
* [PM] Port BoundsChecking to the new PM. | Chandler Carruth | 2017-11-14 | 1 | -1/+1
| | | | | | | | | | | Registers it and everything, updates all the references, etc. Next patch will add support to Clang's `-fexperimental-new-pass-manager` path to actually enable BoundsChecking correctly. Differential Revision: https://reviews.llvm.org/D39084 llvm-svn: 318128
* OpenCL: Assume inline asm is convergent | Matt Arsenault | 2017-11-13 | 1 | -4/+5
| | | | | | Already done for CUDA. llvm-svn: 318098
* [CodeGen] fix const-ness of cbrt and fma | Sanjay Patel | 2017-11-13 | 1 | -9/+5
| | | | | | | | | | | cbrt() is always constant because it can't overflow or underflow. Therefore, it can't set errno. fma() is not always constant because it can overflow or underflow. Therefore, it can set errno. But we know that it never sets errno on GNU / MSVC, so make it constant in those environments. Differential Revision: https://reviews.llvm.org/D39641 llvm-svn: 318093
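The distinction drawn here can be probed at run time. The following is a small sketch with hypothetical helper names (`errnoAfterCbrt`, `fmaOverflowsToInfinity`); it observes only what the C library does on the host and says nothing about how Clang models these builtins internally.

```cpp
#include <cassert>
#include <cerrno>
#include <cfloat>
#include <cmath>

// Hypothetical helper: value of errno after calling cbrt on x.
// The cube root of any finite double is finite, so no range error occurs.
int errnoAfterCbrt(double x) {
  volatile double input = x;  // volatile keeps the call from being folded
  errno = 0;
  volatile double result = std::cbrt(input);
  (void)result;
  return errno;
}

// Hypothetical helper: fma can overflow, here DBL_MAX * 2 + 0 under the
// default rounding mode. Whether errno is also set depends on the libm.
bool fmaOverflowsToInfinity() {
  volatile double big = DBL_MAX;
  volatile double result = std::fma(big, 2.0, 0.0);
  return std::isinf(result);
}
```

The overflow in `fmaOverflowsToInfinity` is what makes fma a candidate for setting errno in principle; the commit message states that GNU and MSVC environments never do so, which is why the builtin can be treated as constant there.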
* [clang] Remove redundant return [NFC] | Mandeep Singh Grang | 2017-11-13 | 1 | -1/+0
| | | | | | | | | | | | | | Reviewers: rsmith, sfantao, mcrosier Reviewed By: mcrosier Subscribers: jholewinski, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D39915 llvm-svn: 318074
* [ThinLTO] Handle -fdebug-pass-manager for backend invocations via clang | Teresa Johnson | 2017-11-13 | 1 | -0/+1
| | | | | | | | | | | | | Recommit of r317951 and r317952 along with what I believe should fix the remaining buildbot failures - the target triple should be specified for both the ThinLTO pre-thinlink compile and backend (post-thinlink) compile to ensure it is consistent. Original description: The LTO Config field wasn't being set when invoking a ThinLTO backend via clang (i.e. for distributed builds). llvm-svn: 318042
* [coroutines] Promote cleanup.dest.slot allocas to registers to avoid storing it in the coroutine frame | Gor Nishanov | 2017-11-11 | 2 | -17/+35
| | | | | | | | | | | | | | | | | | | | | Summary: We don't want to store the cleanup dest slot in the coroutine frame (as some of the cleanup code may access it after the coroutine frame is destroyed). This is an alternative to https://reviews.llvm.org/D37093. It is possible to do this for all functions, but a cursory check showed that at -O0 we get slightly longer functions (by 1-3 instructions), so we limit cleanup.dest.slot elimination to coroutines. Reviewers: rjmccall, hfinkel, eric_niebler Reviewed By: eric_niebler Subscribers: EricWF, cfe-commits Differential Revision: https://reviews.llvm.org/D39768 llvm-svn: 317981
* Add CLANG_DEFAULT_OBJCOPY to allow Clang to use llvm-objcopy for dwarf fission | Jake Ehrlich | 2017-11-11 | 2 | -41/+41
| | | | | | | | | | | | llvm-objcopy is getting to where it can be used in non-trivial ways (such as for dwarf fission in clang). It now supports dwarf fission but this feature hasn't been thoroughly tested yet. This change allows people to optionally build clang to use llvm-objcopy rather than GNU objcopy. By default GNU objcopy is still used so nothing should change. Differential Revision: https://reviews.llvm.org/D39029 llvm-svn: 317960
* Revert "[ThinLTO] Handle -fdebug-pass-manager for backend invocations via clang" | Teresa Johnson | 2017-11-11 | 1 | -1/+0
| | | | | | | This reverts commit r317951 and r317952. The new test is aborting on some bots and I'll need to investigate later. llvm-svn: 317959
* [ThinLTO] Handle -fdebug-pass-manager for backend invocations via clang | Teresa Johnson | 2017-11-10 | 1 | -0/+1
| | | | | | | | | | | | | | Summary: The LTO Config field wasn't being set when invoking a ThinLTO backend via clang (i.e. for distributed builds). Reviewers: danielcdh Subscribers: mehdi_amini, inglorion, eraman, cfe-commits Differential Revision: https://reviews.llvm.org/D39923 llvm-svn: 317951
* Remove declaration of EmitMCountInstrumentation(). NFC | Hans Wennborg | 2017-11-10 | 1 | -3/+0
| | | | | | The definition was removed in r280355. llvm-svn: 317944
* [OPENMP] Codegen for `#pragma omp target parallel for simd`. | Alexey Bataev | 2017-11-09 | 4 | -12/+47
| | | | | | Added codegen for `#pragma omp target parallel for simd` and clauses. llvm-svn: 317813
* Fix a bug with the use of __builtin_bzero in a conditional expression. | John McCall | 2017-11-09 | 1 | -1/+1
| | | | | | Patch by Bharathi Seshadri! llvm-svn: 317776
* [Coverage] Emit deferred regions in headers | Vedant Kumar | 2017-11-09 | 1 | -3/+5
| | | | | | | | | | | There are some limitations with emitting regions in macro expansions because we don't gather file IDs within the expansions. Fix the check that prevents us from emitting deferred regions in expansions to make an exception for headers, which is something we can handle. rdar://35373009 llvm-svn: 317760
* [Coverage] Complete top-level deferred regions before labels | Vedant Kumar | 2017-11-09 | 1 | -3/+38
| The area immediately after a terminated region in the function top-level should have the same count as the label it precedes.
|
| This solves another problem with wrapped segments. Consider:
|
|   1| a:
|   2|   return 0;
|   3| b:
|   4|   return 1;
|
| Without a gap area starting after the first return, the wrapped segment from line 2 would make it look like line 3 is executed, when it's not.
|
| rdar://35373009
| llvm-svn: 317759
* [Coverage] Emit a gap area after if conditions | Vedant Kumar | 2017-11-09 | 1 | -1/+26
| The area immediately after the closing right-paren of an if condition should have a count equal to the 'then' block's count. Use a gap region to set this count, so that region highlighting for the 'then' block remains precise.
|
| This solves a problem we have with wrapped segments. Consider:
|
|   1| if (false)
|   2|   foo();
|
| Without a gap area starting after the condition, the wrapped segment from line 1 would make it look like line 2 is executed, when it's not.
|
| rdar://35373009
| llvm-svn: 317758
* [OPENMP] Codegen for `#pragma omp target parallel for`. | Alexey Bataev | 2017-11-08 | 4 | -9/+43
| | | | llvm-svn: 317719
* [NVPTX] Implement __nvvm_atom_add_gen_d builtin. | Justin Lebar | 2017-11-07 | 1 | -0/+10
| | | | | | | | | | | | | | | Summary: This just seems to have been an oversight. We already supported the f64 atomic add with an explicit scope (e.g. "cta"), but not the scopeless version. Reviewers: tra Subscribers: jholewinski, sanjoy, cfe-commits, llvm-commits, hiraditya Differential Revision: https://reviews.llvm.org/D39638 llvm-svn: 317623
* New clang option -fno-plt which avoids the PLT and lazy binding while making external calls. | Sriraman Tallam | 2017-11-07 | 1 | -0/+10
| | | | | Differential Revision: https://reviews.llvm.org/D39079 llvm-svn: 317605
* Clang/libomptarget map interface flag renaming - NFC patch | George Rokos | 2017-11-07 | 1 | -41/+38
| | | | | | | | | | This patch renames some of the flag names of the clang/libomptarget map interface. The old names are slightly misleading, whereas the new ones describe in a better way what each flag is about. Only the macros within the enumeration are renamed; there is no change in functionality, so there are no updated regression tests. Differential Revision: https://reviews.llvm.org/D39745 llvm-svn: 317598
* SystemZ Swift TargetInfo: swifterror support in the backend is broken | Arnold Schwaighofer | 2017-11-07 | 1 | -1/+1
| | | | | | Return false for swifterror support until the backend is fixed. llvm-svn: 317589
* [X86] Replace the mask cmpeq/cmple/cmplt/cmpgt/cmpge/cmpneq intrinsics with macros that just pass the right comparison predicate value to the regular cmp intrinsic. | Craig Topper | 2017-11-06 | 1 | -26/+0
| | | | | Remove mask cmpeq/cmpgt builtins that are now unused. This shortens the intrinsic headers a little and allows us to get rid of the cmpeq and cmpgt handling from CGBuiltin.cpp. llvm-svn: 317506
* [CodeGen] match new fast-math-flag method: isFast() | Sanjay Patel | 2017-11-06 | 1 | -1/+1
| | | | | | | | This corresponds to LLVM commit r317488: If that commit is reverted, this commit will also need to be reverted. llvm-svn: 317489
* CodeGenCXX: no default dllimport storage for mingw | Martell Malone | 2017-11-04 | 1 | -1/+2
| GNU frontends don't have options like /MT, /MD.
| This fixes a few link error regressions with libc++ and libc++abi.
|
| Reviewers: rnk, mstorsjo, compnerd
| Differential Revision: https://reviews.llvm.org/D33620
| llvm-svn: 317398
* [c++17] Visit class template explicit specializations just like all other class definitions in codegen. | Richard Smith | 2017-11-03 | 1 | -9/+7
| | | | | If an explicit specialization has a static data member, it may be a definition and we may need to register it for emission. llvm-svn: 317296
* [OPENMP] Fix PR35152: Do not use getInvokeDest() function for EH checks. | Alexey Bataev | 2017-11-02 | 1 | -1/+2
| | | | | | | The compiler may crash under some conditions if getInvokeDest() is called but its result is not used afterwards. Fixed this problem in OpenMP. llvm-svn: 317227
* [OPENMP] Fix PR35156: Get correct thread id with windows exceptions. | Alexey Bataev | 2017-11-02 | 1 | -6/+8
| | | | | | | If the thread id is requested in windows mode within funclets, we may generate an incorrect function call that could lead to broken codegen. llvm-svn: 317208
* CodeGen: simplify EH personality selection (NFC) | Saleem Abdulrasool | 2017-11-02 | 1 | -8/+9
| | | | | | | Fix a typo in the comment, reorder to ensure that the ordering matches across the ObjC/ObjC++ cases. NFCI. llvm-svn: 317146
* Fix for PR33930. Short-circuit metadata mapping when cloning a varargs thunk. | Wolfgang Pieb | 2017-10-31 | 1 | -1/+33
| | | | | | | | | The cloning happens before all metadata nodes are resolved. Prevent the value mapper from running into unresolved or temporary MD nodes. Differential Revision: https://reviews.llvm.org/D39396 llvm-svn: 317047
* [CFI] Add CFI-icall pointer type generalization | Vlad Tsyrklevich | 2017-10-31 | 3 | -2/+69
| | | | | | | | | | | | | | | | | | | | | | | Summary: This change allows generalizing pointers in type signatures used for cfi-icall by enabling the -fsanitize-cfi-icall-generalize-pointers flag. This works by 1) emitting an additional generalized type signature metadata node for functions and 2) llvm.type.test()ing for the generalized type for translation units with the flag specified. This flag is incompatible with -fsanitize-cfi-cross-dso because it would require emitting twice as many type hashes which would increase artifact size. Reviewers: pcc, eugenis Reviewed By: pcc Subscribers: kcc Differential Revision: https://reviews.llvm.org/D39358 llvm-svn: 317044
* [CodeGen] map sqrt libcalls to llvm.sqrt when errno is not set | Sanjay Patel | 2017-10-31 | 1 | -16/+13
| | | | | | | | | | | | | | | | | The LLVM sqrt intrinsic definition changed with: D28797 ...so we don't have to use any relaxed FP settings other than errno handling. This patch sidesteps a question raised in PR27435: https://bugs.llvm.org/show_bug.cgi?id=27435 Is a programmer using __builtin_sqrt() invoking the compiler's intrinsic definition of sqrt or the mathlib definition of sqrt? But we have an answer now: the builtin should match the behavior of the libm function including errno handling. Differential Revision: https://reviews.llvm.org/D39204 llvm-svn: 317031
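The errno behaviour that this change preserves can be checked against the host C library. The sketch below uses hypothetical helper names (`libmReportsErrorsViaErrno`, `sqrtOfNegativeSetsEdom`) and assumes a default build, without -ffast-math or -fno-math-errno; it shows only the library-side contract that the builtin is meant to match.

```cpp
#include <cassert>
#include <cerrno>
#include <cmath>

// Hypothetical helper: true if this C library reports math errors via errno.
bool libmReportsErrorsViaErrno() {
  return (math_errhandling & MATH_ERRNO) != 0;
}

// Hypothetical helper: call sqrt on a negative value and report whether
// errno was set to EDOM. The volatile operand keeps the call from being
// folded away at compile time.
bool sqrtOfNegativeSetsEdom() {
  volatile double negative = -1.0;
  errno = 0;
  volatile double result = std::sqrt(negative);
  (void)result;
  return errno == EDOM;
}

// Hypothetical helper: the result of a domain error is a NaN either way.
bool sqrtOfNegativeIsNan() {
  volatile double negative = -1.0;
  return std::isnan(std::sqrt(negative));
}
```

When errno reporting is in effect, replacing the call with an intrinsic that never writes errno would change observable behaviour, which is consistent with the title's condition that the mapping applies only when errno is not set.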
* [CodeGen] Propagate may-alias'ness of lvalues with TBAA info | Ivan A. Kosarev | 2017-10-31 | 10 | -112/+173
| | | | | | | | | | | | | This patch fixes various places in clang to propagate may-alias TBAA access descriptors during construction of lvalues, thus eliminating the need for the LValueBaseInfo::MayAlias flag. This is part of D38126 reworked to be a separate patch to simplify review. Differential Revision: https://reviews.llvm.org/D39008 llvm-svn: 316988
* CodeGen: Fix insertion position of addrspace cast for alloca | Yaxun Liu | 2017-10-30 | 1 | -1/+5
| | | | | | | | | | | | | | | | | | | | For a non-zero alloca addr space, the alloca is usually cast to the default addr space immediately. For non-vla, the alloca is inserted at AllocaInsertPt, therefore the addr space cast should also be inserted at AllocaInsertPt. However, for vla, the alloca is inserted at the current insertion point of IRBuilder, therefore the addr space cast should also be inserted at the current insertion point of IRBuilder. Currently clang always inserts the addr space cast at AllocaInsertPt, which causes invalid IR. This patch fixes that. Differential Revision: https://reviews.llvm.org/D39374 llvm-svn: 316909
* [CodeGen] Generate TBAA info for reference loads | Ivan A. Kosarev | 2017-10-30 | 5 | -58/+62
| | | | | | Differential Revision: https://reviews.llvm.org/D39177 llvm-svn: 316896
* Replace a few usages of llvm::join with range-version [NFC] | Erich Keane | 2017-10-27 | 1 | -3/+3
| | | | | | | | I noticed a few usages of llvm::join that were using begin/end rather than just the range version. This patch just replaces those. llvm-svn: 316784