summaryrefslogtreecommitdiffstats
path: root/clang/lib/CodeGen
Commit message (Collapse)AuthorAgeFilesLines
* Relax the rules around objc_alloc and objc_alloc_init optimizations.Pierre Habouzit2020-01-141-21/+19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Today the optimization is limited to: - `[ClassName alloc]` - `[self alloc]` when within a class method However it means that when code is written this way: ``` @interface MyObject - (id)copyWithZone:(NSZone *)zone { return [[self.class alloc] _initWith...]; } @end ``` ... then the optimization doesn't kick in and `+[NSObject alloc]` ends up in IMP caches where it could have been avoided. It turns out that `+alloc` -> `+[NSObject alloc]` is the most cached SEL/IMP pair in the entire platform which is rather silly). There's two theoretical risks allowing this optimization: 1. if the receiver is nil (which it can't be today), but it turns out that `objc_alloc()`/`objc_alloc_init()` cope with a nil receiver, 2. if the `Clas` type for the receiver is a lie. However, for such a code to work today (and not fail witn an unrecognized selector anyway) you'd have to have implemented the `-alloc` **instance method**. Fortunately, `objc_alloc()` doesn't assume that the receiver is a Class, it basically starts with a test that is similar to `if (receiver->isa->bits & hasDefaultAWZ) { /* fastpath */ }`. This bit is only set on metaclasses by the runtime, so if an instance is passed to this function by accident, its isa will fail this test, and `objc_alloc()` will gracefully fallback to `objc_msgSend()`. The one thing `objc_alloc()` doesn't support is tagged pointer instances. None of the tagged pointer classes implement an instance method called `'alloc'` (actually there's a single class in the entire Apple codebase that has such a method). Differential Revision: https://reviews.llvm.org/D71682 Radar-Id: rdar://problem/58058316 Reviewed-By: Akira Hatanaka Signed-off-by: Pierre Habouzit <phabouzit@apple.com>
* [X86] ABI compat bugfix for MSVC vectorcallReid Kleckner2020-01-141-75/+61
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Before this change, X86_32ABIInfo::classifyArgument would be called twice on vector arguments to vectorcall functions. This function has side effects to track GPR register usage, and this would lead to incorrect GPR usage in some cases. The specific case I noticed is from running out of XMM registers with mixed FP and vector arguments and no aggregates of any kind. Consider this prototype: void __vectorcall vectorcall_indirect_vec( double xmm0, double xmm1, double xmm2, double xmm3, double xmm4, __m128 xmm5, __m128 ecx, int edx, __m128 mem); classifyArgument has no effects when called on a plain FP type, but when called on a vector type, it modifies FreeRegs to model GPR consumption. However, this should not happen during the vector call first pass. I refactored the code to unify vectorcall HVA logic with regcall HVA logic. The conventions pass HVAs in registers differently (expanded vs. not expanded), but if they do not fit in registers, they both pass them indirectly by address. Reviewers: erichkeane, craig.topper Subscribers: cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D72110
* [remark][diagnostics] Using clang diagnostic handler for IR input filesRong Xu2020-01-141-17/+76
| | | | | | | | | | | | | | | | | | | | | | | | For IR input files, we currently use LLVM diagnostic handler even the compilation is from clang. As a result, we are not able to use -Rpass to get the transformation reports. Some warnings are not handled properly either: We found many mysterious warnings in our ThinLTO backend compilations in SamplePGO and CSPGO. An example of the warning: "warning: net/proto2/public/metadata_lite.h:51:21: 0.02% (1 / 4999)" This turns out to be a warning by Wmisexpect, which is supposed to be filtered out by default. But since the filter is in clang's diagnostic hander, we emit these incomplete warnings from LLVM's diagnostic handler. This patch uses clang diagnostic handler for IR input files. We create a fake backendconsumer just to install the diagnostic handler. With this change, we will have proper handling of all the warnings and we can use -Rpass* options in IR input files compilation. Also note that with is patch, LLVM's diagnostic options, like "-mllvm -pass-remarks=*", are no longer be able to get optimization remarks. Differential Revision: https://reviews.llvm.org/D72523
* [OPENMP]Do not emit special virtual function for NVPTX target.Alexey Bataev2020-01-141-1/+6
| | | | | | There are no special virtual function handlers (like __cxa_pure_virtual) defined for NVPTX target, so just emit such functions as null pointers to prevent issues with linking and unresolved references.
* [DebugInfo] Add option to clang to limit debug info that is emitted for classes.Amy Huang2020-01-141-0/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This patch adds an option to limit debug info by only emitting complete class type information when its constructor is emitted. This applies to classes that have nontrivial user defined constructors. I implemented the option by adding another level to `DebugInfoKind`, and a flag `-flimit-debug-info-constructor`. Total object file size on Windows, compiling with RelWithDebInfo: before: 4,257,448 kb after: 2,104,963 kb And on Linux before: 9,225,140 kb after: 4,387,464 kb According to the Windows clang.pdb files, here is a list of types that are no longer complete with this option enabled: https://reviews.llvm.org/P8182 Reviewers: rnk, dblaikie Subscribers: aprantl, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D72427
* Fix "pointer is null" static analyzer warnings. NFCI.Simon Pilgrim2020-01-141-4/+2
| | | | Use castAs<> instead of getAs<> since the pointer is dereferenced immediately below and castAs will perform the null assertion for us.
* Make helper functions static or move them into anonymous namespaces. NFC.Benjamin Kramer2020-01-142-2/+6
|
* [RISCV] Fix ILP32D lowering for double+double/double+int return typesJames Clarke2020-01-141-5/+15
| | | | | | | | | | | | | | | | | | Summary: Previously, since these aggregates are > 2*XLen, Clang would think they were being returned indirectly and thus would decrease the number of available GPRs available by 1. For long argument lists this could lead to a struct argument incorrectly being passed indirectly. Reviewers: asb, lenary Reviewed By: asb, lenary Subscribers: luismarques, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, kito-cheng, shiva0217, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, rkruppe, PkmX, jocewei, psnobl, benna, Jim, lenary, s.egerton, pzheng, sameer.abuasal, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D69590
* [DebugInfo] Add another level to DebugInfoKind called ConstructorAmy Huang2020-01-137-35/+28
| | | | | | | | | The option will limit debug info by only emitting complete class type information when its constructor is emitted. This patch changes comparisons with LimitedDebugInfo to use the new level instead. Differential Revision: https://reviews.llvm.org/D72427
* [ItaniumCXXABI] Make tls wrappers properly comdatMartin Storsjö2020-01-131-0/+3
| | | | | | | | | | | | Just marking a symbol as weak_odr/linkonce_odr isn't enough for actually tolerating multiple copies of it at linking on windows, it has to be made a proper comdat; make it comdat for all platforms for consistency. This should hopefully fix https://bugzilla.mozilla.org/show_bug.cgi?id=1566288. Differential Revision: https://reviews.llvm.org/D71572
* Implement VectorType conditional operator GNU extension.Erich Keane2020-01-131-0/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | GCC supports the conditional operator on VectorTypes that acts as a 'select' in C++ mode. This patch implements the support. Types are converted as closely to GCC's behavior as possible, though in a few places consistency with our existing vector type support was preferred. Note that this implementation is different from the OpenCL version in a number of ways, so it unfortunately required a different implementation. First, the SEMA rules and promotion rules are significantly different. Secondly, GCC implements COND[i] != 0 ? LHS[i] : RHS[i] (where i is in the range 0- VectorSize, for each element). In OpenCL, the condition is COND[i] < 0 ? LHS[i]: RHS[i]. In the process of implementing this, it was also required to make the expression COND ? LHS : RHS type dependent if COND is type dependent, since the type is now dependent on the condition. For example: T ? 1 : 2; Is not typically type dependent, since the result can be deduced from the operands. HOWEVER, if T is a VectorType now, it could change this to a 'select' (basically a swizzle with a non-constant mask) with the 1 and 2 being promoted to vectors themselves. While this is a change, it is NOT a standards incompatible change. Based on my (and D. Gregor's, at the time of writing the code) reading of the standard, the expression is supposed to be type dependent if ANY sub-expression is type dependent. Differential Revision: https://reviews.llvm.org/D71463
* This option allows selecting the TLS size in the local exec TLS model,KAWASHIMA Takahiro2020-01-131-0/+1
| | | | | | | | | | | | | | | | | | which is the default TLS model for non-PIC objects. This allows large/ many thread local variables or a compact/fast code in an executable. Specification is same as that of GCC. For example, the code model option precedes the TLS size option. TLS access models other than local-exec are not changed. It means supoort of the large code model is only in the local exec TLS model. Patch By KAWASHIMA Takahiro (kawashima-fj <t-kawashima@fujitsu.com>) Reviewers: dmgreen, mstorsjo, t.p.northover, peter.smith, ostannard Reviewd By: peter.smith Committed by: peter.smith Differential Revision: https://reviews.llvm.org/D71688
* Revert "[DWARF5][clang]: Added support for DebugInfo generation for auto ↵Sam McCall2020-01-132-21/+11
| | | | | | | | | return type for C++ member functions." This reverts commit 6d6a4590c5d4c7fc7445d72fe685f966b0a8cafb, which introduces a crash. See https://reviews.llvm.org/D70524 for details.
* [DWARF5][clang]: Added support for DebugInfo generation for auto return type ↵Awanish Pandey2020-01-132-11/+21
| | | | | | | | | | | | | | | | | for C++ member functions. Summary: This patch will provide support for auto return type for the C++ member functions. This patch includes clang side implementation of this feature. Patch by: Awanish Pandey <Awanish.Pandey@amd.com> Reviewers: dblaikie, aprantl, shafik, alok, SouraVX, jini.susan.george Reviewed by: dblaikie Differential Revision: https://reviews.llvm.org/D70524
* Fix "pointer is null" static analyzer warning. NFCI.Simon Pilgrim2020-01-111-1/+1
| | | | Use cast<> instead of dyn_cast<> since we know that the pointer should be valid (and is dereferenced immediately).
* Fix "pointer is null" static analyzer warnings. NFCI.Simon Pilgrim2020-01-113-17/+10
| | | | Use castAs<> instead of getAs<> since the pointer is dereferenced immediately below and castAs will perform the null assertion for us.
* Only destroy static locals if they have non-trivial destructors.Richard Smith2020-01-101-1/+2
| | | | | | | This fixes a regression introduced in 2b4fa5348ee157b6b1a1af44d0137ca8c7a71573 that caused us to emit shutdown-time destruction for variables with ARC ownership, using C++-specific functions that don't exist in C implementations.
* [Driver][CodeGen] Add -fpatchable-function-entry=N[,0]Fangrui Song2020-01-101-0/+3
| | | | | | | | | | | In the backend, this feature is implemented with the function attribute "patchable-function-entry". Both the attribute and XRay use TargetOpcode::PATCHABLE_FUNCTION_ENTER, so the two features are incompatible. Reviewed By: ostannard, MaskRay Differential Revision: https://reviews.llvm.org/D72222
* Support function attribute patchable_function_entryFangrui Song2020-01-101-1/+7
| | | | | | | | | This feature is generic. Make it applicable for AArch64 and X86 because the backend has only implemented NOP insertion for AArch64 and X86. Reviewed By: nickdesaulniers, aaron.ballman Differential Revision: https://reviews.llvm.org/D72221
* Fix "pointer is null" static analyzer warning. NFCI.Simon Pilgrim2020-01-101-1/+1
| | | | Use castAs<> instead of getAs<> since the pointer is dereferenced immediately below and castAs will perform the null assertion for us.
* Add support for __declspec(guard(nocf))Andrew Paverd2020-01-101-0/+11
| | | | | | | | | | | | | | | | | | Summary: Avoid using the `nocf_check` attribute with Control Flow Guard. Instead, use a new `"guard_nocf"` function attribute to indicate that checks should not be added on indirect calls within that function. Add support for `__declspec(guard(nocf))` following the same syntax as MSVC. Reviewers: rnk, dmajor, pcc, hans, aaron.ballman Reviewed By: aaron.ballman Subscribers: aaron.ballman, tomrittervg, hiraditya, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D72167
* [FPEnv] Generate constrained FP comparisons from clangUlrich Weigand2020-01-101-11/+17
| | | | | | | | | | | | | | | | | | | | | Update the IRBuilder to generate constrained FP comparisons in CreateFCmp when IsFPConstrained is true, similar to the other places in the IRBuilder. Also, add a new CreateFCmpS to emit signaling FP comparisons, and use it in clang where comparisons are supposed to be signaling (currently, only when emitting code for the <, <=, >, >= operators). Note that there is currently no way to add fast-math flags to a constrained FP comparison, since this is implemented as an intrinsic call that returns a boolean type, and FMF are only allowed for calls returning a floating-point type. However, given the discussion around https://bugs.llvm.org/show_bug.cgi?id=42179, it seems that FCmp itself really shouldn't have any FMF either, so this is probably OK. Reviewed by: craig.topper Differential Revision: https://reviews.llvm.org/D71467
* Allow system header to provide their own implementation of some builtinserge-sans-paille2020-01-102-1/+13
| | | | | | | | | | | | | | | | | | If a system header provides an (inline) implementation of some of their function, clang still matches on the function name and generate the appropriate llvm builtin, e.g. memcpy. This behavior is in line with glibc recommendation « users may not provide their own version of symbols » but doesn't account for the fact that glibc itself can provide inline version of some functions. It is the case for the memcpy function when -D_FORTIFY_SOURCE=1 is on. In that case an inline version of memcpy calls __memcpy_chk, a function that performs extra runtime checks. Clang currently ignores the inline version and thus provides no runtime check. This code fixes the issue by detecting functions whose name is a builtin name but also have an inline implementation. Differential Revision: https://reviews.llvm.org/D71082
* [ThinLTO] Pass CodeGenOpts like UnrollLoops/VectorizeLoop/VectorizeSLPWei Mi2020-01-091-0/+6
| | | | | | | | | | | | | down to pass builder in ltobackend. Currently CodeGenOpts like UnrollLoops/VectorizeLoop/VectorizeSLP in clang are not passed down to pass builder in ltobackend when new pass manager is used. This is inconsistent with the behavior when new pass manager is used and thinlto is not used. Such inconsistency causes slp vectorization pass not being enabled in ltobackend for O3 + thinlto right now. This patch fixes that. Differential Revision: https://reviews.llvm.org/D72386
* Add builtins for aligning and checking alignment of pointers and integersAlex Richardson2020-01-092-0/+100
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This change introduces three new builtins (which work on both pointers and integers) that can be used instead of common bitwise arithmetic: __builtin_align_up(x, alignment), __builtin_align_down(x, alignment) and __builtin_is_aligned(x, alignment). I originally added these builtins to the CHERI fork of LLVM a few years ago to handle the slightly different C semantics that we use for CHERI [1]. Until recently these builtins (or sequences of other builtins) were required to generate correct code. I have since made changes to the default C semantics so that they are no longer strictly necessary (but using them does generate slightly more efficient code). However, based on our experience using them in various projects over the past few years, I believe that adding these builtins to clang would be useful. These builtins have the following benefit over bit-manipulation and casts via uintptr_t: - The named builtins clearly convey the semantics of the operation. While checking alignment using __builtin_is_aligned(x, 16) versus ((x & 15) == 0) is probably not a huge win in readably, I personally find __builtin_align_up(x, N) a lot easier to read than (x+(N-1))&~(N-1). - They preserve the type of the argument (including const qualifiers). When using casts via uintptr_t, it is easy to cast to the wrong type or strip qualifiers such as const. - If the alignment argument is a constant value, clang can check that it is a power-of-two and within the range of the type. Since the semantics of these builtins is well defined compared to arbitrary bit-manipulation, it is possible to add a UBSAN checker that the run-time value is a valid power-of-two. I intend to add this as a follow-up to this change. - The builtins avoids int-to-pointer casts both in C and LLVM IR. In the future (i.e. once most optimizations handle it), we could use the new llvm.ptrmask intrinsic to avoid the ptrtoint instruction that would normally be generated. - They can be used to round up/down to the next aligned value for both integers and pointers without requiring two separate macros. - In many projects the alignment operations are already wrapped in macros (e.g. roundup2 and rounddown2 in FreeBSD), so by replacing the macro implementation with a builtin call, we get improved diagnostics for many call-sites while only having to change a few lines. - Finally, the builtins also emit assume_aligned metadata when used on pointers. This can improve code generation compared to the uintptr_t casts. [1] In our CHERI compiler we have compilation mode where all pointers are implemented as capabilities (essentially unforgeable 128-bit fat pointers). In our original model, casts from uintptr_t (which is a 128-bit capability) to an integer value returned the "offset" of the capability (i.e. the difference between the virtual address and the base of the allocation). This causes problems for cases such as checking the alignment: for example, the expression `if ((uintptr_t)ptr & 63) == 0` is generally used to check if the pointer is aligned to a multiple of 64 bytes. The problem with offsets is that any pointer to the beginning of an allocation will have an offset of zero, so this check always succeeds in that case (even if the address is not correctly aligned). The same issues also exist when aligning up or down. Using the alignment builtins ensures that the address is used instead of the offset. While I have since changed the default C semantics to return the address instead of the offset when casting, this offset compilation mode can still be used by passing a command-line flag. Reviewers: rsmith, aaron.ballman, theraven, fhahn, lebedev.ri, nlopes, aqjune Reviewed By: aaron.ballman, lebedev.ri Differential Revision: https://reviews.llvm.org/D71499
* [Clang] Handle target-specific builtins returning aggregates.Simon Tatham2020-01-091-3/+23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: A few of the ARM MVE builtins directly return a structure type. This causes an assertion failure at code-gen time if you try to assign the result of the builtin to a variable, because the `RValue` created in `EmitBuiltinExpr` from the `llvm::Value` produced by codegen is always made by `RValue::get()`, which creates a non-aggregate `RValue` that will fail an assertion when `AggExprEmitter::withReturnValueSlot` calls `Src.getAggregatePointer()`. A similar failure occurs if you try to use the struct return value directly to extract one field, e.g. `vld2q(address).val[0]`. The existing code-gen tests for those MVE builtins pass the returned structure type directly to the C `return` statement, which apparently managed to avoid that particular code path, so we didn't notice the crash. Now `EmitBuiltinExpr` checks the evaluation kind of the builtin's return value, and does the necessary handling for aggregate returns. I've added two extra test cases, both of which crashed before this change. Reviewers: dmgreen, rjmccall Reviewed By: rjmccall Subscribers: kristof.beyls, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D72271
* Improve support of GNU mempcpyserge-sans-paille2020-01-091-2/+8
| | | | | | | - Lower to the memcpy intrinsic - Raise warnings when size/bounds are known Differential Revision: https://reviews.llvm.org/D71374
* [OPENMP]Remove unused code, NFC.Alexey Bataev2020-01-092-102/+2
|
* Fix "pointer is null" static analyzer warnings. NFCI.Simon Pilgrim2020-01-091-0/+2
| | | | Assert that the pointers are non-null before dereferencing them.
* Fix MSVC unhandled enum warning. NFCI.Simon Pilgrim2020-01-091-0/+1
|
* Fix "pointer is null" static analyzer warning. NFCI.Simon Pilgrim2020-01-081-2/+2
| | | | Use castAs<> instead of getAs<> since we know that the pointer will be valid (and is dereferenced immediately below).
* [OPENMP]Reduce calls for the mangled names.Alexey Bataev2020-01-072-8/+9
| | | | | | Use canonical decls instead of mangled names in the set of already emitted decls. This allows to reduce the number of function calls for getting declarations mangled names and speedup the compilation.
* [HIP] Add option --gpu-max-threads-per-block=nYaxun (Sam) Liu2020-01-071-2/+5
| | | | | | Add this option to change the default launch bounds. Differential Revision: https://reviews.llvm.org/D71221
* [NFC] Use isX86() instead of getArch()Jim Lin2020-01-072-6/+2
| | | | | | | | | | | | | | Summary: This is a clean up for https://reviews.llvm.org/D72247. Reviewers: MaskRay, craig.topper, jhenderson Reviewed By: MaskRay Subscribers: hiraditya, rupprecht, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D72320
* [CodeGen][ObjC] Push the properties of a protocol before pushing theAkira Hatanaka2020-01-061-3/+3
| | | | | | | | | | properties of the protocol it inherits This fixes a bug where the type string for a @dynamic property of an @implementation didn't have 'D' in it when the protocol it conforms to redeclares the property declared in the base protocol. rdar://problem/45503561
* Add Triple::isX86()Fangrui Song2020-01-061-2/+1
| | | | | | Reviewed By: craig.topper, skan Differential Revision: https://reviews.llvm.org/D72247
* [OPENMP50]Support lastprivate conditional updates in inc/dec unary ops.Alexey Bataev2020-01-065-13/+58
| | | | | Added support for checking of updates of variables used in unary pre(pos) inc/dec expressions.
* [OPENMP50]Codegen for lastprivate conditional list items.Alexey Bataev2020-01-026-3/+357
| | | | | | | | | | | | | | | | | | | | | | | | | Added codegen support for lastprivate conditional. According to the standard, if when the conditional modifier appears on the clause, if an assignment to a list item is encountered in the construct then the original list item is assigned the value that is assigned to the new list item in the sequentially last iteration or lexically last section in which such an assignment is encountered. We look for the assignment operations and check if the left side references lastprivate conditional variable. Then the next code is emitted: if (last_iv_a <= iv) { last_iv_a = iv; last_a = lp_a; } At the end the implicit barrier is generated to wait for the end of all threads and then in the check for the last iteration the private copy is assigned the last value. if (last_iter) { lp_a = last_a; // <--- new code a = lp_a; // <--- store of private value to the original variable. }
* [SystemZ] Use FNeg in s390x clang builtinsKevin P. Neal2020-01-021-9/+5
| | | | The s390x builtins are still using FSub instead of FNeg. Correct that.
* Generalize the pass registration mechanism used by Polly to any third-party toolserge_sans_paille2020-01-022-0/+9
| | | | | | | | | | | | | | | | | | | | | | | | There's quite a lot of references to Polly in the LLVM CMake codebase. However the registration pattern used by Polly could be useful to other external projects: thanks to that mechanism it would be possible to develop LLVM extension without touching the LLVM code base. This patch has two effects: 1. Remove all code specific to Polly in the llvm/clang codebase, replaicing it with a generic mechanism 2. Provide a generic mechanism to register compiler extensions. A compiler extension is similar to a pass plugin, with the notable difference that the compiler extension can be configured to be built dynamically (like plugins) or statically (like regular passes). As a result, people willing to add extra passes to clang/opt can do it using a separate code repo, but still have their pass be linked in clang/opt as built-in passes. Differential Revision: https://reviews.llvm.org/D61446
* [NFC] Fixes -Wrange-loop-analysis warningsMark de Wever2020-01-012-2/+2
| | | | | | This avoids new warnings due to D68912 adds -Wrange-loop-analysis to -Wall. Differential Revision: https://reviews.llvm.org/D71857
* [MC][TargetMachine] Delete MCTargetOptions::MCPIECopyRelocationsFangrui Song2020-01-011-1/+0
| | | | | | | | | | | | clang/lib/CodeGen/CodeGenModule performs the -mpie-copy-relocations check and sets dso_local on applicable global variables. We don't need to duplicate the work in TargetMachine shouldAssumeDSOLocal. Verified that -mpie-copy-relocations can still emit PC relative relocations for external variable accesses. clang -target x86_64 -fpie -mpie-copy-relocations -c => R_X86_64_PC32 clang -target aarch64 -fpie -mpie-copy-relocations -c => R_AARCH64_ADR_PREL_PG_HI21+R_AARCH64_LDST64_ABS_LO12_NC
* [OPENMP]Emit artificial threprivate vars as threadlocal, if possible.Alexey Bataev2019-12-311-2/+7
| | | | It may improve performance for declare reduction constructs.
* [CodeGen] Emit conj/conjf/confjl libcalls as fneg instructions if possible.Craig Topper2019-12-311-1/+4
| | | | | | | We already recognize the __builtin versions of these, might as well recognize the libcall version. Differential Revision: https://reviews.llvm.org/D72028
* [CodeGen] Use IRBuilder::CreateFNeg for __builtin_conjCraig Topper2019-12-301-6/+1
| | | | | | This replaces the fsub -0.0 idiom with an fneg instruction. We didn't see to have a test that showed the current codegen. Just some tests for constant folding and a test that was only checking the declare lines for libcalls. The latter just checked that we did not have a declare for @conj when using __builtin_conj. Differential Revision: https://reviews.llvm.org/D72012
* [CodeGen] Use CreateFNeg in buildFMulAddCraig Topper2019-12-301-11/+4
| | | | | | We have an fneg instruction now and should use it instead of the fsub -0.0 idiom. Looks like we had no test that showed that we handled the negation cases here so I've added new tests. Differential Revision: https://reviews.llvm.org/D72010
* [OpenMP] Use the OpenMPIRBuilder for `omp parallel`Johannes Doerfert2019-12-301-0/+95
| | | | | | | | | | | | This allows to use the OpenMPIRBuilder for parallel regions. Code was extracted from D61953 and adapted to work with the new version (D70109). All but one feature should be supported. An update of this patch will provide test coverage and privatization other than shared. Reviewed By: fghanim Differential Revision: https://reviews.llvm.org/D70290
* [OpenMP][NFCI] Use the libFrontend ProcBindKind in ClangJohannes Doerfert2019-12-264-30/+9
| | | | | | | | | | | | This removes the OpenMPProcBindClauseKind enum in favor of llvm::omp::ProcBindKind which lives in OpenMPConstants.h and was introduced in D70109. No change in behavior is expected. Reviewed By: JonChesterfield Differential Revision: https://reviews.llvm.org/D70289
* [OpenMP][IR-Builder] Introduce the finalization stackJohannes Doerfert2019-12-251-15/+51
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | As a permanent and generic solution to the problem of variable finalization (destructors, lastprivate, ...), this patch introduces the finalization stack. The objects on the stack describe (1) the (structured) regions the OpenMP-IR-Builder is currently constructing, (2) if these are cancellable, and (3) the callback that will perform the finalization (=cleanup) when necessary. As the finalization can be necessary multiple times, at different source locations, the callback takes the position at which code is currently generated. This position will also encode the destination of the "region exit" block *iff* the finalization call was issues for a region generated by the OpenMPIRBuilder. For regions generated through the old Clang OpenMP code geneneration, the "region exit" is determined by Clang inside the finalization call instead (see getOMPCancelDestination). As a first user, the parallel + cancel barrier interaction is changed. In contrast to the temporary solution before, the barrier generation in Clang does not need to be aware of the "CancelDestination" block. Instead, the finalization callback is and, as described above, later even that one does not need to be. D70109 will be updated to use this scheme. Reviewed By: ABataev Differential Revision: https://reviews.llvm.org/D70258
* [NFC] Remove some dead code from CGBuiltin.cpp.Kevin P. Neal2019-12-241-14/+0
|
OpenPOWER on IntegriCloud