summaryrefslogtreecommitdiffstats
path: root/clang/lib/CodeGen
Commit message (Collapse)AuthorAgeFilesLines
* [RISCV] Fix passing two floating-point values in complex separately by two ↵Jim Lin2020-06-251-1/+2
| | | | | | | | | | | | | | | | | | | | | | | GPRs on RV64 Summary: This patch fixed the error of counting the remaining FPRs. Complex floating-point values should be passed by two FPRs for the hard-float ABI. If no two FPRs are available, it should be passed via a 64-bit GPR (fp+fp). `ArgFPRsLeft` is only decreased one while the type is complex floating-point. It causes two floating-point values in the complex are passed separately by two GPRs. Reviewers: asb, luismarques, lenary Reviewed By: asb Subscribers: rbar, johnrusso, simoncook, sabuasal, niosHD, kito-cheng, shiva0217, jrtc27, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, rkruppe, PkmX, jocewei, psnobl, benna, s.egerton, pzheng, sameer.abuasal, apazos, evandro, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D79770 (cherry picked from commit 7ee479a760e0a4402b4eb7fb6168768a44f66945)
* Update compiler extension integration into the build systemserge-sans-paille2020-06-171-2/+1
| | | | | | | | | | | | The approach here is to create a new (empty) component, `Extensions', where all statically compiled extensions dynamically register their dependencies. That way we're more natively compatible with LLVMBuild and llvm-config. Fixes: https://bugs.llvm.org/show_bug.cgi?id=44870 Differential Revision: https://reviews.llvm.org/D78192 (cherry picked from commit 8f766e382b77eef3102798b49e087d1e4804b984)
* [CodeGen] fix inline builtin-related breakage from D78162George Burgess IV2020-05-071-3/+9
| | | | | | | | | | | In cases where we have multiple decls of an inline builtin, we may need to go hunting for the one with a definition when setting function attributes. An additional test-case was provided on https://github.com/ClangBuiltLinux/linux/issues/979 (cherry picked from commit 94908088a831141cfbdd15fc5837dccf30cfeeb6)
* [CodeGen] only add nobuiltin to inline builtins if we'll emit themGeorge Burgess IV2020-05-071-1/+2
| | | | | | | | | | | | | | There are some inline builtin definitions that we can't emit (isTriviallyRecursive & callers go into why). Marking these nobuiltin is only useful if we actually emit the body, so don't mark these as such unless we _do_ plan on emitting that. This suboptimality was encountered in Linux (see some discussion on D71082, and https://github.com/ClangBuiltLinux/linux/issues/979). Differential Revision: https://reviews.llvm.org/D78162 (cherry picked from commit 2dd17ff08165e6118e70f00e22b2c36d2d4e0a9a)
* Use FinishThunk to finish musttail thunksReid Kleckner2020-04-131-2/+3
| | | | | | | | | | | | | | | | | FinishThunk, and the invariant of setting and then unsetting CurCodeDecl, was added in 7f416cc42638 (2015). The invariant didn't exist when I added this musttail codepath in ab2090d10765 (2014). Recently in 28328c3771, I started using this codepath on non-Windows platforms, and users reported problems during release testing (PR44987). The issue was already present for users of EH on i686-windows-msvc, so I added a test for that case as well. Reviewed By: hans Differential Revision: https://reviews.llvm.org/D76444 (cherry picked from commit ce5173c0e174870934d1b3a026f631d996136191)
* [remark][diagnostics] [codegen] Fix PR44896Rong Xu2020-02-261-0/+3
| | | | | | | | | | | | | | | | | This patch fixes PR44896. For IR input files, option fdiscard-value-names should be ignored as we need named values in loadModule(). Commit 60d3947922 sets this option after loadModule() where valued names already created. This creates an inconsistent state in setNameImpl() that leads to a seg fault. This patch forces fdiscard-value-names to be false for IR input files. This patch also emits a warning of "ignoring -fdiscard-value-names" if option fdiscard-value-names is explictly enabled in the commandline for IR input files. Differential Revision: https://reviews.llvm.org/D74878 (cherry picked from commit 11857d49948b845dcfd7c7f78595095e3add012d)
* [RISCV] Pass target-abi via module flag metadataZakk Chen2020-01-271-0/+7
| | | | | | | | | | | | Reviewers: lenary, asb Reviewed By: lenary Tags: #clang Differential Revision: https://reviews.llvm.org/D72755 (cherry picked from commit e15fb06e2d0a068de549464d72081811e7fac612)
* [Driver][CodeGen] Support -fpatchable-function-entry=N,M and ↵Fangrui Song2020-01-241-6/+11
| | | | | | | | | | __attribute__((patchable_function_entry(N,M))) where M>0 Reviewed By: nickdesaulniers Differential Revision: https://reviews.llvm.org/D73072 (cherry picked from commit 69bf40c45fd7f6dfe11b47de42571d8bff5ef94f)
* [Concepts] Requires ExpressionsSaar Raz2020-01-242-0/+5
| | | | | | | | | | Implement support for C++2a requires-expressions. Re-commit after compilation failure on some platforms due to alignment issues with PointerIntPair. Differential Revision: https://reviews.llvm.org/D50360 (cherry picked from commit a0f50d731639350c7a79f140f026c27a18215531)
* Revert 9007f06af0e "Revert "Allow system header to provide their own ↵Hans Wennborg2020-01-172-1/+13
| | | | | | | implementation of some builtin"" This should no longer be necessary after cd4c65f91d5 "Add __warn_memset_zero_len builtin as a workaround for glibc issue"
* Revert "Allow system header to provide their own implementation of some builtin"Amy Huang2020-01-172-13/+1
| | | | | | | | | This reverts commit 921f871ac438175ca8fcfcafdfcfac4d7ddf3905 because it causes libc++ code to trigger __warn_memset_zero_len. See https://reviews.llvm.org/D71082. (cherry picked from commit 3d210ed3d1880c615776b07d1916edb400c245a6)
* Add __warn_memset_zero_len builtin as a workaround for glibc issueserge-sans-paille2020-01-171-0/+2
| | | | | | | | | Glibc issue: https://sourceware.org/bugzilla/show_bug.cgi?id=25399 The fix consist in considering the missing function as a builtin lowered to a nop. Differential Revision: https://reviews.llvm.org/D72869 (cherry picked from commit d293417931d3a9d46799b42795988ca3b5cfd766)
* Relax the rules around objc_alloc and objc_alloc_init optimizations.Pierre Habouzit2020-01-141-21/+19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Today the optimization is limited to: - `[ClassName alloc]` - `[self alloc]` when within a class method However it means that when code is written this way: ``` @interface MyObject - (id)copyWithZone:(NSZone *)zone { return [[self.class alloc] _initWith...]; } @end ``` ... then the optimization doesn't kick in and `+[NSObject alloc]` ends up in IMP caches where it could have been avoided. It turns out that `+alloc` -> `+[NSObject alloc]` is the most cached SEL/IMP pair in the entire platform which is rather silly). There's two theoretical risks allowing this optimization: 1. if the receiver is nil (which it can't be today), but it turns out that `objc_alloc()`/`objc_alloc_init()` cope with a nil receiver, 2. if the `Clas` type for the receiver is a lie. However, for such a code to work today (and not fail witn an unrecognized selector anyway) you'd have to have implemented the `-alloc` **instance method**. Fortunately, `objc_alloc()` doesn't assume that the receiver is a Class, it basically starts with a test that is similar to `if (receiver->isa->bits & hasDefaultAWZ) { /* fastpath */ }`. This bit is only set on metaclasses by the runtime, so if an instance is passed to this function by accident, its isa will fail this test, and `objc_alloc()` will gracefully fallback to `objc_msgSend()`. The one thing `objc_alloc()` doesn't support is tagged pointer instances. None of the tagged pointer classes implement an instance method called `'alloc'` (actually there's a single class in the entire Apple codebase that has such a method). Differential Revision: https://reviews.llvm.org/D71682 Radar-Id: rdar://problem/58058316 Reviewed-By: Akira Hatanaka Signed-off-by: Pierre Habouzit <phabouzit@apple.com>
* [X86] ABI compat bugfix for MSVC vectorcallReid Kleckner2020-01-141-75/+61
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Before this change, X86_32ABIInfo::classifyArgument would be called twice on vector arguments to vectorcall functions. This function has side effects to track GPR register usage, and this would lead to incorrect GPR usage in some cases. The specific case I noticed is from running out of XMM registers with mixed FP and vector arguments and no aggregates of any kind. Consider this prototype: void __vectorcall vectorcall_indirect_vec( double xmm0, double xmm1, double xmm2, double xmm3, double xmm4, __m128 xmm5, __m128 ecx, int edx, __m128 mem); classifyArgument has no effects when called on a plain FP type, but when called on a vector type, it modifies FreeRegs to model GPR consumption. However, this should not happen during the vector call first pass. I refactored the code to unify vectorcall HVA logic with regcall HVA logic. The conventions pass HVAs in registers differently (expanded vs. not expanded), but if they do not fit in registers, they both pass them indirectly by address. Reviewers: erichkeane, craig.topper Subscribers: cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D72110
* [remark][diagnostics] Using clang diagnostic handler for IR input filesRong Xu2020-01-141-17/+76
| | | | | | | | | | | | | | | | | | | | | | | | For IR input files, we currently use LLVM diagnostic handler even the compilation is from clang. As a result, we are not able to use -Rpass to get the transformation reports. Some warnings are not handled properly either: We found many mysterious warnings in our ThinLTO backend compilations in SamplePGO and CSPGO. An example of the warning: "warning: net/proto2/public/metadata_lite.h:51:21: 0.02% (1 / 4999)" This turns out to be a warning by Wmisexpect, which is supposed to be filtered out by default. But since the filter is in clang's diagnostic hander, we emit these incomplete warnings from LLVM's diagnostic handler. This patch uses clang diagnostic handler for IR input files. We create a fake backendconsumer just to install the diagnostic handler. With this change, we will have proper handling of all the warnings and we can use -Rpass* options in IR input files compilation. Also note that with is patch, LLVM's diagnostic options, like "-mllvm -pass-remarks=*", are no longer be able to get optimization remarks. Differential Revision: https://reviews.llvm.org/D72523
* [OPENMP]Do not emit special virtual function for NVPTX target.Alexey Bataev2020-01-141-1/+6
| | | | | | There are no special virtual function handlers (like __cxa_pure_virtual) defined for NVPTX target, so just emit such functions as null pointers to prevent issues with linking and unresolved references.
* [DebugInfo] Add option to clang to limit debug info that is emitted for classes.Amy Huang2020-01-141-0/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This patch adds an option to limit debug info by only emitting complete class type information when its constructor is emitted. This applies to classes that have nontrivial user defined constructors. I implemented the option by adding another level to `DebugInfoKind`, and a flag `-flimit-debug-info-constructor`. Total object file size on Windows, compiling with RelWithDebInfo: before: 4,257,448 kb after: 2,104,963 kb And on Linux before: 9,225,140 kb after: 4,387,464 kb According to the Windows clang.pdb files, here is a list of types that are no longer complete with this option enabled: https://reviews.llvm.org/P8182 Reviewers: rnk, dblaikie Subscribers: aprantl, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D72427
* Fix "pointer is null" static analyzer warnings. NFCI.Simon Pilgrim2020-01-141-4/+2
| | | | Use castAs<> instead of getAs<> since the pointer is dereferenced immediately below and castAs will perform the null assertion for us.
* Make helper functions static or move them into anonymous namespaces. NFC.Benjamin Kramer2020-01-142-2/+6
|
* [RISCV] Fix ILP32D lowering for double+double/double+int return typesJames Clarke2020-01-141-5/+15
| | | | | | | | | | | | | | | | | | Summary: Previously, since these aggregates are > 2*XLen, Clang would think they were being returned indirectly and thus would decrease the number of available GPRs available by 1. For long argument lists this could lead to a struct argument incorrectly being passed indirectly. Reviewers: asb, lenary Reviewed By: asb, lenary Subscribers: luismarques, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, kito-cheng, shiva0217, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, rkruppe, PkmX, jocewei, psnobl, benna, Jim, lenary, s.egerton, pzheng, sameer.abuasal, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D69590
* [DebugInfo] Add another level to DebugInfoKind called ConstructorAmy Huang2020-01-137-35/+28
| | | | | | | | | The option will limit debug info by only emitting complete class type information when its constructor is emitted. This patch changes comparisons with LimitedDebugInfo to use the new level instead. Differential Revision: https://reviews.llvm.org/D72427
* [ItaniumCXXABI] Make tls wrappers properly comdatMartin Storsjö2020-01-131-0/+3
| | | | | | | | | | | | Just marking a symbol as weak_odr/linkonce_odr isn't enough for actually tolerating multiple copies of it at linking on windows, it has to be made a proper comdat; make it comdat for all platforms for consistency. This should hopefully fix https://bugzilla.mozilla.org/show_bug.cgi?id=1566288. Differential Revision: https://reviews.llvm.org/D71572
* Implement VectorType conditional operator GNU extension.Erich Keane2020-01-131-0/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | GCC supports the conditional operator on VectorTypes that acts as a 'select' in C++ mode. This patch implements the support. Types are converted as closely to GCC's behavior as possible, though in a few places consistency with our existing vector type support was preferred. Note that this implementation is different from the OpenCL version in a number of ways, so it unfortunately required a different implementation. First, the SEMA rules and promotion rules are significantly different. Secondly, GCC implements COND[i] != 0 ? LHS[i] : RHS[i] (where i is in the range 0- VectorSize, for each element). In OpenCL, the condition is COND[i] < 0 ? LHS[i]: RHS[i]. In the process of implementing this, it was also required to make the expression COND ? LHS : RHS type dependent if COND is type dependent, since the type is now dependent on the condition. For example: T ? 1 : 2; Is not typically type dependent, since the result can be deduced from the operands. HOWEVER, if T is a VectorType now, it could change this to a 'select' (basically a swizzle with a non-constant mask) with the 1 and 2 being promoted to vectors themselves. While this is a change, it is NOT a standards incompatible change. Based on my (and D. Gregor's, at the time of writing the code) reading of the standard, the expression is supposed to be type dependent if ANY sub-expression is type dependent. Differential Revision: https://reviews.llvm.org/D71463
* This option allows selecting the TLS size in the local exec TLS model,KAWASHIMA Takahiro2020-01-131-0/+1
| | | | | | | | | | | | | | | | | | which is the default TLS model for non-PIC objects. This allows large/ many thread local variables or a compact/fast code in an executable. Specification is same as that of GCC. For example, the code model option precedes the TLS size option. TLS access models other than local-exec are not changed. It means supoort of the large code model is only in the local exec TLS model. Patch By KAWASHIMA Takahiro (kawashima-fj <t-kawashima@fujitsu.com>) Reviewers: dmgreen, mstorsjo, t.p.northover, peter.smith, ostannard Reviewd By: peter.smith Committed by: peter.smith Differential Revision: https://reviews.llvm.org/D71688
* Revert "[DWARF5][clang]: Added support for DebugInfo generation for auto ↵Sam McCall2020-01-132-21/+11
| | | | | | | | | return type for C++ member functions." This reverts commit 6d6a4590c5d4c7fc7445d72fe685f966b0a8cafb, which introduces a crash. See https://reviews.llvm.org/D70524 for details.
* [DWARF5][clang]: Added support for DebugInfo generation for auto return type ↵Awanish Pandey2020-01-132-11/+21
| | | | | | | | | | | | | | | | | for C++ member functions. Summary: This patch will provide support for auto return type for the C++ member functions. This patch includes clang side implementation of this feature. Patch by: Awanish Pandey <Awanish.Pandey@amd.com> Reviewers: dblaikie, aprantl, shafik, alok, SouraVX, jini.susan.george Reviewed by: dblaikie Differential Revision: https://reviews.llvm.org/D70524
* Fix "pointer is null" static analyzer warning. NFCI.Simon Pilgrim2020-01-111-1/+1
| | | | Use cast<> instead of dyn_cast<> since we know that the pointer should be valid (and is dereferenced immediately).
* Fix "pointer is null" static analyzer warnings. NFCI.Simon Pilgrim2020-01-113-17/+10
| | | | Use castAs<> instead of getAs<> since the pointer is dereferenced immediately below and castAs will perform the null assertion for us.
* Only destroy static locals if they have non-trivial destructors.Richard Smith2020-01-101-1/+2
| | | | | | | This fixes a regression introduced in 2b4fa5348ee157b6b1a1af44d0137ca8c7a71573 that caused us to emit shutdown-time destruction for variables with ARC ownership, using C++-specific functions that don't exist in C implementations.
* [Driver][CodeGen] Add -fpatchable-function-entry=N[,0]Fangrui Song2020-01-101-0/+3
| | | | | | | | | | | In the backend, this feature is implemented with the function attribute "patchable-function-entry". Both the attribute and XRay use TargetOpcode::PATCHABLE_FUNCTION_ENTER, so the two features are incompatible. Reviewed By: ostannard, MaskRay Differential Revision: https://reviews.llvm.org/D72222
* Support function attribute patchable_function_entryFangrui Song2020-01-101-1/+7
| | | | | | | | | This feature is generic. Make it applicable for AArch64 and X86 because the backend has only implemented NOP insertion for AArch64 and X86. Reviewed By: nickdesaulniers, aaron.ballman Differential Revision: https://reviews.llvm.org/D72221
* Fix "pointer is null" static analyzer warning. NFCI.Simon Pilgrim2020-01-101-1/+1
| | | | Use castAs<> instead of getAs<> since the pointer is dereferenced immediately below and castAs will perform the null assertion for us.
* Add support for __declspec(guard(nocf))Andrew Paverd2020-01-101-0/+11
| | | | | | | | | | | | | | | | | | Summary: Avoid using the `nocf_check` attribute with Control Flow Guard. Instead, use a new `"guard_nocf"` function attribute to indicate that checks should not be added on indirect calls within that function. Add support for `__declspec(guard(nocf))` following the same syntax as MSVC. Reviewers: rnk, dmajor, pcc, hans, aaron.ballman Reviewed By: aaron.ballman Subscribers: aaron.ballman, tomrittervg, hiraditya, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D72167
* [FPEnv] Generate constrained FP comparisons from clangUlrich Weigand2020-01-101-11/+17
| | | | | | | | | | | | | | | | | | | | | Update the IRBuilder to generate constrained FP comparisons in CreateFCmp when IsFPConstrained is true, similar to the other places in the IRBuilder. Also, add a new CreateFCmpS to emit signaling FP comparisons, and use it in clang where comparisons are supposed to be signaling (currently, only when emitting code for the <, <=, >, >= operators). Note that there is currently no way to add fast-math flags to a constrained FP comparison, since this is implemented as an intrinsic call that returns a boolean type, and FMF are only allowed for calls returning a floating-point type. However, given the discussion around https://bugs.llvm.org/show_bug.cgi?id=42179, it seems that FCmp itself really shouldn't have any FMF either, so this is probably OK. Reviewed by: craig.topper Differential Revision: https://reviews.llvm.org/D71467
* Allow system header to provide their own implementation of some builtinserge-sans-paille2020-01-102-1/+13
| | | | | | | | | | | | | | | | | | If a system header provides an (inline) implementation of some of their function, clang still matches on the function name and generate the appropriate llvm builtin, e.g. memcpy. This behavior is in line with glibc recommendation « users may not provide their own version of symbols » but doesn't account for the fact that glibc itself can provide inline version of some functions. It is the case for the memcpy function when -D_FORTIFY_SOURCE=1 is on. In that case an inline version of memcpy calls __memcpy_chk, a function that performs extra runtime checks. Clang currently ignores the inline version and thus provides no runtime check. This code fixes the issue by detecting functions whose name is a builtin name but also have an inline implementation. Differential Revision: https://reviews.llvm.org/D71082
* [ThinLTO] Pass CodeGenOpts like UnrollLoops/VectorizeLoop/VectorizeSLPWei Mi2020-01-091-0/+6
| | | | | | | | | | | | | down to pass builder in ltobackend. Currently CodeGenOpts like UnrollLoops/VectorizeLoop/VectorizeSLP in clang are not passed down to pass builder in ltobackend when new pass manager is used. This is inconsistent with the behavior when new pass manager is used and thinlto is not used. Such inconsistency causes slp vectorization pass not being enabled in ltobackend for O3 + thinlto right now. This patch fixes that. Differential Revision: https://reviews.llvm.org/D72386
* Add builtins for aligning and checking alignment of pointers and integersAlex Richardson2020-01-092-0/+100
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This change introduces three new builtins (which work on both pointers and integers) that can be used instead of common bitwise arithmetic: __builtin_align_up(x, alignment), __builtin_align_down(x, alignment) and __builtin_is_aligned(x, alignment). I originally added these builtins to the CHERI fork of LLVM a few years ago to handle the slightly different C semantics that we use for CHERI [1]. Until recently these builtins (or sequences of other builtins) were required to generate correct code. I have since made changes to the default C semantics so that they are no longer strictly necessary (but using them does generate slightly more efficient code). However, based on our experience using them in various projects over the past few years, I believe that adding these builtins to clang would be useful. These builtins have the following benefit over bit-manipulation and casts via uintptr_t: - The named builtins clearly convey the semantics of the operation. While checking alignment using __builtin_is_aligned(x, 16) versus ((x & 15) == 0) is probably not a huge win in readably, I personally find __builtin_align_up(x, N) a lot easier to read than (x+(N-1))&~(N-1). - They preserve the type of the argument (including const qualifiers). When using casts via uintptr_t, it is easy to cast to the wrong type or strip qualifiers such as const. - If the alignment argument is a constant value, clang can check that it is a power-of-two and within the range of the type. Since the semantics of these builtins is well defined compared to arbitrary bit-manipulation, it is possible to add a UBSAN checker that the run-time value is a valid power-of-two. I intend to add this as a follow-up to this change. - The builtins avoids int-to-pointer casts both in C and LLVM IR. In the future (i.e. once most optimizations handle it), we could use the new llvm.ptrmask intrinsic to avoid the ptrtoint instruction that would normally be generated. - They can be used to round up/down to the next aligned value for both integers and pointers without requiring two separate macros. - In many projects the alignment operations are already wrapped in macros (e.g. roundup2 and rounddown2 in FreeBSD), so by replacing the macro implementation with a builtin call, we get improved diagnostics for many call-sites while only having to change a few lines. - Finally, the builtins also emit assume_aligned metadata when used on pointers. This can improve code generation compared to the uintptr_t casts. [1] In our CHERI compiler we have compilation mode where all pointers are implemented as capabilities (essentially unforgeable 128-bit fat pointers). In our original model, casts from uintptr_t (which is a 128-bit capability) to an integer value returned the "offset" of the capability (i.e. the difference between the virtual address and the base of the allocation). This causes problems for cases such as checking the alignment: for example, the expression `if ((uintptr_t)ptr & 63) == 0` is generally used to check if the pointer is aligned to a multiple of 64 bytes. The problem with offsets is that any pointer to the beginning of an allocation will have an offset of zero, so this check always succeeds in that case (even if the address is not correctly aligned). The same issues also exist when aligning up or down. Using the alignment builtins ensures that the address is used instead of the offset. While I have since changed the default C semantics to return the address instead of the offset when casting, this offset compilation mode can still be used by passing a command-line flag. Reviewers: rsmith, aaron.ballman, theraven, fhahn, lebedev.ri, nlopes, aqjune Reviewed By: aaron.ballman, lebedev.ri Differential Revision: https://reviews.llvm.org/D71499
* [Clang] Handle target-specific builtins returning aggregates.Simon Tatham2020-01-091-3/+23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: A few of the ARM MVE builtins directly return a structure type. This causes an assertion failure at code-gen time if you try to assign the result of the builtin to a variable, because the `RValue` created in `EmitBuiltinExpr` from the `llvm::Value` produced by codegen is always made by `RValue::get()`, which creates a non-aggregate `RValue` that will fail an assertion when `AggExprEmitter::withReturnValueSlot` calls `Src.getAggregatePointer()`. A similar failure occurs if you try to use the struct return value directly to extract one field, e.g. `vld2q(address).val[0]`. The existing code-gen tests for those MVE builtins pass the returned structure type directly to the C `return` statement, which apparently managed to avoid that particular code path, so we didn't notice the crash. Now `EmitBuiltinExpr` checks the evaluation kind of the builtin's return value, and does the necessary handling for aggregate returns. I've added two extra test cases, both of which crashed before this change. Reviewers: dmgreen, rjmccall Reviewed By: rjmccall Subscribers: kristof.beyls, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D72271
* Improve support of GNU mempcpyserge-sans-paille2020-01-091-2/+8
| | | | | | | - Lower to the memcpy intrinsic - Raise warnings when size/bounds are known Differential Revision: https://reviews.llvm.org/D71374
* [OPENMP]Remove unused code, NFC.Alexey Bataev2020-01-092-102/+2
|
* Fix "pointer is null" static analyzer warnings. NFCI.Simon Pilgrim2020-01-091-0/+2
| | | | Assert that the pointers are non-null before dereferencing them.
* Fix MSVC unhandled enum warning. NFCI.Simon Pilgrim2020-01-091-0/+1
|
* Fix "pointer is null" static analyzer warning. NFCI.Simon Pilgrim2020-01-081-2/+2
| | | | Use castAs<> instead of getAs<> since we know that the pointer will be valid (and is dereferenced immediately below).
* [OPENMP]Reduce calls for the mangled names.Alexey Bataev2020-01-072-8/+9
| | | | | | Use canonical decls instead of mangled names in the set of already emitted decls. This allows to reduce the number of function calls for getting declarations mangled names and speedup the compilation.
* [HIP] Add option --gpu-max-threads-per-block=nYaxun (Sam) Liu2020-01-071-2/+5
| | | | | | Add this option to change the default launch bounds. Differential Revision: https://reviews.llvm.org/D71221
* [NFC] Use isX86() instead of getArch()Jim Lin2020-01-072-6/+2
| | | | | | | | | | | | | | Summary: This is a clean up for https://reviews.llvm.org/D72247. Reviewers: MaskRay, craig.topper, jhenderson Reviewed By: MaskRay Subscribers: hiraditya, rupprecht, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D72320
* [CodeGen][ObjC] Push the properties of a protocol before pushing theAkira Hatanaka2020-01-061-3/+3
| | | | | | | | | | properties of the protocol it inherits This fixes a bug where the type string for a @dynamic property of an @implementation didn't have 'D' in it when the protocol it conforms to redeclares the property declared in the base protocol. rdar://problem/45503561
* Add Triple::isX86()Fangrui Song2020-01-061-2/+1
| | | | | | Reviewed By: craig.topper, skan Differential Revision: https://reviews.llvm.org/D72247
* [OPENMP50]Support lastprivate conditional updates in inc/dec unary ops.Alexey Bataev2020-01-065-13/+58
| | | | | Added support for checking of updates of variables used in unary pre(pos) inc/dec expressions.
* [OPENMP50]Codegen for lastprivate conditional list items.Alexey Bataev2020-01-026-3/+357
| | | | | | | | | | | | | | | | | | | | | | | | | Added codegen support for lastprivate conditional. According to the standard, if when the conditional modifier appears on the clause, if an assignment to a list item is encountered in the construct then the original list item is assigned the value that is assigned to the new list item in the sequentially last iteration or lexically last section in which such an assignment is encountered. We look for the assignment operations and check if the left side references lastprivate conditional variable. Then the next code is emitted: if (last_iv_a <= iv) { last_iv_a = iv; last_a = lp_a; } At the end the implicit barrier is generated to wait for the end of all threads and then in the check for the last iteration the private copy is assigned the last value. if (last_iter) { lp_a = last_a; // <--- new code a = lp_a; // <--- store of private value to the original variable. }
OpenPOWER on IntegriCloud