summaryrefslogtreecommitdiffstats
path: root/clang/lib/CodeGen
Commit message (Collapse)AuthorAgeFilesLines
...
* [CUDA] Implemented _[bi]mma* builtins.Artem Belevich2019-04-251-210/+333
| | | | | | | | | | | | | | | | These builtins provide access to the new integer and sub-integer variants of MMA (matrix multiply-accumulate) instructions provided by CUDA-10.x on sm_75 (AKA Turing) GPUs. Also added a feature for PTX 6.4. While Clang/LLVM does not generate any PTX instructions that need it, we still need to pass it through to ptxas in order to be able to compile code that uses the new 'mma' instruction as inline assembly (e.g used by NVIDIA's CUTLASS library https://github.com/NVIDIA/cutlass/blob/master/cutlass/arch/mma.h#L101) Differential Revision: https://reviews.llvm.org/D60279 llvm-svn: 359248
* Skip type units/type uniquing when we know we're only emitting the type once ↵David Blaikie2019-04-251-0/+5
| | | | | | | | | | | | | | | (vtable-based emission when triggered by a strong vtable, with -fno-standalone-debug) (this would regress size without a corresponding LLVM change that avoids putting other user defined types inside type units when they aren't in their own type units - instead emitting declarations inside the type unit and a definition in the primary CU) Reviewers: aprantl Differential Revision: https://reviews.llvm.org/D61079 llvm-svn: 359235
* [PGO] Enable InstrProf lowering for Clang PGO instrumentation in the new ↵Rong Xu2019-04-251-10/+27
| | | | | | | | | | | | | | | pass manager Currently InstrProf lowering is not enabled for Clang PGO instrumentation in the new pass manager. The following command "-fprofile-instr-generate -fexperimental-new-pass-manager ..." is broken. This CL enables InstrProf lowering pass for Clang PGO instrumentation in the new pass manager. Differential Revision: https://reviews.llvm.org/D61138 llvm-svn: 359215
* Fix bug 37903:MS ABI: handle inline static data member and inline variable ↵Jennifer Yu2019-04-251-1/+7
| | | | | | as template static data member llvm-svn: 359212
* [codeview] Fix symbol names for dynamic initializers and atexit stubsReid Kleckner2019-04-243-6/+70
| | | | | | | | | | | | | | | | Summary: Add a new variant to GlobalDecl for these so that we can detect them more easily during debug info emission and handle them appropriately. Reviewers: rsmith, rjmccall, jyu2 Subscribers: aprantl, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D60930 llvm-svn: 359148
* Use llvm::stable_sortFangrui Song2019-04-247-21/+16
| | | | llvm-svn: 359098
* [Builtins] Implement __builtin_is_constant_evaluated for use in C++2aEric Fiselier2019-04-241-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This patch implements `__builtin_is_constant_evaluated` as specifier by [P0595R2](https://wg21.link/p0595r2). It is built on the back of Bill Wendling's work for `__builtin_constant_p()`. More tests to come, but early feedback is appreciated. I plan to implement warnings for common mis-usages like those belowe in a following patch: ``` void foo(int x) { if constexpr (std::is_constant_evaluated())) { // condition is always `true`. Should use plain `if` instead. foo_constexpr(x); } else { foo_runtime(x); } } ``` Reviewers: rsmith, MaskRay, bruno, void Reviewed By: rsmith Subscribers: dexonsmith, zoecarver, fdeazeve, kristina, cfe-commits Differential Revision: https://reviews.llvm.org/D55500 llvm-svn: 359067
* Move setTargetAttributes after setGVProperties in SetFunctionAttributesScott Linder2019-04-231-5/+5
| | | | | | | | | | | AMDGPU currently relies on global properties being set before setTargetProperties is called. Existing targets like MIPS which rely on setTargetProperties do not rely on the current behavior, so this patch moves the call later in SetFunctionAttributes. Differential Revision: https://reviews.llvm.org/D60967 llvm-svn: 359039
* Revert "[MS] Emit S_HEAPALLOCSITE debug info" because of ToTWin64(db)Amy Huang2019-04-231-0/+1
| | | | | | | | | buildbot failure. This reverts commit d07d6d617713bececf57f3547434dd52f0f13f9e and c774f687b6880484a126ed3e3d737e74c926f0ae. llvm-svn: 359034
* [ThinLTO] Pass down opt level to LTO backend and handle -O0 LTO in new PMTeresa Johnson2019-04-231-0/+1
| | | | | | | | | | | | | | | | | | | | | | | Summary: The opt level was not being passed down to the ThinLTO backend when invoked via clang (for distributed ThinLTO). This exposed an issue where the new PM was asserting if the Thin or regular LTO backend pipelines were invoked with -O0 (not a new issue, could be provoked by invoking in-process *LTO backends via linker using new PM and -O0). Fix this similar to the old PM where -O0 only does the necessary lowering of type metadata (WPD and LowerTypeTest passes) and then quits, rather than asserting. Reviewers: xur Subscribers: mehdi_amini, inglorion, eraman, hiraditya, steven_wu, dexonsmith, cfe-commits, llvm-commits, pcc Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D61022 llvm-svn: 359025
* [MS] Emit S_HEAPALLOCSITE debug infoAmy Huang2019-04-191-1/+0
| | | | | | | | | | | | | | | | | Summary: This emits labels around heapallocsite calls and S_HEAPALLOCSITE debug info in codeview. Currently only changes FastISel, so emitting labels still needs to be implemented in SelectionDAG. Reviewers: hans, rnk Subscribers: aprantl, hiraditya, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D60800 llvm-svn: 358783
* [OPENMP][NVPTX] target [teams distribute] simd maybe run withoutAlexey Bataev2019-04-191-1/+8
| | | | | | | | | runtime. target [teams distribute] simd costructs do not require full runtime for the correct execution, we can run them without full runtime. llvm-svn: 358766
* Update to use PipelineTuningOptions. Corresponds to llvm change: D59723.Alina Sbirlea2019-04-191-1/+1
| | | | llvm-svn: 358765
* [OPENMP][NVPTX]Run combined constructs with if clause in SPMD mode.Alexey Bataev2019-04-172-44/+97
| | | | | | | | All target-parallel-based constructs can be run in SPMD mode from now on. Even if num_threads clauses or if clauses are used, such constructs can be executed in SPMD mode. llvm-svn: 358595
* [OPENMP][NVPTX]Run combined constructs with if clause in SPMD mode.Alexey Bataev2019-04-161-6/+8
| | | | | | | | | Combined constructs with parallel and if clauses without modifiers may be executed in SPMD mode since if the condition is true for the target region, it is also true for parallel region and the threads must be run in parallel. llvm-svn: 358503
* [AArch64] Implement Vector Funtion ABI name mangling.Alexey Bataev2019-04-161-3/+318
| | | | | | | | | | | | | | | | | | | Summary: The name mangling scheme is defined in section 3.5 of the "Vector function application binary interface specification for AArch64" [1]. [1] https://developer.arm.com/products/software-development-tools/hpc/arm-compiler-for-hpc/vector-function-abi Reviewers: rengolin, ABataev Reviewed By: ABataev Subscribers: sdesmalen, javed.absar, kristof.beyls, jdoerfert, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60583 llvm-svn: 358490
* [OPENMP][NVPTX]Run parallel regions with num_threads clauses in SPMDAlexey Bataev2019-04-151-10/+7
| | | | | | | | | | mode. After the previous patch with the more correct handling of the number of threads in parallel regions, the parallel regions with num_threads clauses can be executed in SPMD mode. llvm-svn: 358445
* Relanding r357928 with fixed debuginfo check.Amy Huang2019-04-123-7/+27
| | | | | | | | | | | | | [MS] Add metadata for __declspec(allocator) Original summary: Emit !heapallocsite in the metadata for calls to functions marked with __declspec(allocator). Eventually this will be emitted as S_HEAPALLOCSITE debug info in codeview. Differential Revision: https://reviews.llvm.org/D60237 llvm-svn: 358307
* [Aarch64] Add v8.2-a half precision element extract intrinsicsDiogo N. Sampaio2019-04-121-0/+8
| | | | | | | | | | | | | | | | | | | | Summary: Implements the intrinsics define on the ACLE to extract half precision fp scalar elements from float16x4_t and float16x8_t vector types. a.k.a: vduph_lane_f16 vduph_laneq_f16 Reviewers: pablooliveira, olista01, LukeGeeson, DavidSpickett Reviewed By: DavidSpickett Subscribers: DavidSpickett, javed.absar, kristof.beyls, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D60272 llvm-svn: 358276
* Variable auto-init: also auto-init allocaJF Bastien2019-04-125-84/+149
| | | | | | | | | | | | | | | | | | Summary: alloca isn’t auto-init’d right now because it’s a different path in clang that all the other stuff we support (it’s a builtin, not an expression). Interestingly, alloca doesn’t have a type (as opposed to even VLA) so we can really only initialize it with memset. <rdar://problem/49794007> Subscribers: jkorous, dexonsmith, cfe-commits, rjmccall, glider, kees, kcc, pcc Tags: #clang Differential Revision: https://reviews.llvm.org/D60548 llvm-svn: 358243
* Revert r357610, it caused PR41471Nico Weber2019-04-111-6/+0
| | | | llvm-svn: 358232
* [DebugInfo] Combine Trivial and NonTrivial flagsAaron Smith2019-04-111-4/+2
| | | | | | | | | | | | | | | | | | | | | Summary: These flags are used when emitting debug info and needed to initialize subprogram and member function attributes (function options) for Codeview. These function options are used to create an accurate compiler type for UDT symbols (class/struct/union) from PDBs. The Trivial flag was introduced in https://reviews.llvm.org/D45122 It's been pointed out that Trivial and NonTrivial may imply each other and that seems to be the case in the current tests. This change combines them into a single flag -- NonTrivial -- and updates the corresponding unit tests. There is an additional change to llvm to update the flags. Reviewers: rnk, zturner, dblaikie, probinson, Hui Reviewed By: dblaikie Subscribers: aprantl, jdoerfert, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D59347 llvm-svn: 358219
* Support objc_nonlazy_class attribute on Objective-C implementationsErik Pilkington2019-04-111-1/+2
| | | | | | | | Fixes rdar://49523079 Differential revision: https://reviews.llvm.org/D60544 llvm-svn: 358201
* Add { } to silence compiler warning [NFC]Mikael Holmen2019-04-111-3/+3
| | | | | | | | | | | | | | | At least clang 3.6 warns on the original code: ../tools/clang/lib/CodeGen/CGNonTrivialStruct.cpp:829:34: error: suggest braces around initialization of subobject [-Werror,-Wmissing-braces] return std::array<Address, 1>({Address(nullptr, CharUnits::Zero())}); ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ { } ../tools/clang/lib/CodeGen/CGNonTrivialStruct.cpp:833:34: error: suggest braces around initialization of subobject [-Werror,-Wmissing-braces] return std::array<Address, 2>({Address(nullptr, CharUnits::Zero()), ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 2 errors generated. llvm-svn: 358152
* [OpenCL] Re-fix invalid address space generation for clk_event_t arguments ↵Alexey Sotkin2019-04-111-13/+28
| | | | | | | | | | | | | | | | | | | of enqueue_kernel builtin function Summary: https://reviews.llvm.org/D53809 fixed wrong address space(assert in debug build) generated for event_ret argument. But exactly the same problem exists for event_wait_list argument. This patch should fix both. Reviewers: Anastasia, yaxunl Reviewed By: Anastasia Subscribers: kristina, ebevhan, cfe-commits Differential Revision: https://reviews.llvm.org/D59985 llvm-svn: 358151
* Add IRGen APIs to fetch ctor/dtor helper functions for non-trivial structs.John McCall2019-04-101-0/+90
| | | | | | Patch by Tony Allevato! llvm-svn: 358132
* [OPENMP]Improve detection of number of teams, threads in targetAlexey Bataev2019-04-103-260/+393
| | | | | | | | | | regions. Added more complex analysis for number of teams and number of threads in the target regions, also merged related common code between CGOpenMPRuntime and CGOpenMPRuntimeNVPTX classes. llvm-svn: 358126
* Fix an off-by-one mistake in IRGen's copy-constructionJohn McCall2019-04-101-1/+1
| | | | | | | | special cases in the presence of zero-length arrays. Patch by Joran Bigalet! llvm-svn: 358115
* Don't emit an unreachable return block.John McCall2019-04-102-9/+15
| | | | | | Patch by Brad Moody. llvm-svn: 358104
* [CodeGen][ObjC] Emit the retainRV marker as a module flag instead ofAkira Hatanaka2019-04-101-8/+4
| | | | | | | | | | | | | | named metadata. This fixes a bug where ARC contract wasn't inserting the retainRV marker when LTO was enabled, which caused objects returned from a function to be auto-released. rdar://problem/49464214 Differential Revision: https://reviews.llvm.org/D60302 llvm-svn: 358048
* Revert "[MS] Add metadata for __declspec(allocator)"Amy Huang2019-04-083-27/+7
| | | | | | | This reverts commit e7bd735bb03a7b8141e32f7d6cb98e8914d8799e. Reverting because of buildbot failure. llvm-svn: 357952
* [OPENMP] Sync __kmpc_alloc/_kmpc_free function with the runtime.Alexey Bataev2019-04-081-10/+15
| | | | | | | Functions __kmpc_alloc/__kmpc_free are updated with the new interfaces. Patch synchronizes the compiler with the runtime. llvm-svn: 357933
* [MS] Add metadata for __declspec(allocator)Amy Huang2019-04-083-7/+27
| | | | | | | | | | | | | | | | | Summary: Emit !heapallocsite in the metadata for calls to functions marked with __declspec(allocator). Eventually this will be emitted as S_HEAPALLOCSITE debug info in codeview. Reviewers: rnk Subscribers: jfb, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D60237 llvm-svn: 357928
* [OPENMP][NVPTX]Fixed processing of memory management directives.Alexey Bataev2019-04-081-14/+42
| | | | | | | | | | | Added special processing of the memory management directives/clauses for NVPTX target. For private locals, omp_default_mem_alloc and omp_thread_mem_alloc result in allocation in local memory. omp_const_mem_alloc allocates const memory, omp_teams_mem_alloc allocates shared memory, and omp_cgroup_mem_alloc and omp_large_cap_mem_alloc allocate global memory. llvm-svn: 357923
* [IR] Refactor attribute methods in Function class (NFC)Evandro Menezes2019-04-041-1/+1
| | | | | | | | Rename the functions that query the optimization kind attributes. Differential revision: https://reviews.llvm.org/D60287 llvm-svn: 357731
* Make SourceManager::createFileID(UnownedTag, ...) take a const ↵Nico Weber2019-04-042-3/+4
| | | | | | | | | | | | | | | | | | | | | llvm::MemoryBuffer* Requires making the llvm::MemoryBuffer* stored by SourceManager const, which in turn requires making the accessors for that return const llvm::MemoryBuffer*s and updating all call sites. The original motivation for this was to use it and fix the TODO in CodeGenAction.cpp's ConvertBackendLocation() by using the UnownedTag version of createFileID, and since llvm::SourceMgr* hands out a const llvm::MemoryBuffer* this is required. I'm not sure if fixing the TODO this way actually works, but this seems like a good change on its own anyways. No intended behavior change. Differential Revision: https://reviews.llvm.org/D60247 llvm-svn: 357724
* [PR41276] Fixed incorrect generation of addr space cast for 'this' in C++.Anastasia Stulova2019-04-044-22/+26
| | | | | | | | | | | | | | | Improved classification of address space cast when qualification conversion is performed - prevent adding addr space cast for non-pointer and non-reference types. Take address space correctly from the pointee. Also pass correct address space from 'this' object using AggValueSlot when generating addrspacecast in the constructor call. Differential Revision: https://reviews.llvm.org/D59988 llvm-svn: 357682
* add periodsAmy Huang2019-04-031-6/+6
| | | | llvm-svn: 357643
* [IR] Create new method in `Function` class (NFC)Evandro Menezes2019-04-031-2/+1
| | | | | | | | | Create method `optForNone()` testing for the function level equivalent of `-O0` and refactor appropriately. Differential revision: https://reviews.llvm.org/D59852 llvm-svn: 357638
* [OPENMP]Add codegen for firstprivate vars with allocate clause.Alexey Bataev2019-04-033-14/+18
| | | | | | | Added codegen/test for the firstprivatized variables with the allocate clause. llvm-svn: 357617
* Bug-40323: MS ABI adding template static member in the linker directive ↵Jennifer Yu2019-04-031-0/+6
| | | | | | section to make sure init function can be called before main. llvm-svn: 357610
* [HIP-Clang] Fat binary should not be produced for non GPU codeAaron Enye Shi2019-04-021-2/+2
| | | | | | | | | | clang-format the changes to CUDA and HIP fat binary. Reviewers: yaxunl, tra Differential Revision: https://reviews.llvm.org/D60141 llvm-svn: 357532
* [HIP-Clang] Fat binary should not be produced for non GPU code 2Aaron Enye Shi2019-04-021-1/+3
| | | | | | | | | | Also for CUDA, we need to disable producing these fat binary functions when there is no GPU code. Reviewers: yaxunl, tra Differential Revision: https://reviews.llvm.org/D60141 llvm-svn: 357526
* [HIP-Clang] Fat binary should not be produced for non GPU codeAaron Enye Shi2019-04-021-0/+2
| | | | | | | | | | Skip producing the fat binary functions for HIP when no device code is present. Reviewers: yaxunl Differential Review: https://reviews.llvm.org/D60141 llvm-svn: 357520
* [CodeGen] Fix a regression by emitting lambda expressions in EmitLValueErik Pilkington2019-04-021-0/+2
| | | | | | | | | | | This ability was removed in r351487, but it's needed when a lambda appears as an OpaqueValueExpr subexpression of a PseudoObjectExpr. rdar://49030379 Differential revision: https://reviews.llvm.org/D60099 llvm-svn: 357515
* [OPENMP]Add codegen for private vars with allocate clause.Alexey Bataev2019-04-021-6/+2
| | | | | | | Added codegen/test for the privatized variables with the allocate clause. llvm-svn: 357514
* [os_log] Mark os_log_helper `nounwind`Vedant Kumar2019-04-021-0/+1
| | | | | | | | | | | | | Allow the optimizer to remove unnecessary EH cleanups surrounding calls to os_log_helper, to save some code size. As a follow-up, it might be worthwhile to add a BasicNoexcept exception spec to os_log_helper, and to then teach CGCall to emit direct calls for callees which can't throw. This could save some compile-time. Differential Revision: https://reviews.llvm.org/D60108 llvm-svn: 357501
* [OPENMP]Fix mapping of the pointers captured by reference.Alexey Bataev2019-04-021-3/+12
| | | | | | | | If the pointer is captured by reference, it must be mapped as _PTR_AND_OBJ kind of mapping to correctly translate the pointer address on the device. llvm-svn: 357488
* [CodeGen] Generate follow-up metadata for loops with more than one ↵Michael Kruse2019-04-012-94/+530
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | transformation. Before this patch, CGLoop would dump all transformations for a loop into a single LoopID without encoding any order in which to apply them. rL348944 added the possibility to encode a transformation order using followup-attributes. When a loop has more than one transformation, use the follow-up attribute define the order in which they are applied. The emitted order is the defacto order as defined by the current LLVM pass pipeline, which is: LoopFullUnrollPass LoopDistributePass LoopVectorizePass LoopUnrollAndJamPass LoopUnrollPass MachinePipeliner This patch should therefore not change the assembly output, assuming that all explicit transformations can be applied, and no implicit transformations in-between. In the former case, WarnMissedTransformationsPass should emit a warning (except for MachinePipeliner which is not implemented yet). The latter could be avoided by adding 'llvm.loop.disable_nonforced' attributes. Because LoopUnrollAndJamPass processes a loop nest, generation of the MDNode is delayed to after the inner loop metadata have been processed. A temporary LoopID is therefore used to annotate instructions and RAUW'ed by the actual LoopID later. Differential Revision: https://reviews.llvm.org/D57978 llvm-svn: 357415
* [gnustep-objc] Make the GNUstep v2 ABI work for Windows DLLs.David Chisnall2019-03-311-29/+134
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Based on a patch by Dustin Howett, modified to not change the ABI for ELF platforms. Use more Windows-like section names. This also makes things more readable by PE/COFF debug tools that assume sections fit in the first header. With these changes in, it is now possible to build a working WinObjC with clang and the WinObjC version of GNUstep libobjc (upstream GNUstep libobjc + a work around for incremental linking, which can be removed once LINK.EXE gains a feature to opt sections out of receiving extra padding during an incremental link). Patch by Dustin Howett! Reviewers: DHowett-MSFT Subscribers: cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D58724 llvm-svn: 357364
OpenPOWER on IntegriCloud