summaryrefslogtreecommitdiffstats
path: root/clang/lib/CodeGen
Commit message (Collapse)AuthorAgeFilesLines
...
* [SystemZ] Use FNeg in s390x clang builtinsKevin P. Neal2020-01-021-9/+5
| | | | The s390x builtins are still using FSub instead of FNeg. Correct that.
* Generalize the pass registration mechanism used by Polly to any third-party toolserge_sans_paille2020-01-022-0/+9
| | | | | | | | | | | | | | | | | | | | | | | | There's quite a lot of references to Polly in the LLVM CMake codebase. However the registration pattern used by Polly could be useful to other external projects: thanks to that mechanism it would be possible to develop LLVM extension without touching the LLVM code base. This patch has two effects: 1. Remove all code specific to Polly in the llvm/clang codebase, replaicing it with a generic mechanism 2. Provide a generic mechanism to register compiler extensions. A compiler extension is similar to a pass plugin, with the notable difference that the compiler extension can be configured to be built dynamically (like plugins) or statically (like regular passes). As a result, people willing to add extra passes to clang/opt can do it using a separate code repo, but still have their pass be linked in clang/opt as built-in passes. Differential Revision: https://reviews.llvm.org/D61446
* [NFC] Fixes -Wrange-loop-analysis warningsMark de Wever2020-01-012-2/+2
| | | | | | This avoids new warnings due to D68912 adds -Wrange-loop-analysis to -Wall. Differential Revision: https://reviews.llvm.org/D71857
* [MC][TargetMachine] Delete MCTargetOptions::MCPIECopyRelocationsFangrui Song2020-01-011-1/+0
| | | | | | | | | | | | clang/lib/CodeGen/CodeGenModule performs the -mpie-copy-relocations check and sets dso_local on applicable global variables. We don't need to duplicate the work in TargetMachine shouldAssumeDSOLocal. Verified that -mpie-copy-relocations can still emit PC relative relocations for external variable accesses. clang -target x86_64 -fpie -mpie-copy-relocations -c => R_X86_64_PC32 clang -target aarch64 -fpie -mpie-copy-relocations -c => R_AARCH64_ADR_PREL_PG_HI21+R_AARCH64_LDST64_ABS_LO12_NC
* [OPENMP]Emit artificial threprivate vars as threadlocal, if possible.Alexey Bataev2019-12-311-2/+7
| | | | It may improve performance for declare reduction constructs.
* [CodeGen] Emit conj/conjf/confjl libcalls as fneg instructions if possible.Craig Topper2019-12-311-1/+4
| | | | | | | We already recognize the __builtin versions of these, might as well recognize the libcall version. Differential Revision: https://reviews.llvm.org/D72028
* [CodeGen] Use IRBuilder::CreateFNeg for __builtin_conjCraig Topper2019-12-301-6/+1
| | | | | | This replaces the fsub -0.0 idiom with an fneg instruction. We didn't see to have a test that showed the current codegen. Just some tests for constant folding and a test that was only checking the declare lines for libcalls. The latter just checked that we did not have a declare for @conj when using __builtin_conj. Differential Revision: https://reviews.llvm.org/D72012
* [CodeGen] Use CreateFNeg in buildFMulAddCraig Topper2019-12-301-11/+4
| | | | | | We have an fneg instruction now and should use it instead of the fsub -0.0 idiom. Looks like we had no test that showed that we handled the negation cases here so I've added new tests. Differential Revision: https://reviews.llvm.org/D72010
* [OpenMP] Use the OpenMPIRBuilder for `omp parallel`Johannes Doerfert2019-12-301-0/+95
| | | | | | | | | | | | This allows to use the OpenMPIRBuilder for parallel regions. Code was extracted from D61953 and adapted to work with the new version (D70109). All but one feature should be supported. An update of this patch will provide test coverage and privatization other than shared. Reviewed By: fghanim Differential Revision: https://reviews.llvm.org/D70290
* [OpenMP][NFCI] Use the libFrontend ProcBindKind in ClangJohannes Doerfert2019-12-264-30/+9
| | | | | | | | | | | | This removes the OpenMPProcBindClauseKind enum in favor of llvm::omp::ProcBindKind which lives in OpenMPConstants.h and was introduced in D70109. No change in behavior is expected. Reviewed By: JonChesterfield Differential Revision: https://reviews.llvm.org/D70289
* [OpenMP][IR-Builder] Introduce the finalization stackJohannes Doerfert2019-12-251-15/+51
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | As a permanent and generic solution to the problem of variable finalization (destructors, lastprivate, ...), this patch introduces the finalization stack. The objects on the stack describe (1) the (structured) regions the OpenMP-IR-Builder is currently constructing, (2) if these are cancellable, and (3) the callback that will perform the finalization (=cleanup) when necessary. As the finalization can be necessary multiple times, at different source locations, the callback takes the position at which code is currently generated. This position will also encode the destination of the "region exit" block *iff* the finalization call was issues for a region generated by the OpenMPIRBuilder. For regions generated through the old Clang OpenMP code geneneration, the "region exit" is determined by Clang inside the finalization call instead (see getOMPCancelDestination). As a first user, the parallel + cancel barrier interaction is changed. In contrast to the temporary solution before, the barrier generation in Clang does not need to be aware of the "CancelDestination" block. Instead, the finalization callback is and, as described above, later even that one does not need to be. D70109 will be updated to use this scheme. Reviewed By: ABataev Differential Revision: https://reviews.llvm.org/D70258
* [NFC] Remove some dead code from CGBuiltin.cpp.Kevin P. Neal2019-12-241-14/+0
|
* [Sema][X86] Consider target attribute into the checks in validateOutputSize ↵Craig Topper2019-12-233-64/+6
| | | | | | | | | | | | | and validateInputSize. The validateOutputSize and validateInputSize need to check whether AVX or AVX512 are enabled. But this can be affected by the target attribute so we need to factor that in. This patch moves some of the code from CodeGen to create an appropriate feature map that we can pass to the function. Differential Revision: https://reviews.llvm.org/D68627
* [OPENMP50]Codegen for nontemporal clause.Alexey Bataev2019-12-234-7/+91
| | | | | | | | | | | | | | | | Summary: Basic codegen for the declarations marked as nontemporal. Also, if the base declaration in the member expression is marked as nontemporal, lvalue for member decl access inherits nonteporal flag from the base lvalue. Reviewers: rjmccall, hfinkel, jdoerfert Subscribers: guansong, arphaman, caomhin, kkwli0, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D71708
* [ItaniumCXXABI] Don't mark an extern_weak init function as dso_local on windowsMartin Storsjö2019-12-231-1/+3
| | | | | | | | | Since 6bf108d77a3c, we try to not mark extern_weak symbols as dso_local, to allow using COFF stubs for references to those symbols (as the symbol may be missing, resolving to an absolute address zero, outside of the current DSO). Differential Revision: https://reviews.llvm.org/D71716
* reland "[DebugInfo] Support to emit debugInfo for extern variables"Yonghong Song2019-12-226-2/+52
| | | | | | | | | | | | | Commit d77ae1552fc21a9f3877f3ed7e13d631f517c825 ("[DebugInfo] Support to emit debugInfo for extern variables") added deebugInfo for extern variables for BPF target. The commit is reverted by 891e25b02d760d0de18c7d46947913b3166047e7 as the committed tests using %clang instead of %clang_cc1 causing test failed in certain scenarios as reported by Reid Kleckner. This patch fixed the tests by using %clang_cc1. Differential Revision: https://reviews.llvm.org/D71818
* Revert "[DebugInfo] Support to emit debugInfo for extern variables"Reid Kleckner2019-12-226-52/+2
| | | | | | | This reverts commit d77ae1552fc21a9f3877f3ed7e13d631f517c825. The tests committed along with this change do not pass, and should be changed to use %clang_cc1.
* [ms] [X86] Use "P" modifier on operands to call instructions in inline X86 ↵Eric Astor2019-12-221-0/+4
| | | | | | | | | | | | | | | | | | | | assembly. Summary: This is documented as the appropriate template modifier for call operands. Fixes PR44272, and adds a regression test. Also adds support for operand modifiers in Intel-style inline assembly. Reviewers: rnk Reviewed By: rnk Subscribers: merge_guards_bot, hiraditya, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D71677
* [Driver] Verify -mrecord-mcount in Driver, instead of CodeGen after D71627Fangrui Song2019-12-211-8/+0
| | | | | | | | | | | GCC's x86 and s390 ports support -mrecord-mcount. Other ports reject the option. aarch64-linux-gnu-gcc: error: unrecognized command line option ‘-mrecord-mcount’ Allowing this option can cause failures when building Linux kernel for aarch64, powerpc64, etc, which will think the feature is available if the clang command returns 0.
* [objc_direct] Tigthen checks for direct methodsPierre Habouzit2019-12-201-9/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Because the name of a direct method must be agreed upon by the caller and the implementation, certain bad practices that one can get away with when using dynamism are fatal with direct methods. To avoid really weird and unscruttable linker error, tighten the front-end error reporting. Rule 1: Direct methods can only have at most one declaration in an @interface container. Any redeclaration is strictly forbidden. Today some amount of redeclaration is tolerated between the main interface and categories for dynamic methods, but we can't have that. Rule 2: Direct method implementations can only be declared in a matching @interface container: when implemented in the primary @implementation then the declaration must be in the primary @interface or an extension, and when implemented in a category, the declaration must be in the @interface for the same category. Also fix another issue with ObjCMethod::getCanonicalDecl(): when an implementation lives in the primary @interface, then its canonical declaration can be in any extension, even when it's not an accessor. Add Sema tests to cover the new errors, and CG tests to beef up testing around function names for categories and extensions. Radar-Id: <rdar://problem/58054563> Differential Revision: https://reviews.llvm.org/D71694
* ConstrainedFP: use API compatible with opaque pointers.Tim Northover2019-12-191-6/+6
| | | | | | This just updates an IRBuilder interface to take Functions instead of Values so the type can be derived, and fixes some callsites in Clang to call the updated API.
* [Clang FE, SystemZ] Recognize -mrecord-mcount CL option.Jonas Paulsson2019-12-191-0/+11
| | | | | | | | | | Recognize -mrecord-mcount from the command line and add a function attribute "mrecord-mcount" when passed. Only valid on SystemZ (when used with -mfentry). Review: Ulrich Weigand https://reviews.llvm.org/D71627
* [WebAssembly] Add avgr_u intrinsics and require nuw in patternsThomas Lively2019-12-181-0/+8
| | | | | | | | | | | | | | | | | | Summary: The vector pattern `(a + b + 1) / 2` was previously selected to an avgr_u instruction regardless of nuw flags, but this is incorrect in the case where either addition may have an unsigned wrap. This CL changes the existing pattern to require both adds to have nuw flags and adds builtin functions and intrinsics for the avgr_u instructions because the corrected pattern is not representable in C. Reviewers: aheejin Subscribers: dschuff, sbc100, jgravelle-google, hiraditya, sunfish, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D71648
* Refactor CompareReferenceRelationship and its callers in preparation forRichard Smith2019-12-181-19/+34
| | | | | | | | | | | | | | implementing the resolution of CWG2352. No functionality change, except that we now convert the referent of a reference binding to the underlying type of the reference in more cases; we used to happen to preserve the type sugar from the referent if the only type change was in the cv-qualifiers. This exposed a bug in how we generate code for trivial assignment operators: if the type sugar (particularly the may_alias attribute) got lost during reference binding, we'd use the "wrong" TBAA information for the load during the assignment.
* Use hasOffsetApplied to initialize member HasOffsetAppliedAkira Hatanaka2019-12-181-1/+1
| | | | | This is NFC since none of the constructor calls in trunk pass hasOffsetApplied=true.
* [Clang FE, SystemZ] Don't add "true" value for the "mnop-mcount" attribute.Jonas Paulsson2019-12-181-1/+1
| | | | | | | | Let the "mnop-mcount" function attribute simply be present or non-present. Update SystemZ backend as well to use hasFnAttribute() instead. Review: Ulrich Weigand https://reviews.llvm.org/D71669
* [OPENMP50]Add parsing/sema analysis for nontemporal clause.Alexey Bataev2019-12-171-0/+1
| | | | | Add basic support for parsing/sema analysis of the nontemporal clause in simd-based directives.
* [Clang FE, SystemZ] Recognize -mpacked-stack CL optionJonas Paulsson2019-12-171-0/+8
| | | | | | | | | | | Recognize -mpacked-stack from the command line and add a function attribute "mpacked-stack" when passed. This is needed for building the Linux kernel. If this option is passed for any other target than SystemZ, an error is generated. Review: Ulrich Weigand https://reviews.llvm.org/D71441
* [ObjC][DWARF] Emit DW_AT_APPLE_objc_direct for methods marked as ↵Raphael Isemann2019-12-171-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | __attribute__((objc_direct)) Summary: With DWARF5 it is no longer possible to distinguish normal methods and methods with `__attribute__((objc_direct))` by just looking at the debug information as they are both now children of the of the DW_TAG_structure_type that defines them (before only the `__attribute__((objc_direct))` methods were children). This means that in LLDB we are no longer able to create a correct Clang AST of a module by just looking at the debug information. Instead we would need to call the Objective-C runtime to see which of the methods have a `__attribute__((objc_direct))` and then add the attribute to our own Clang AST depending on what the runtime returns. This would mean that we either let the module AST be dependent on the Objective-C runtime (which doesn't seem right) or we retroactively add the missing attribute to the imported AST in our expressions. A third option is to annotate methods with `__attribute__((objc_direct))` as `DW_AT_APPLE_objc_direct` which is what this patch implements. This way LLDB doesn't have to call the runtime for any `__attribute__((objc_direct))` method and the AST in our module will already be correct when we create it. Reviewers: aprantl, SouraVX Reviewed By: aprantl Subscribers: hiraditya, cfe-commits, llvm-commits Tags: #clang, #llvm, #debug-info Differential Revision: https://reviews.llvm.org/D71201
* [c++20] P1959R0: Remove support for std::*_equality.Richard Smith2019-12-161-8/+0
|
* [WebAssembly] Replace SIMD int min/max builtins with patternsThomas Lively2019-12-161-43/+1
| | | | | | | | | | | | | | | | | | Summary: The instructions were originally implemented via builtins and intrinsics so users would have to explicitly opt-in to using them. This was useful while were validating whether these instructions should have been merged into the spec proposal. Now that they have been, we can use normal codegen patterns, so the intrinsics and builtins are no longer useful. Reviewers: aheejin Subscribers: dschuff, sbc100, jgravelle-google, hiraditya, sunfish, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D71500
* [TLI] Support for per-Function TLI that overrides available libfuncsTeresa Johnson2019-12-161-9/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Follow-on to D66428 and D71193, to build the TLI per-function so that -fno-builtin* handling can be migrated to use function attributes. See discussion on D61634 for background. This is an enabler for fixing handling of these options for LTO, for example. With D71193, the -fno-builtin* flags are converted to function attributes, so we can now set this information per-function on the TLI. In this patch, the TLI constructor is changed to take a Function, which can be used to override the available builtins. The TLI is augmented with an array that can be used to specify which builtins are not available for the corresponding function. The available function checks are changed to consult this override before checking the underlying module level baseline TLII. New code is added to set this override array based on the attributes. I also removed the code that sets availability in the TLII in clang from the options, which is no longer needed. I removed a per-Triple caching of TLII objects in the analysis object, as it is based on the Module's Triple which is the same for all functions in any case. Is there a case where we would be compiling multiple Modules with different Triples in one compilation? Finally, I have changed the legacy analysis wrapper to create and use the new PM analysis class (TargetLibraryAnalysis) in getTLI. This is consistent with the behavior of getTTI for the legacy TargetTransformInfo analysis. This change means that getTLI now creates a new TLI on each call (although that should be very cheap as we cache the module level TLII, and computing the per-function attribute based availability should also be reasonably efficient). I measured the compile time for a large C++ file with tens of thousands of functions and as expected there was no increase. Reviewers: chandlerc, hfinkel, gchatelet Subscribers: mehdi_amini, dexonsmith, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67923
* Reland [DataLayout] Fix occurrences that size and range of pointers are ↵Nicola Zaghen2019-12-131-2/+2
| | | | | | | | | | | | | | assumed to be the same. GEP index size can be specified in the DataLayout, introduced in D42123. However, there were still places in which getIndexSizeInBits was used interchangeably with getPointerSizeInBits. This notably caused issues with Instcombine's visitPtrToInt; but the unit tests was incorrect, so this remained undiscovered. This fixes the buildbot failures. Differential Revision: https://reviews.llvm.org/D68328 Patch by Joseph Faulls!
* [OPENMP50]Fix possible conflict when emitting an alias for the functionsAlexey Bataev2019-12-121-1/+1
| | | | | | | in declare variant. If the types of the fnction are not equal, but match, at the codegen thei may have different types. This may lead to compiler crash.
* [LTO] Support for embedding bitcode section during LTOTeresa Johnson2019-12-121-119/+4
| | | | | | | | | | | | | | | | | | | Summary: This adds support for embedding bitcode in a binary during LTO. The libLTO gains supports the `-lto-embed-bitcode` flag. The option allows users of the LTO library to embed a bitcode section. For example, LLD can pass the option via `ld.lld -mllvm=-lto-embed-bitcode`. This feature allows doing something comparable to `clang -c -fembed-bitcode`, but on the (LTO) linker level. Having bitcode alongside native code has many use-cases. To give an example, the MacOS linker can create a `-bitcode_bundle` section containing bitcode. Also, having this feature built into LLVM is an alternative to 3rd party tools such as [[ https://github.com/travitch/whole-program-llvm | wllvm ]] or [[ https://github.com/SRI-CSL/gllvm | gllvm ]]. As with these tools, this feature simplifies creating "whole-program" llvm bitcode files, but in contrast to wllvm/gllvm it does not rely on a specific llvm frontend/driver. Patch by Josef Eisl <josef.eisl@oracle.com> Reviewers: #llvm, #clang, rsmith, pcc, alexshap, tejohnson Reviewed By: tejohnson Subscribers: tejohnson, mehdi_amini, inglorion, hiraditya, aheejin, steven_wu, dexonsmith, dang, cfe-commits, llvm-commits, #llvm, #clang Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D68213
* [clang] Turn -fno-builtin flag into an IR AttributeGuillaume Chatelet2019-12-121-11/+14
| | | | | | | | | | | | | | Summary: This is a follow up on https://reviews.llvm.org/D61634#1742154 to turn the clang driver -fno-builtin flag into an IR attribute. I also investigated pushing the attribute earlier on (in Sema) but it looks like this patch is simple and will cover all function calls. Reviewers: aaron.ballman, courbet Subscribers: cfe-commits, tejohnson Tags: #clang Differential Revision: https://reviews.llvm.org/D71193
* [Alignment][NFC] Adding Align compatible methods to IntrinsicInst/IRBuilderGuillaume Chatelet2019-12-121-9/+9
| | | | | | | | | | | | | | | Summary: This is patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: hiraditya, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D71420
* Temporarily Revert "[DataLayout] Fix occurrences that size and range of ↵Nicola Zaghen2019-12-121-2/+2
| | | | | | | | | pointers are assumed to be the same." This reverts commit 5f6208778ff92567c57d7c1e2e740c284d7e69a5. This caused failures in Transforms/PhaseOrdering/scev-custom-dl.ll const: Assertion `getBitWidth() == CR.getBitWidth() && "ConstantRange types don't agree!"' failed.
* [DataLayout] Fix occurrences that size and range of pointers are assumed to ↵Nicola Zaghen2019-12-121-2/+2
| | | | | | | | | | | | be the same. GEP index size can be specified in the DataLayout, introduced in D42123. However, there were still places in which getIndexSizeInBits was used interchangeably with getPointerSizeInBits. This notably caused issues with Instcombine's visitPtrToInt; but the unit tests was incorrect, so this remained undiscovered. Differential Revision: https://reviews.llvm.org/D68328 Patch by Joseph Faulls!
* [IR] Split out target specific intrinsic enums into separate headersReid Kleckner2019-12-114-2/+16
| | | | | | | | | | | | | | | | | | | | This has two main effects: - Optimizes debug info size by saving 221.86 MB of obj file size in a Windows optimized+debug build of 'all'. This is 3.03% of 7,332.7MB of object file size. - Incremental step towards decoupling target intrinsics. The enums are still compact, so adding and removing a single target-specific intrinsic will trigger a rebuild of all of LLVM. Assigning distinct target id spaces is potential future work. Part of PR34259 Reviewers: efriedma, echristo, MaskRay Reviewed By: echristo, MaskRay Differential Revision: https://reviews.llvm.org/D71320
* [OpenMP] Use the OpenMP-IR-BuilderJohannes Doerfert2019-12-114-2/+40
| | | | | | | | | | | | | | | This is a follow up patch to use the OpenMP-IR-Builder, as discussed on the mailing list ([1] and later) and at the US Dev Meeting'19. [1] http://lists.flang-compiler.org/pipermail/flang-dev_lists.flang-compiler.org/2019-May/000197.html Reviewers: kiranchandramohan, ABataev, RaviNarayanaswamy, gtbercea, grokos, sdmitriev, JonChesterfield, hfinkel, fghanim Subscribers: ppenzin, penzn, llvm-commits, cfe-commits, jfb, guansong, bollu, hiraditya, mgorny Tags: #clang Differential Revision: https://reviews.llvm.org/D69922
* Fix detection of __attribute__((may_alias)) to properly look throughRichard Smith2019-12-111-8/+9
| | | | | | | type sugar. We previously missed the attribute in a lot of cases in C++, because there's often other type sugar there (eg, ElaboratedType).
* [OPENMP50]Add if clause in teams distribute parallel for simd directive.Alexey Bataev2019-12-111-1/+2
| | | | | | According to OpenMP 5.0, if clause can be used in for simd directive. If condition in the if clause if false, the non-vectorized version of the loop must be executed.
* [WebAssembly] Add new `export_name` clang attribute for controlling wasm ↵Sam Clegg2019-12-111-0/+6
| | | | | | | | | | | | | | | | | | | | export names This is equivalent to the existing `import_name` and `import_module` attributes which control the import names in the final wasm binary produced by lld. This maps the existing This attribute currently requires a string rather than using the symbol name for a couple of reasons: 1. Avoid confusion with static and dynamic linking which is based on symbol name. Exporting a function from a wasm module using this directive is orthogonal to both static and dynamic linking. 2. Avoids name mangling. Differential Revision: https://reviews.llvm.org/D70520
* [Support] Add TimeTraceScope constructor without detail argRussell Gallop2019-12-112-5/+5
| | | | | | | This simplifies code where no extra details are required Also don't write out detail when it is empty. Differential Revision: https://reviews.llvm.org/D71347
* CodeGen: Allow annotations on globals in non-zero address spaceNicolai Hähnle2019-12-111-1/+7
| | | | | | | | | | | | | | | | | | | | | | Summary: Attribute annotations are recorded in a special global composite variable that points to annotation strings and the annotated objects. As a restriction of the LLVM IR type system, those pointers are all pointers to address space 0, so let's insert an addrspacecast when the annotated global is in a non-0 address space. Since this addrspacecast is only reachable from the global annotations object, this should allow us to represent annotations on all globals regardless of which addrspacecasts are usually legal for the target. Reviewers: rjmccall Subscribers: cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D71208
* [Clang] Pragma vectorize_width() implies vectorize(enable)Sjoerd Meijer2019-12-111-10/+11
| | | | | | | | | | | | | | | | | | | | | | Let's try this again; this has been reverted/recommited a few times. Last time this got reverted because for this loop: void a() { #pragma clang loop vectorize(disable) for (;;) ; } vectorisation was incorrectly enabled and the vectorize.enable metadata was set due to a logic error. But with this fixed, we now imply vectorisation when: 1) vectorisation is enabled, which means: VectorizeWidth > 1, 2) and don't want to add it when it is disabled or enabled, otherwise we would be incorrectly setting it or duplicating the metadata, respectively. This should fix PR27643. Differential Revision: https://reviews.llvm.org/D69628
* [ARM][MVE] Add intrinsics for immediate shifts. (reland)Simon Tatham2019-12-111-0/+30
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This adds the family of `vshlq_n` and `vshrq_n` ACLE intrinsics, which shift every lane of a vector left or right by a compile-time immediate. They mostly work by expanding to the IR `shl`, `lshr` and `ashr` operations, with their second operand being a vector splat of the immediate. There's a fiddly special case, though. ACLE specifies that the immediate in `vshrq_n` can take values up to //and including// the bit size of the vector lane. But LLVM IR thinks that shifting right by the full size of the lane is UB, and feels free to replace the `lshr` with an `undef` half way through the optimization pipeline. Hence, to keep this legal in source code, I have to detect it at codegen time. Logical (unsigned) right shifts by the element size are handled by simply emitting the zero vector; arithmetic ones are converted into a shift of one bit less, which will always give the same output. In order to do that check, I also had to enhance the tablegen MveEmitter so that it can cope with converting a builtin function's operand into a bare integer to pass to a code-generating subfunction. Previously the only bare integers it knew how to handle were flags generated from within `arm_mve.td`. Reviewers: dmgreen, miyuki, MarkMurrayARM, ostannard Reviewed By: dmgreen, MarkMurrayARM Subscribers: echristo, hokein, rdhindsa, kristof.beyls, hiraditya, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D71065
* NFC: Get rid of an unused parameter to CGObjCMac::EmitSelectorAddr.Erik Pilkington2019-12-101-12/+10
|
* Fix bug 44190 - wrong code with #pragma pack(1)Yaxun (Sam) Liu2019-12-101-3/+2
| | | | | | | | | | | https://github.com/llvm/llvm-project/commit/5b330e8d6122c336d81dfd11c864e6c6240a381e caused a regression on s390: https://bugs.llvm.org/show_bug.cgi?id=44190 we need to copy if if either the argument is non-byval or the argument is underaligned. Differential Revision: https://reviews.llvm.org/D71282
OpenPOWER on IntegriCloud