summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Target
Commit message (Collapse)AuthorAgeFilesLines
...
* [Mips] Add support for min/max/umin/umax atomicsMirko Brkusanin2019-12-126-28/+385
| | | | | | | | In order to properly implement these atomic we need one register more than other binary atomics. It is used for storing result from comparing values in addition to the one that is used for actual result of operation. https://reviews.llvm.org/D71028
* [AArch64][SVE] Remove nxv1f32 and nxv1f64 as legal typesCullen Rhodes2019-12-123-31/+13
| | | | | | | | | | | | | | | Summary: Also cleans up ZPR register class definition. Reviewers: sdesmalen, cameron.mcinally, efriedma Reviewed By: efriedma Subscribers: tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71351
* Revert "[ARM][MVE] Sink vector shift operand"Sam Parker2019-12-122-28/+3
| | | | | | This reverts commit e0b966643fc2030442ffbae9b677247be697673b. Instruction selection is failing with expensive checks.
* [ARM][MVE] Sink vector shift operandSam Parker2019-12-122-3/+28
| | | | | | | | The shift amount operand can be provided in a general purpose register so sink it. Flip the vdup and negate so the existing patterns can be used for matching. Differential Revision: https://reviews.llvm.org/D70841
* [AArch64][SVE] Add patterns for scalable vselectCameron McInally2019-12-112-2/+12
| | | | | | This patch matches scalable vector selects to predicated move instructions. Differential Revision: https://reviews.llvm.org/D71298
* [IR] Split out target specific intrinsic enums into separate headersReid Kleckner2019-12-1146-19/+68
| | | | | | | | | | | | | | | | | | | | This has two main effects: - Optimizes debug info size by saving 221.86 MB of obj file size in a Windows optimized+debug build of 'all'. This is 3.03% of 7,332.7MB of object file size. - Incremental step towards decoupling target intrinsics. The enums are still compact, so adding and removing a single target-specific intrinsic will trigger a rebuild of all of LLVM. Assigning distinct target id spaces is potential future work. Part of PR34259 Reviewers: efriedma, echristo, MaskRay Reviewed By: echristo, MaskRay Differential Revision: https://reviews.llvm.org/D71320
* Rename TTI::getIntImmCost for instructions and intrinsicsReid Kleckner2019-12-1113-39/+41
| | | | | | | | | | | | | | Soon Intrinsic::ID will be a plain integer, so this overload will not be possible. Rename both overloads to ensure that downstream targets observe this as a build failure instead of a runtime failure. Split off from D71320 Reviewers: efriedma Differential Revision: https://reviews.llvm.org/D71381
* Revert "[SDAG] remove use restriction in isNegatibleForFree() when called ↵Sanjay Patel2019-12-112-7/+4
| | | | | | | from getNegatedExpression()" This reverts commit d1f0bdf2d2df9bdf11ee2ddfff3df50e53f2f042. The patch can cause infinite loops in DAGCombiner.
* [WebAssembly] Add new `export_name` clang attribute for controlling wasm ↵Sam Clegg2019-12-114-1/+36
| | | | | | | | | | | | | | | | | | | | export names This is equivalent to the existing `import_name` and `import_module` attributes which control the import names in the final wasm binary produced by lld. This maps the existing This attribute currently requires a string rather than using the symbol name for a couple of reasons: 1. Avoid confusion with static and dynamic linking which is based on symbol name. Exporting a function from a wasm module using this directive is orthogonal to both static and dynamic linking. 2. Avoids name mangling. Differential Revision: https://reviews.llvm.org/D70520
* Fix -Wunused-variable in -DLLVM_ENABLE_ASSERTIONS=Off builds after D65958 ↵Fangrui Song2019-12-111-0/+2
| | | | and D70450
* Add intrinsics for unary narrowing operationsAndrzej Warzynski2019-12-112-8/+18
| | | | | | | | | | | | | | | | | | | | | Summary: The following intrinsics for unary narrowing operations are added: * @llvm.aarch64.sve.sqxtnb * @llvm.aarch64.sve.uqxtnb * @llvm.aarch64.sve.sqxtunb * @llvm.aarch64.sve.sqxtnt * @llvm.aarch64.sve.uqxtnt * @llvm.aarch64.sve.sqxtunt Reviewers: sdesmalen, rengolin, efriedma Reviewed By: efriedma Subscribers: tschuett, kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71270
* [AArch64] Be more careful to skip debug operands in LdSt Optimizier.Florian Hahn2019-12-111-7/+10
| | | | This fixes crashes with $noreg operands.
* [SDAG] remove use restriction in isNegatibleForFree() when called from ↵Sanjay Patel2019-12-112-4/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | getNegatedExpression() This is an alternate fix for the bug discussed in D70595. This also includes minimal tests for other in-tree targets to show the problem more generally. We check the number of uses as a predicate for whether some value is free to negate, but that use count can change as we rewrite the expression in getNegatedExpression(). So something that was marked free to negate during the cost evaluation phase becomes not free to negate during the rewrite phase (or the inverse - something that was not free becomes free). This can lead to a crash/assert because we expect that everything in an expression that is negatible to be handled in the corresponding code within getNegatedExpression(). This patch skips the use check during the rewrite phase. So we determine that some expression isNegatibleForFree (identically to without this patch), but during the rewrite, don't rely on use counts to decide how to create the optimal expression. Differential Revision: https://reviews.llvm.org/D70975
* [AArch64] Skip debug ops with regsOverlap in AArch64 LD/ST opt.Florian Hahn2019-12-111-1/+1
| | | | This fixes a crash when debug instructions are in between 2 stores.
* [X86] Erase dead LEA instruction after converting it to MOV in ↵Craig Topper2019-12-111-0/+3
| | | | FixupLEAPass::processInstrForSlow3OpLEA.
* [SystemZ] Fix 128-bit strict FMA expansion pre-z14Ulrich Weigand2019-12-111-6/+4
| | | | | | | | | | | | | Before z14, we did not have any FMA instruction for 128-bit floating-point, so the @llvm.fma.f128 intrinsic needs to be expanded to a libcall on those platforms. This worked correctly for regular FMA, but was implemented incorrectly for the strict version. This was not noticed because we did not have test coverage for this case. This patch fixes that incorrect expansion and adds the missing test cases.
* Revert "[AArch64][SVE] Implement intrinsics for non-temporal loads & stores"Kerry McLaughlin2019-12-113-98/+2
| | | | | | | | This reverts commit 3f5bf35f868d1e33cd02a5825d33ed4675be8cb1 as it was causing build failures in llvm-clang-x86_64-expensive-checks: http://lab.llvm.org:8011/builders/llvm-clang-x86_64-expensive-checks-debian/builds/392 http://lab.llvm.org:8011/builders/llvm-clang-x86_64-expensive-checks-ubuntu/builds/1045
* [AArch64] Teach Load/Store optimizier to rename store operands for pairing.Florian Hahn2019-12-111-8/+322
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In some cases, we can rename a store operand, in order to enable pairing of stores. For store pairs, that cannot be merged because the first tored register is defined in between the second store, we try to find suitable rename register. First, we check if we can rename the given register: 1. The first store register must be killed at the store, which means we do not have to rename instructions after the first store. 2. We scan backwards from the first store, to find the definition of the stored register and check all uses in between are renamable. Along they way, we collect the minimal register classes of the uses for overlapping (sub/super)registers. Second, we try to find an available register from the minimal physical register class of the original register. A suitable register must not be 1. defined before FirstMI 2. between the previous definition of the register to rename 3. a callee saved register. We use KILL flags to clear defined registers while scanning from the beginning to the end of the block. This triggers quite often, here are the top changes for MultiSource, SPEC2000, SPEC2006 compiled with -O3 for iOS: Metric: aarch64-ldst-opt.NumPairCreated Program base patch diff test-suite...nch/fourinarow/fourinarow.test 2.00 39.00 1850.0% test-suite...s/ASC_Sequoia/IRSmk/IRSmk.test 46.00 80.00 73.9% test-suite...chmarks/Olden/power/power.test 70.00 96.00 37.1% test-suite...cations/hexxagon/hexxagon.test 29.00 39.00 34.5% test-suite...nchmarks/McCat/05-eks/eks.test 100.00 132.00 32.0% test-suite.../Trimaran/enc-rc4/enc-rc4.test 46.00 59.00 28.3% test-suite...T2006/473.astar/473.astar.test 160.00 200.00 25.0% test-suite.../Trimaran/enc-md5/enc-md5.test 8.00 10.00 25.0% test-suite...telecomm-gsm/telecomm-gsm.test 113.00 139.00 23.0% test-suite...ediabench/gsm/toast/toast.test 113.00 139.00 23.0% test-suite...Source/Benchmarks/sim/sim.test 91.00 111.00 22.0% test-suite...C/CFP2000/179.art/179.art.test 41.00 49.00 19.5% test-suite...peg2/mpeg2dec/mpeg2decode.test 245.00 279.00 13.9% test-suite...marks/Olden/health/health.test 16.00 18.00 12.5% test-suite...ks/Prolangs-C/cdecl/cdecl.test 90.00 101.00 12.2% test-suite...fice-ispell/office-ispell.test 91.00 100.00 9.9% test-suite...oxyApps-C/miniGMG/miniGMG.test 430.00 465.00 8.1% test-suite...lowfish/security-blowfish.test 39.00 42.00 7.7% test-suite.../Applications/spiff/spiff.test 42.00 45.00 7.1% test-suite...arks/mafft/pairlocalalign.test 2473.00 2646.00 7.0% test-suite.../VersaBench/ecbdes/ecbdes.test 29.00 31.00 6.9% test-suite...nch/beamformer/beamformer.test 220.00 235.00 6.8% test-suite...CFP2000/177.mesa/177.mesa.test 2110.00 2252.00 6.7% test-suite...ve-susan/automotive-susan.test 109.00 116.00 6.4% test-suite...s-C/unix-smail/unix-smail.test 65.00 69.00 6.2% test-suite...CI_Purple/SMG2000/smg2000.test 1194.00 1265.00 5.9% test-suite.../Benchmarks/nbench/nbench.test 472.00 500.00 5.9% test-suite...oxyApps-C/miniAMR/miniAMR.test 248.00 262.00 5.6% test-suite...quoia/CrystalMk/CrystalMk.test 18.00 19.00 5.6% test-suite...rks/tramp3d-v4/tramp3d-v4.test 7331.00 7710.00 5.2% test-suite.../Benchmarks/Bullet/bullet.test 5651.00 5938.00 5.1% test-suite...ternal/HMMER/hmmcalibrate.test 750.00 788.00 5.1% test-suite...T2006/456.hmmer/456.hmmer.test 764.00 802.00 5.0% test-suite...ications/JM/ldecod/ldecod.test 1028.00 1079.00 5.0% test-suite...CFP2006/444.namd/444.namd.test 1368.00 1434.00 4.8% test-suite...marks/7zip/7zip-benchmark.test 4471.00 4685.00 4.8% test-suite...6/464.h264ref/464.h264ref.test 3122.00 3271.00 4.8% test-suite...pplications/oggenc/oggenc.test 1497.00 1565.00 4.5% test-suite...T2000/300.twolf/300.twolf.test 742.00 774.00 4.3% test-suite.../Prolangs-C/loader/loader.test 24.00 25.00 4.2% test-suite...0.perlbench/400.perlbench.test 1983.00 2058.00 3.8% test-suite...ications/JM/lencod/lencod.test 4612.00 4785.00 3.8% test-suite...yApps-C++/PENNANT/PENNANT.test 995.00 1032.00 3.7% test-suite...arks/VersaBench/dbms/dbms.test 54.00 56.00 3.7% Reviewers: efriedma, thegameg, samparker, dmgreen, paquette, evandro Reviewed By: paquette Differential Revision: https://reviews.llvm.org/D70450
* [AArch64][SVE] Add DAG combine rules for gather loads and sext/zextAndrzej Warzynski2019-12-113-48/+204
| | | | | | | | | | | | | | | | Summary: These changes allow us to support sign-extending gather loads with the exisiting intrinsics (i.e. @llvm.aarch64.sve.ld1.gather.*). Reviewers: sdesmalen, huntergr, kmclaughlin, efriedma, rengolin, rovka, dancgr, mgudim Reviewed By: sdesmalen Subscribers: tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm Differential revision: https://reviews.llvm.org/D70812
* Revert "Reland [AArch64][MachineOutliner] Return address signing for ↵Oliver Stannard2019-12-111-288/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | outlined functions" This reverts commit cec2d5c17457722113580251c8a045fa9aca9b1b. Reverting because this is still creating outlined functions with return address signing instructions with mismatches SP values. For example: int *volatile v; void foo(int x) { int a[x]; v = &a[0]; v = &a[0]; v = &a[0]; v = &a[0]; v = &a[0]; v = &a[0]; } void bar(int x) { int a[x]; v = 0; v = &a[0]; v = &a[0]; v = &a[0]; v = &a[0]; v = &a[0]; } This generates these two outlined functions, both of which modify SP between the paciasp and retaa instructions: $ clang --target=aarch64-arm-none-eabi -march=armv8.3-a -c test2.c -o - -S -Oz -mbranch-protection=pac-ret+leaf ... OUTLINED_FUNCTION_0: // @OUTLINED_FUNCTION_0 .cfi_sections .debug_frame .cfi_startproc // %bb.0: paciasp .cfi_negate_ra_state mov w8, w0 lsl x8, x8, #2 add x8, x8, #15 // =15 mov x9, sp and x8, x8, #0x7fffffff0 sub x8, x9, x8 mov x29, sp mov sp, x8 adrp x9, v retaa ... OUTLINED_FUNCTION_1: // @OUTLINED_FUNCTION_1 .cfi_startproc // %bb.0: paciasp .cfi_negate_ra_state str x8, [x9, :lo12:v] str x8, [x9, :lo12:v] str x8, [x9, :lo12:v] str x8, [x9, :lo12:v] str x8, [x9, :lo12:v] mov sp, x29 retaa
* [AArch64][SVE] Implement intrinsics for non-temporal loads & storesKerry McLaughlin2019-12-113-2/+98
| | | | | | | | | | | | | | | | | | | | Summary: Adds the following intrinsics: - llvm.aarch64.sve.ldnt1 - llvm.aarch64.sve.stnt1 This patch creates masked loads and stores with the MONonTemporal flag set when used with the intrinsics above. Reviewers: sdesmalen, paulwalker-arm, dancgr, mgudim, efriedma, rengolin Reviewed By: efriedma Subscribers: tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, cfe-commits, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71000
* [ARM][LowOverheadLoops] Remove dead loop update instructions.Sjoerd Meijer2019-12-111-2/+73
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | After creating a low-overhead loop, the loop update instruction was still lingering around hurting performance. This removes dead loop update instructions, which in our case are mostly SUBS instructions. To support this, some helper functions were added to MachineLoopUtils and ReachingDefAnalysis to analyse live-ins of loop exit blocks and find uses before a particular loop instruction, respectively. This is a first version that removes a SUBS instruction when there are no other uses inside and outside the loop block, but there are some more interesting cases in test/CodeGen/Thumb2/LowOverheadLoops/mve-tail-data-types.ll which shows that there is room for improvement. For example, we can't handle this case yet: .. dlstp.32 lr, r2 .LBB0_1: mov r3, r2 subs r2, #4 vldrh.u32 q2, [r1], #8 vmov q1, q0 vmla.u32 q0, q2, r0 letp lr, .LBB0_1 @ %bb.2: vctp.32 r3 .. which is a lot more tricky because r2 is not only used by the subs, but also by the mov to r3, which is used outside the low-overhead loop by the vctp instruction, and that requires a bit of a different approach, and I will follow up on this. Differential Revision: https://reviews.llvm.org/D71007
* [ARM][MVE] Add intrinsics for immediate shifts. (reland)Simon Tatham2019-12-111-20/+32
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This adds the family of `vshlq_n` and `vshrq_n` ACLE intrinsics, which shift every lane of a vector left or right by a compile-time immediate. They mostly work by expanding to the IR `shl`, `lshr` and `ashr` operations, with their second operand being a vector splat of the immediate. There's a fiddly special case, though. ACLE specifies that the immediate in `vshrq_n` can take values up to //and including// the bit size of the vector lane. But LLVM IR thinks that shifting right by the full size of the lane is UB, and feels free to replace the `lshr` with an `undef` half way through the optimization pipeline. Hence, to keep this legal in source code, I have to detect it at codegen time. Logical (unsigned) right shifts by the element size are handled by simply emitting the zero vector; arithmetic ones are converted into a shift of one bit less, which will always give the same output. In order to do that check, I also had to enhance the tablegen MveEmitter so that it can cope with converting a builtin function's operand into a bare integer to pass to a code-generating subfunction. Previously the only bare integers it knew how to handle were flags generated from within `arm_mve.td`. Reviewers: dmgreen, miyuki, MarkMurrayARM, ostannard Reviewed By: dmgreen, MarkMurrayARM Subscribers: echristo, hokein, rdhindsa, kristof.beyls, hiraditya, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D71065
* [NFC][PowerPC] Remove the dead conditions in the if(cond)QingShan Zhang2019-12-111-5/+1
|
* [PowerPC] Exploitate the Vector Integer Average InstructionsQingShan Zhang2019-12-111-0/+19
| | | | | | | | | | | PowerPC has instruction to do the semantics of this piece of code: vector int foo(vector int m, vector int n) { return (m + n + 1) >> 1; } This patch is adding the match rule to select it. Differential Revision: https://reviews.llvm.org/D71002
* [X86] Split v64i1 arguments into 2 v32i1s that will be promoted to v32i8 ↵Craig Topper2019-12-101-3/+17
| | | | | | under min-legal-vector-width=256 This is an improvement to 88dacbd43625cf7aad8a01c0c3b92142c4dc0970
* [FPEnv][X86] Constrained FCmp intrinsics enabling on X86Wang, Pengfei2019-12-119-94/+252
| | | | | | | | | | | | Summary: This is a follow up of D69281, it enables the X86 backend support for the FP comparision. Reviewers: uweigand, kpn, craig.topper, RKSimon, cameron.mcinally, andrew.w.kaylor Subscribers: hiraditya, llvm-commits, annita.zhang, LuoYuanke, LiuChen3 Tags: #llvm Differential Revision: https://reviews.llvm.org/D70582
* [X86] Go back to considering v64i1 as a legal type under ↵Craig Topper2019-12-101-47/+32
| | | | | | | | | | min-legal-vector-width=256. Scalarize v64i1 arguments and shuffles under min-legal-vector-width=256. This reverts 3e1aee2ba717529b651a79ed4fc7e7147358043f in favor of a different approach. Scalarizing isn't great codegen, but making the type illegal was interfering with k constraint in inline assembly.
* [BPF] put not-section-attribute externs into BTF ".extern" data sectionYonghong Song2019-12-101-3/+6
| | | | | | | | | Currently for extern variables with section attribute, those BTF_KIND_VARs will not be placed in any DataSec. This is inconvenient as any other generated BTF_KIND_VAR belongs to one DataSec. This patch put these extern variables into ".extern" section so bpf loader can have a consistent processing mechanism for all data sections and variables.
* [RISCV] Improve assembler missing feature warningsSimon Cook2019-12-102-11/+33
| | | | | | | | This adds support for printing improved missing feature error messages from the assembler, which now indicates which feature caused the parse to fail. Differential Revision: https://reviews.llvm.org/D69899
* [ARM][MVE] Refactor complex vector intrinsics [NFCI]Mikhail Maltsev2019-12-102-197/+116
| | | | | | | | | | | | | | | | | | | | | | | Summary: This patch refactors instruction selection of the complex vector addition, multiplication and multiply-add intrinsics, so that it is now based on TableGen patterns rather than C++ code. It also changes the first parameter (halving vs non-halving) of the arm_mve_vcaddq IR intrinsic to match the corresponding instruction encoding, hence it requires some changes in the tests. The patch addresses David's comment in https://reviews.llvm.org/D71190 Reviewers: dmgreen, ostannard, simon_tatham, MarkMurrayARM Reviewed By: dmgreen Subscribers: merge_guards_bot, kristof.beyls, hiraditya, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D71245
* [Alignment][NFC] CreateMemSet use MaybeAlignGuillaume Chatelet2019-12-101-3/+3
| | | | | | | | | | | | | | | Summary: This is patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: arsenm, jvesely, nhaehnle, hiraditya, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D71213
* [AArch64] Fix issues with large arrays on stackKiran Chandramohan2019-12-105-27/+26
| | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This patch fixes a few issues when large arrays are allocated on the stack. Currently, clang has inconsistent behaviour, for debug builds there is an assertion failure when the array size on stack is around 2GB but there is no assertion when the stack is around 8GB. For release builds there is no assertion, the compilation succeeds but generates incorrect code. The incorrect code generated is due to using int/unsigned int instead of their 64-bit counterparts. This patch, 1) Removes the assertion in frame legality check. 2) Converts int/unsigned int in some places to the 64-bit variants. This helps in generating correct code and removes the inconsistent behaviour. 3) Adds a test which runs without optimisations. Reviewers: sdesmalen, efriedma, fhahn, aemerson Reviewed By: efriedma Subscribers: eli.friedman, fpetrogalli, kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70496
* [AArch64][SVE] Add wide compare immediate patternsCullen Rhodes2019-12-101-0/+101
| | | | | | | | | | | | | | | | | | | | Summary: Recognize wide compares where the wide operand is a splat of a scalar value in the appropriate range and convert to the immediate variant of the instruction. Patch by Graham Hunter Reviewers: sdesmalen, efriedma, dancgr, rovka, rengolin Reviewed By: efriedma Subscribers: tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71009
* [BPF] Support to emit debugInfo for extern variablesYonghong Song2019-12-093-15/+54
| | | | | | | | | | | | | | | | | | | | | | | extern variable usage in BPF is different from traditional pure user space application. Recent discussion in linux bpf mailing list has two use cases where debug info types are required to use extern variables: - extern types are required to have a suitable interface in libbpf (bpf loader) to provide kernel config parameters to bpf programs. https://lore.kernel.org/bpf/CAEf4BzYCNo5GeVGMhp3fhysQ=_axAf=23PtwaZs-yAyafmXC9g@mail.gmail.com/T/#t - extern types are required so kernel bpf verifier can verify program which uses external functions more precisely. This will make later link with actual external function no need to reverify. https://lore.kernel.org/bpf/87eez4odqp.fsf@toke.dk/T/#m8d5c3e87ffe7f2764e02d722cb0d8cbc136880ed This patch added bpf support to consume such info into BTF, which can then be used by bpf loader. Function processFuncPrototypes() only adds extern function definitions into BTF. The functions with actual definition have been added to BTF in some other places. Differential Revision: https://reviews.llvm.org/D70697
* [NFC] Add { } to silence compiler warning [-Wmissing-braces].Huihui Zhang2019-12-091-1/+1
| | | | | | | ../llvm/lib/Target/PowerPC/PPCISelLowering.cpp:5371:37: warning: suggest braces around initialization of subobject [-Wmissing-braces] std::array<EVT, 2> ReturnTypes = {MVT::Other, MVT::Glue}; ^~~~~~~~~~~~~~~~~~~~~ { }
* add support for strict operation fpextend/fpround/fsqrt on X86 backendLiu, Chen32019-12-104-94/+86
| | | | Differential Revision: https://reviews.llvm.org/D71184
* Revert "[ARM][MVE] Add intrinsics for immediate shifts."Eric Christopher2019-12-091-32/+20
| | | | | | | | | | | | | | and two follow-on commits: one warning fix and one functionality. As it's breaking at least the lto bot: http://lab.llvm.org:8011/builders/clang-with-lto-ubuntu/builds/15132/steps/test-stage1-compiler/logs/stdio This reverts commits: 8d70f3c933a5b81a87a5ab1af0e3e98ee2cd7c67 ff4dceef9201c5ae3924e92f6955977f243ac71d d97b3e3e65cd77a81b39732af84a1a4229e95091
* [AArch64][SVE] Implement SPLAT_VECTOR for i1 vectors.Eli Friedman2019-12-091-13/+18
| | | | | | | | | | The generated sequence with whilelo is unintuitive, but it's the best I could come up with given the limited number of SVE instructions that interact with scalar registers. The other sequence I was considering was something like dup+cmpne, but an extra scalar instruction seems better than an extra vector instruction. Differential Revision: https://reviews.llvm.org/D71160
* [PowerPC] [NFC] Cleanup xxpermdi peephole optimizationJinsong Ji2019-12-091-109/+113
| | | | | | | | | | | | | | | | | | | Summary: Following on from rG884351547da2, this patch cleans up the logic for `xxpermdi` peephole optimizations by converting two layers of nested `if`s to early breaks and simplifying the logic. Reviewers: hfinkel, nemanjai, jsji, lkail, #powerpc, steven.zhang Reviewed By: #powerpc, steven.zhang Subscribers: wuzish, steven.zhang, hiraditya, kbarton, shchenz, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71170 Patch by vddvss (Colin Samples).
* [PGO][PGSO] Instrument the code gen / target passes.Hiroshi Yamauchi2019-12-093-2/+52
| | | | | | | | | | | | | | | | | | | | Summary: Split off of D67120. Add the profile guided size optimization instrumentation / queries in the code gen or target passes. This doesn't enable the size optimizations in those passes yet as they are currently disabled in shouldOptimizeForSize (for non-IR pass queries). A second try after reverted D71072. Reviewers: davidxl Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71149
* [PowerPC][NFC] Rename ANDI(S)o8 to ANDI(S)8oJinsong Ji2019-12-097-26/+26
| | | | | | | | | | | | | | | | | | | Summary: This is found during https://reviews.llvm.org/D70758 All the other record forms are having suffix o at the end. ANDIo8 and ANDISo8 are the only two that put o before 8. This patch rename them to be consistent with others. Reviewers: #powerpc, hfinkel, nemanjai, lei, steven.zhang, echristo, jhibbits, joerg Reviewed By: jhibbits Subscribers: wuzish, hiraditya, kbarton, shchenz, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70928
* [ARM][MVE][Intrinsics] Add VQADDQ, VHADDQ, VRHADDQ, VQSUBQ, VHSUBQ, ↵Mark Murray2019-12-091-68/+184
| | | | | | | | | | | | | | VQDMULHQ, VQRDMULHQ intrinsics. Summary: Add VQADDQ, VHADDQ, VRHADDQ, VQSUBQ, VHSUBQ, VQDMULHQ, VQRDMULHQ intrinsics and unit tests. Reviewers: simon_tatham, ostannard, dmgreen, miyuki Subscribers: kristof.beyls, hiraditya, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D71198
* [ARM][MVE][Intrinsics] Add VMULL[BT]Q_(INT|POLY) intrinsics.Mark Murray2019-12-092-38/+96
| | | | | | | | | | | | Summary: Add VMULL[BT]Q_(INT|POLY) intrinsics and unit tests. Reviewers: simon_tatham, ostannard, dmgreen Subscribers: kristof.beyls, hiraditya, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D71066
* [PowerPC] Refactor FinishCall. [NFC]Sean Fertile2019-12-092-309/+341
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Refactor FinishCall to be more easily understandable as a precursor to implementing indirect calls for AIX. The refactor tries to group similar code together at the cost of some code duplication. The high level overview of the refactor: - Adds a number of helper functions for things like: * Determining if a call is indirect. * What the Opcode for a call is. * Transforming the callee for a direct function call. * Extracting the Chain operand from a CallSeqStart node. * Building the operands of the call. - Adds helpers for building the indirect call DAG nodes (excluding the call instruction itself which is created in `FinishCall`). - Removes PrepareCall, which has been subsumed by the helpers. - Rename 'InFlag' to 'Glue'. - FinishCall has been refactored to: 1) Set TOC pointer usage on the DAG for the TOC based subtargets. 2) Calculate if a call is indirect. 3) Determine the Opcode to use for the call instruction. 4) Transform the Callee for direct calls, or build the DAG nodes for indirect calls. 5) Buildup the call operands. 6) Emit the call instruction. 7) If needed, emit the callSeqEnd Node and finish lowering by calling `LowerCallResult` Differential Revision: https://reviews.llvm.org/D70126
* [ARM] Fix NEON failure introduced by D71065.Simon Tatham2019-12-091-3/+5
| | | | | | | I rewrote the isel tablegen for MVE immediate shifts, and accidentally removed the `let Predicates=[HasMVEInt]` that was wrapping the old version, which seems to have allowed those rules to cause trouble on non-MVE targets. That's what I get for only re-running the MVE tests.
* [ARM][MVE] Add intrinsics for immediate shifts.Simon Tatham2019-12-091-22/+32
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This adds the family of `vshlq_n` and `vshrq_n` ACLE intrinsics, which shift every lane of a vector left or right by a compile-time immediate. They mostly work by expanding to the IR `shl`, `lshr` and `ashr` operations, with their second operand being a vector splat of the immediate. There's a fiddly special case, though. ACLE specifies that the immediate in `vshrq_n` can take values up to //and including// the bit size of the vector lane. But LLVM IR thinks that shifting right by the full size of the lane is UB, and feels free to replace the `lshr` with an `undef` half way through the optimization pipeline. Hence, to keep this legal in source code, I have to detect it at codegen time. Logical (unsigned) right shifts by the element size are handled by simply emitting the zero vector; arithmetic ones are converted into a shift of one bit less, which will always give the same output. In order to do that check, I also had to enhance the tablegen MveEmitter so that it can cope with converting a builtin function's operand into a bare integer to pass to a code-generating subfunction. Previously the only bare integers it knew how to handle were flags generated from within `arm_mve.td`. Reviewers: dmgreen, miyuki, MarkMurrayARM, ostannard Reviewed By: MarkMurrayARM Subscribers: kristof.beyls, hiraditya, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D71065
* [RISCV] Machine Operand Flag SerializationSam Elliott2019-12-093-13/+51
| | | | | | | | | | | | | | | | Summary: These hooks ensure that the RISC-V backend can serialize and parse MIR correctly. Reviewers: jrtc27, luismarques Reviewed By: luismarques Subscribers: hiraditya, asb, rbar, johnrusso, simoncook, sabuasal, niosHD, kito-cheng, shiva0217, jrtc27, MaskRay, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, rkruppe, PkmX, jocewei, psnobl, benna, Jim, s.egerton, pzheng, sameer.abuasal, apazos, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70666
* [ARM][MVE] Add complex vector intrinsicsMikhail Maltsev2019-12-091-0/+178
| | | | | | | | | | | | | | | | | | | | Summary: This patch adds intrinsics for the following MVE instructions: * VCADD, VHCADD * VCMUL * VCMLA Each of the above 3 groups has a corresponding new LLVM IR intrinsic. Reviewers: simon_tatham, MarkMurrayARM, ostannard, dmgreen Reviewed By: MarkMurrayARM Subscribers: merge_guards_bot, kristof.beyls, hiraditya, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D71190
* [ARM] Enable MVE masked loads and storesDavid Green2019-12-091-1/+1
| | | | | | | With the extra optimisations we have done, these should now be fine to enable by default. Which is what this patch does. Differential Revision: https://reviews.llvm.org/D70968
OpenPOWER on IntegriCloud