path: root/llvm/lib/Target
Commit message | Author | Age | Files | Lines
* [WebAssembly] Fix the opcode number for i64.load16_u. | Dan Gohman | 2018-05-17 | 1 | -1/+1
  Fixes PR37488.
  llvm-svn: 332561
* [X86][SNB] Remove unnecessary CVT InstRW overrides | Simon Pilgrim | 2018-05-16 | 1 | -82/+24
  llvm-svn: 332536
* [MachineOutliner] Don't outline instructions that modify SP. | Eli Friedman | 2018-05-16 | 1 | -0/+8
  Outlining such instructions breaks the code which saves and restores LR, so we can't outline without doing something more complicated for stack adjustment.
  Found by inspection; we get lucky in most cases because getMemOpInfo only handles STRWpost, not any other pre/post-increment forms. But it hits a couple of artificial testcases in the tree.
  Differential Revision: https://reviews.llvm.org/D46920
  llvm-svn: 332529
* [Hexagon] Fix the order of operands when selecting QCAT | Krzysztof Parzyszek | 2018-05-16 | 1 | -2/+2
  llvm-svn: 332526
* [Hexagon] Mark HVX vector predicate bitwise ops as legal, add patterns | Krzysztof Parzyszek | 2018-05-16 | 2 | -35/+73
  llvm-svn: 332525
* [X86][SSE] Reduce instruction/register usages for v4i32 vector shifts (PR37441) | Simon Pilgrim | 2018-05-16 | 1 | -16/+39
  As suggested by Fabian on PR37441, use PSHUFLW to extend shift amount types for use with PSRAD/PSRLD to reduce register pressure.
  Some of this ideally would be done by combineTargetShuffle, but it's tricky to do as most of the shuffles are sharing inputs.
  Differential Revision: https://reviews.llvm.org/D46959
  llvm-svn: 332524
* AMDGPU: Recalculate SGPRs when trap handler is supported | Konstantin Zhuravlyov | 2018-05-16 | 2 | -6/+11
  Differential Revision: https://reviews.llvm.org/D29911
  llvm-svn: 332523
* Fix up a misleading format warning. | Eric Christopher | 2018-05-16 | 1 | -1/+1
  llvm-svn: 332521
* [MachineOutliner] Don't save/restore LR for tail calls. | Eli Friedman | 2018-05-16 | 1 | -3/+4
  The cost computation assumes we do this correctly, but the actual lowering was wrong.
  Differential Revision: https://reviews.llvm.org/D46923
  llvm-svn: 332514
* [X86] Fix typo in instregex for CVTSI642SDrr | Simon Pilgrim | 2018-05-16 | 1 | -1/+1
  llvm-svn: 332510
* [X86][AVX512DQ] Use packed instructions for scalar FP<->i64 conversions on 32-bit targets | Craig Topper | 2018-05-16 | 1 | -8/+62
  As i64 types are not legal on 32-bit targets, insert these into a suitable zero vector and use the packed vXi64<->FP conversion instructions instead.
  Fixes PR3163.
  Differential Revision: https://reviews.llvm.org/D43441
  llvm-svn: 332498
* [AMDGPU] Change llvm.debugtrap to be a debug breakpoint that can resume execution. | Tony Tye | 2018-05-16 | 2 | -34/+34
  No longer require the queue pointer to be passed in fixed SGPRs.
  Differential Revision: https://reviews.llvm.org/D46769
  llvm-svn: 332485
* [AArch64][SVE] Improve diagnostics for vectors with incorrect element-size. | Sander de Smalen | 2018-05-16 | 2 | -11/+42
  For regular SVE vector operands, this patch introduces a more sensible diagnostic when the vector has a wrong suffix (e.g. z0.s vs z0.b).

  For example:
    add z0.s, z1.s, z2.b -> invalid element width
                 ^_____^
                 mismatch

  For the vector-with-shift/extend (e.g. z0.s, uxtw #2) this patch takes a slightly different approach and instead returns an 'invalid operand' if the element size is not as expected. This is because the diagnostics are specifically designed to suggest using the right shift/extend suffix. This is a trade-off not to introduce more operand classes and still provide useful diagnostics for LD1 and PRF instructions.

  For example:
    ld1w z1.s, p0/z, [x0, z0.s] -> invalid shift/extend specified, expected 'z[0..31].s, (uxtw|sxtw)'
    ld1w z1.d, p0/z, [x0, z0.s] -> invalid operand
            ^________________^
            mismatch

  For gather prefetches, both 'z0.s' and 'z0.d' would be allowed:
    prfw #0, p0, [x0, z0.s] -> invalid shift/extend specified, expected 'z[0..31].s, (uxtw|sxtw) #2'
    prfw #0, p0, [x0, z0.d] -> invalid shift/extend specified, expected 'z[0..31].d, (lsl|uxtw|sxtw) #2'

  Without this change, the diagnostic would unnecessarily suggest a different element size:
    prfw #0, p0, [x0, z0.s] -> invalid shift/extend specified, expected 'z[0..31].d, (lsl|uxtw|sxtw) #2'

  Reviewers: SjoerdMeijer, aemerson, fhahn, samparker, javed.absar
  Reviewed By: SjoerdMeijer
  Differential Revision: https://reviews.llvm.org/D46688
  llvm-svn: 332483
* [AArch64] Gangup loads and stores for pairing. | Sirish Pande | 2018-05-16 | 1 | -0/+2
  Keep loads and stores together (the target defines how many loads and stores to gang up), so that it helps pairing and vectorization.
  Differential Revision: https://reviews.llvm.org/D46477
  llvm-svn: 332482
* [AArch64][SVE] Asm: Support for gather PRF prefetch instructions | Sander de Smalen | 2018-05-16 | 2 | -0/+159
  Reviewers: rengolin, fhahn, samparker, SjoerdMeijer, javed.absar
  Reviewed By: SjoerdMeijer
  Differential Revision: https://reviews.llvm.org/D46686
  llvm-svn: 332472
* [mips] Simplify some of the predicate scopes for (negative) multiply add/sub instructions (NFCI) | Simon Dardis | 2018-05-16 | 1 | -23/+20
  llvm-svn: 332464
* [mips] Join existing scopes for DecoderNamespace (NFCI) | Simon Dardis | 2018-05-16 | 1 | -6/+3
  llvm-svn: 332462
* AMDGPU: Custom lower v4i16/v4f16 vector operations | Matt Arsenault | 2018-05-16 | 4 | -19/+124
  Avoids stack access. Also handle extract hi elt pattern from truncate + shift to avoid a couple test regressions.
  llvm-svn: 332453
* [X86] Split WriteCvtI2F/WriteCvtF2I into I<->F32 and I<->F64 scheduler classes | Simon Pilgrim | 2018-05-16 | 14 | -359/+338
  A lot of the models still have too many InstRW overrides for these new classes - this needs cleaning up but I wanted to get the classes in first.
  llvm-svn: 332451
* [GlobalISel][IRTranslator] Split aggregates during IR translation. | Amara Emerson | 2018-05-16 | 2 | -1/+9
  We currently handle all aggregates by creating one large LLT, and letting the legalizer deal with splitting them up. However, using this approach means that we can't support big endian code correctly.
  This patch changes the way that the IRTranslator deals with aggregate values, by splitting them up into their constituent element values. To do this, parts of the translator need to be modified to deal with multiple VRegs for a single Value.
  A new Value to VReg mapper is introduced to help keep compile time under control; currently there is no measurable impact on CTMark despite the extra code being generated in some cases.
  Patch is based on the original work of Tim Northover.
  Differential Revision: https://reviews.llvm.org/D46018
  llvm-svn: 332449
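The splitting idea described in this commit can be pictured with a toy model. This is only a rough sketch, not the IRTranslator's actual data structures: types are modelled as nested Python tuples of scalar type names, and `split_aggregate` and `assign_vregs` are hypothetical helper names invented for illustration.

```python
def split_aggregate(ty):
    """Flatten a nested aggregate type into its scalar leaf types, in order."""
    if isinstance(ty, (list, tuple)):
        leaves = []
        for elem in ty:
            leaves.extend(split_aggregate(elem))
        return leaves
    return [ty]


def assign_vregs(value_types):
    """Map each value name to a list of virtual registers, one per scalar leaf.

    A scalar value gets a single register; an aggregate gets several,
    which is the "multiple VRegs for a single Value" situation.
    """
    mapping = {}
    next_vreg = 0
    for name, ty in value_types.items():
        leaves = split_aggregate(ty)
        mapping[name] = [f"%{next_vreg + i}:{leaf}" for i, leaf in enumerate(leaves)]
        next_vreg += len(leaves)
    return mapping


# Hypothetical values: one nested struct and one scalar.
example = assign_vregs({"a": ("i32", ("i8", "i64")), "b": "i1"})
```

In this model the struct `a` ends up with three registers and the scalar `b` with one, so no single oversized register type is ever created for the aggregate.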
* [mips] Add support for isBranchOffsetInRange and use it for MipsLongBranch | Simon Dardis | 2018-05-16 | 4 | -14/+205
  Add support for this target hook, covering MIPS, microMIPS and MIPSR6, along with some tests. Also add missing getOppositeBranchOpc() cases exposed by the tests.
  Reviewers: atanasyan, abeserminji, smaksimovic
  Differential Revision: https://reviews.llvm.org/D46794
  llvm-svn: 332446
* [AArch64] Support "S" inline assembler constraint | Peter Smith | 2018-05-16 | 2 | -1/+25
  This patch re-introduces the "S" inline assembler constraint. This matches an absolute symbolic address or a label reference. The primary use case is

    asm("adrp %0, %1\n\t"
        "add %0, %0, :lo12:%1"
        : "=r"(addr) : "S"(&var));

  I say re-introduces as it seems like "S" was implemented in the original AArch64 backend, but it looks like it wasn't carried forward to the merged backend. The original implementation had A and L modifiers that could be used to print ":lo12:" to the string. It looks like gcc doesn't use these and :lo12: is expected to be written in the inline assembly string so I've not implemented A and L. Clang already supports the S modifier.
  Fixes PR37180
  Differential Revision: https://reviews.llvm.org/D46745
  llvm-svn: 332444
* [AArch64][SVE] Asm: Support for structured LD2, LD3 and LD4 (scalar+scalar) load instructions. | Sander de Smalen | 2018-05-16 | 2 | -0/+36
  Reviewers: rengolin, fhahn, samparker, SjoerdMeijer, javed.absar
  Reviewed By: fhahn
  Differential Revision: https://reviews.llvm.org/D46679
  llvm-svn: 332442
* [AArch64][SVE] Asm: Support for contiguous PRF prefetch instructions. | Sander de Smalen | 2018-05-16 | 4 | -2/+77
  Reviewers: rengolin, fhahn, samparker, SjoerdMeijer, javed.absar
  Reviewed By: SjoerdMeijer
  Differential Revision: https://reviews.llvm.org/D46682
  llvm-svn: 332433
* Remove unused variable introduced in r332336 | Mikael Holmen | 2018-05-16 | 1 | -1/+1
  The unused variable caused a compilation warning:

    ../lib/Target/X86/X86ISelLowering.cpp:34614:17: error: unused variable 'SMax' [-Werror,-Wunused-variable]
        if (SDValue SMax = MatchMinMax(SMin, ISD::SMAX, C1))
                    ^
    1 error generated.

  llvm-svn: 332431
* ARM: Remove unnecessary argument. NFCI. | Peter Collingbourne | 2018-05-16 | 2 | -6/+3
  IsLittleEndian is already a field of ARMAsmBackend.
  llvm-svn: 332420
* ARM: Deduplicate code and remove unnecessary declaration. NFCI. | Peter Collingbourne | 2018-05-16 | 3 | -47/+11
  llvm-svn: 332419
* [AMDGPU] Fix handling of void types in isLegalAddressingMode | Stanislav Mekhanoshin | 2018-05-15 | 1 | -1/+1
  It is legal for the type passed to isLegalAddressingMode to be unsized or, more specifically, VoidTy. In this case, we must check the legality of loads/stores for all legal types. Directly trying to call getTypeStoreSize is incorrect, and leads to breakage in e.g. Loop Strength Reduction. This change guards against that behaviour.
  Differential Revision: https://reviews.llvm.org/D40405
  llvm-svn: 332409
* [AArch64] Improve single vector lane unscaled stores | Evandro Menezes | 2018-05-15 | 1 | -0/+16
  When storing the 0th lane of a vector, use a simpler and usually more efficient scalar store instead. In this case, also use the unscaled offset.
  Differential Revision: https://reviews.llvm.org/D46762
  llvm-svn: 332394
* Nios2: Unbreak build. | Peter Collingbourne | 2018-05-15 | 2 | -5/+6
  llvm-svn: 332391
* [x86][eflags] Fix PR37431 by teaching the EFLAGS copy lowering to specially handle SETB_C* pseudo instructions. | Chandler Carruth | 2018-05-15 | 1 | -3/+142
  Summary:
  While the logic here is somewhat similar to the arithmetic lowering, it is different enough that it made sense to have its own function.
  I actually tried a bunch of different optimizations here and none worked well so I gave up and just always do the arithmetic based lowering.
  Looking at code from the PR test case, we actually pessimize a bunch of code when generating these. Because SETB_C* pseudo instructions clobber EFLAGS, we end up creating a bunch of copies of EFLAGS to feed multiple SETB_C* pseudos from a single set of EFLAGS. This in turn causes the lowering code to ruin all the clever code generation that SETB_C* was hoping to achieve. None of this is needed. Whenever we're generating multiple SETB_C* instructions from a single set of EFLAGS we should instead generate a single maximally wide one and extract subregs for all the different desired widths. That would result in substantially better code generation. But this patch doesn't attempt to address that.
  The test case from the PR is included as well as more directed testing of the specific lowering pattern used for these pseudos.
  Reviewers: craig.topper
  Subscribers: sanjoy, mcrosier, llvm-commits, hiraditya
  Differential Revision: https://reviews.llvm.org/D46799
  llvm-svn: 332389
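The "one maximally wide result, then extract subregs" suggestion in this commit message can be sanity-checked with a small model. This is a sketch only, assuming a SETB_C* pseudo produces all ones when the carry flag is set and zero otherwise; `setb_c` and `extract_subreg` are hypothetical names for illustration, and the code does not reproduce the lowering in the patch.

```python
def setb_c(carry, bits):
    """Model a SETB_C-style result: all ones if carry is set, else zero."""
    mask = (1 << bits) - 1
    return (0 - int(bool(carry))) & mask


def extract_subreg(value, bits):
    """Model taking the low `bits` bits of a wider register."""
    return value & ((1 << bits) - 1)


def wide_matches_narrow(widths=(8, 16, 32), wide_bits=64):
    """Check that slicing one wide result equals computing each width separately."""
    return all(
        extract_subreg(setb_c(carry, wide_bits), w) == setb_c(carry, w)
        for carry in (False, True)
        for w in widths
    )
```

Because the value is either all zeros or all ones, any low slice of the wide result is itself a valid narrow result, which is why a single wide computation would be enough to feed every width.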
* AMDGPU: Fix v_dot{4, 8}* instruction encoding | Konstantin Zhuravlyov | 2018-05-15 | 2 | -8/+13
  Differential Revision: https://reviews.llvm.org/D46848
  llvm-svn: 332387
* AMDGPU/GlobalISel: Implement select() for G_FCONSTANT | Tom Stellard | 2018-05-15 | 1 | -15/+47
  Summary: Also clean up G_CONSTANT selection.
  Reviewers: arsenm, nhaehnle
  Subscribers: kzhuravl, wdng, yaxunl, rovka, kristof.beyls, dstuttard, tpr, t-tye, llvm-commits
  Differential Revision: https://reviews.llvm.org/D46170
  llvm-svn: 332379
* AMDGPU: Add disasm tests for deep learning instructions + fix v_fmac_f32 disasm | Konstantin Zhuravlyov | 2018-05-15 | 1 | -1/+2
  Differential Revision: https://reviews.llvm.org/D46853
  llvm-svn: 332377
* [X86] Split WriteCvtF2F into F32->F64 and F64->F32 scheduler classes | Simon Pilgrim | 2018-05-15 | 12 | -136/+181
  BtVer2 - Fixes schedules for (V)CVTPS2PD instructions
  A lot of the Intel models still have too many InstRW overrides for these new classes - this needs cleaning up but I wanted to get the classes in first.
  llvm-svn: 332376
* [Hexagon] Remove unused function from subtarget | Krzysztof Parzyszek | 2018-05-15 | 1 | -8/+0
  llvm-svn: 332369
* [Hexagon] Remove unused flag from subtarget and (non)corresponding test | Krzysztof Parzyszek | 2018-05-15 | 3 | -8/+0
  llvm-svn: 332365
* [mips] Mark select instructions correctly | Simon Dardis | 2018-05-15 | 3 | -151/+192
  Reviewers: atanasyan, abeserminji, smaksimovic
  Differential Revision: https://reviews.llvm.org/D46702
  llvm-svn: 332364
* [X86] Split off F16C WriteCvtPH2PS/WriteCvtPS2PH scheduler classes | Simon Pilgrim | 2018-05-15 | 12 | -139/+121
  Btver2 - VCVTPH2PSYrm needs to double pump the AGU
  Broadwell - missing VCVTPS2PH*mr stores extra latency
  Allows us to remove the WriteCvtF2FSt conversion store class
  llvm-svn: 332357
* [mips] Fix formatting of floating point conversion patterns | Simon Dardis | 2018-05-15 | 1 | -8/+8
  llvm-svn: 332341
* [mips] Add disassembly support for comparison instructions | Simon Dardis | 2018-05-15 | 1 | -4/+6
  llvm-svn: 332340
* [mips] Fix predicates of mfc1, mtc1 instructions | Simon Dardis | 2018-05-15 | 2 | -28/+22
  Reviewers: atanasyan, abeserminji, smaksimovic
  Differential Revision: https://reviews.llvm.org/D46692
  llvm-svn: 332339
* [X86] Improve unsigned saturation downconvert detection. | Artur Gainullin | 2018-05-15 | 1 | -19/+52
  Summary:
  Detection of new unsigned saturation downconvert patterns was implemented in X86 codegen:

    (truncate (smin (smax (x, C1), C2)) to dest_type),
    where C1 >= 0 and C2 is unsigned max of destination type.

    (truncate (smax (smin (x, C2), C1)) to dest_type)
    where C1 >= 0, C2 is unsigned max of destination type and C1 <= C2.

  These two patterns are equivalent to:

    (truncate (umin (smax(x, C1), unsigned_max_of_dest_type)) to dest_type)

  Reviewers: RKSimon
  Subscribers: llvm-commits, a.elovikov
  Differential Revision: https://reviews.llvm.org/D45315
  llvm-svn: 332336
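The equivalence claimed in this commit message can be checked exhaustively for a small case. The sketch below assumes a 16-bit signed source and an 8-bit destination; the names `pattern_a`, `pattern_b`, `reference` and `check_equivalence` are made up for illustration and are not part of the patch.

```python
SRC_BITS = 16                      # assumed source width (signed)
DST_BITS = 8                       # assumed destination width
UMAX_DST = (1 << DST_BITS) - 1     # unsigned max of the destination type


def as_unsigned(value, bits):
    """Reinterpret a signed integer as an unsigned value of the given width."""
    return value & ((1 << bits) - 1)


def truncate(value, bits):
    """Keep only the low `bits` bits."""
    return value & ((1 << bits) - 1)


def pattern_a(x, c1):
    """(truncate (smin (smax (x, C1), C2))) with C2 = unsigned max."""
    return truncate(min(max(x, c1), UMAX_DST), DST_BITS)


def pattern_b(x, c1):
    """(truncate (smax (smin (x, C2), C1))) with C2 = unsigned max."""
    return truncate(max(min(x, UMAX_DST), c1), DST_BITS)


def reference(x, c1):
    """(truncate (umin (smax (x, C1), unsigned max))), umin done on unsigned values."""
    clamped_low = as_unsigned(max(x, c1), SRC_BITS)
    return truncate(min(clamped_low, UMAX_DST), DST_BITS)


def check_equivalence(c1):
    """Compare all three forms over every signed SRC_BITS input (0 <= c1 <= UMAX_DST)."""
    lo, hi = -(1 << (SRC_BITS - 1)), 1 << (SRC_BITS - 1)
    return all(
        pattern_a(x, c1) == pattern_b(x, c1) == reference(x, c1)
        for x in range(lo, hi)
    )
```

The key step is that `smax(x, C1)` with `C1 >= 0` is never negative, so the signed and unsigned views of the intermediate value agree and the remaining `smin` can be treated as a `umin`.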
* [RISCV] Define FeatureRelax and shouldForceRelocation for RISCV linker relaxation | Shiva Chen | 2018-05-15 | 3 | -0/+14
  1. Define FeatureRelax to enable/disable linker relaxation.
  2. Define shouldForceRelocation to preserve relocation types even if the fixup can be resolved when linker relaxation is enabled. This is necessary for correctness as offsets may change during relaxation.
  Differential Revision: https://reviews.llvm.org/D46674
  llvm-svn: 332318
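Why an assembler-resolved offset can go stale under linker relaxation is easy to see with a toy layout. This is a sketch under assumptions: instructions are modelled only by their byte sizes, the 8-byte entry stands for a two-instruction call sequence that the linker may shrink to a single 4-byte instruction, and `byte_offset` and `relax` are hypothetical helpers, not LLVM or linker APIs.

```python
def byte_offset(sizes, src, dst):
    """Byte distance from the start of instruction `src` to the start of `dst`."""
    if dst >= src:
        return sum(sizes[src:dst])
    return -sum(sizes[dst:src])


def relax(sizes, index, new_size):
    """Return a copy of `sizes` with one instruction sequence shrunk by the linker."""
    relaxed = list(sizes)
    relaxed[index] = new_size
    return relaxed


# Hypothetical layout: a branch at index 0 targets index 3, with an
# 8-byte call sequence at index 1 sitting in between.
sizes_before = [4, 8, 4, 4]
sizes_after = relax(sizes_before, 1, 4)

offset_before = byte_offset(sizes_before, 0, 3)
offset_after = byte_offset(sizes_after, 0, 3)
```

Here the distance shrinks from 16 to 12 bytes, so a value fixed at assembly time would be wrong after relaxation; keeping the relocation lets the linker recompute it.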
* [ARM] Back up R4 and LR if calling the stack probe function | Martin Storsjo | 2018-05-14 | 1 | -0/+11
  Differential Revision: https://reviews.llvm.org/D46777
  llvm-svn: 332298
* [Hexagon] Add a target feature to control using small data section | Krzysztof Parzyszek | 2018-05-14 | 4 | -16/+20
  llvm-svn: 332292
* [Hexagon] Add a target feature for generating new-value stores | Krzysztof Parzyszek | 2018-05-14 | 3 | -6/+18
  llvm-svn: 332290
* [Hexagon] Add a target feature for memop generation | Krzysztof Parzyszek | 2018-05-14 | 4 | -31/+32
  llvm-svn: 332285
* [X86] Add NT load/store scheduler classes | Simon Pilgrim | 2018-05-14 | 13 | -84/+148
  llvm-svn: 332274
* [X86] Remove and autoupgrade avx512.vbroadcast.ss/avx512.vbroadcast.sd intrinsics. | Craig Topper | 2018-05-14 | 1 | -5/+0
  llvm-svn: 332271