summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Target
Commit message (Collapse)AuthorAgeFilesLines
...
* [AMDGPU] Optimize S_CBRANCH_VCC[N]Z -> S_CBRANCH_EXEC[N]ZStanislav Mekhanoshin2018-11-121-0/+97
| | | | | | | | | | | | | | | | | | | Sometimes after basic block placement we end up with a code like: sreg = s_mov_b64 -1 vcc = s_and_b64 exec, sreg s_cbranch_vccz This happens as a join of a block assigning -1 to a saved mask and another block which consumes that saved mask with s_and_b64 and a branch. This is essentially a single s_cbranch_execz instruction when moved into a single new basic block. Differential Revision: https://reviews.llvm.org/D54164 llvm-svn: 346690
* [CostModel][X86] Add funnel shift rotation special case costsSimon Pilgrim2018-11-121-1/+82
| | | | | | When we repeat the 2 shifting operands then this is a bit rotation - annoyingly this has to be done in the other getIntrinsicInstrCost than most intrinsics as we need to check the operands are the same. llvm-svn: 346688
* [CostModel][X86] Add SHLD/SHRD scalar funnel shift costsSimon Pilgrim2018-11-121-2/+11
| | | | | | The costs match the typical reg-reg cases - the RMW case can be a lot slower but we don't model that at this level llvm-svn: 346683
* [CostModel][X86] SK_ExtractSubvector is cheap if the (legal) subvector is ↵Simon Pilgrim2018-11-121-5/+13
| | | | | | aligned within the source vector llvm-svn: 346664
* [SystemZ::TTI] Improve accuracy of costs for vector fp <-> int conversionsJonas Paulsson2018-11-121-1/+2
| | | | | | | | | | | | | | Improve getCastInstrCost() by respecting the different types of Src and Dst for vector integer <-> fp conversions. This means that extracting from integer becomes more expensive (by the extraction penalty), and the extraction from fp becomes cheaper (no longer has a false extraction penalty). Review: Ulrich Weigand https://reviews.llvm.org/D54423 llvm-svn: 346663
* [RISCV] Support .option relax and .option norelaxAlex Bradbury2018-11-127-99/+177
| | | | | | | | | | | | | | | | | | | | | | This extends the .option support from D45864 to enable/disable the relax feature flag from D44886 During parsing of the relax/norelax directives, the RISCV::FeatureRelax feature bits of the SubtargetInfo stored in the AsmParser are updated appropriately to reflect whether relaxation is currently enabled in the parser. When an instruction is parsed, the parser checks if relaxation is currently enabled and if so, gets a handle to the AsmBackend and sets the ForceRelocs flag. The AsmBackend uses a combination of the original RISCV::FeatureRelax feature bits set by e.g -mattr=+/-relax and the ForceRelocs flag to determine whether to emit relocations for symbol and branch diffs. Diff relocations should therefore only not be emitted if the relax flag was not set on the command line and no instruction was ever parsed in a section with relaxation enabled to ensure correct diffs are emitted. Differential Revision: https://reviews.llvm.org/D46423 Patch by Lewis Revill. llvm-svn: 346655
* [SystemZ] Replicate the load with most uses in buildVector()Jonas Paulsson2018-11-121-8/+11
| | | | | | | | | | | Iterate over all elements and count the number of uses among them for each used load. Then make sure to REPLICATE the load which has the most uses in order to minimize the number of needed element insertions. Review: Ulrich Weigand https://reviews.llvm.org/D54322 llvm-svn: 346637
* [X86] Use DAG.getConstant instead of getZeroVector.Craig Topper2018-11-111-1/+1
| | | | llvm-svn: 346605
* [X86] Replace calls to getOnesVector/getZeroVector with getConstant.Craig Topper2018-11-111-2/+2
| | | | | | getConstant will create a BUILD_VECTOR for us and use a legal type if necessary. So just create the simple node and let BUILD_VECTOR legalization do the canonicalization. llvm-svn: 346603
* [x86] allow vector load narrowing with multi-use valuesSanjay Patel2018-11-103-1/+12
| | | | | | | | | | | | | | | | | | | | | | This is a long-awaited follow-up suggested in D33578. Since then, we've picked up even more opportunities for vector narrowing from changes like D53784, so there are a lot of test diffs. Apart from 2-3 strange cases, these are all wins. I've structured this to be no-functional-change-intended for any target except for x86 because I couldn't tell if AArch64, ARM, and AMDGPU would improve or not. All of those targets have existing regression tests (4, 4, 10 files respectively) that would be affected. Also, Hexagon overrides the shouldReduceLoadWidth() hook, but doesn't show any regression test diffs. The trade-off is deciding if an extra vector load is better than a single wide load + extract_subvector. For x86, this is almost always better (on paper at least) because we often can fold loads into subsequent ops and not increase the official instruction count. There's also some unknown -- but potentially large -- benefit from using narrower vector ops if wide ops are implemented with multiple uops and/or frequency throttling is avoided. Differential Revision: https://reviews.llvm.org/D54073 llvm-svn: 346595
* [X86] Remove unused variableBenjamin Kramer2018-11-101-1/+0
| | | | llvm-svn: 346592
* [X86] Remove apparently unneeded code from combineVSZext.Craig Topper2018-11-101-50/+0
| | | | | | | | No lit tests fail with this code removed. This is a pre-commit for D54346. llvm-svn: 346590
* [CostModel][X86] SK_ExtractSubvector costs must only be tested for vector ↵Simon Pilgrim2018-11-101-1/+1
| | | | | | types (PR39615) llvm-svn: 346589
* [X86][BdVer2] Fix loads/stores throughput for Piledriver (PR39465)Roman Lebedev2018-11-101-4/+4
| | | | | | | | | | | | | There are two AGU units, and per 1cy, there can be either two loads, or a load and a store; but not two stores, or two loads and a store. Additionally, loads shouldn't affect the store scheduler and vice versa. (but *should* affect the PdEX scheduler.) Required rL346545. Fixes https://bugs.llvm.org/show_bug.cgi?id=39465 llvm-svn: 346587
* [X86] Use a MOVSX instruction instead of a MOVZX instruction in isel for an ↵Craig Topper2018-11-101-0/+9
| | | | | | | | any_extend of the remainder from an 8-bit sdivrem. The sdivrem will emit its own MOVSX to move %ah to the low byte of a register. By using a MOVSX for an any_extend this allows a post-isel peephole to merge them. llvm-svn: 346581
* [X86] In LowerHorizontalByteSum, emit vector_shuffle nodes instead of ↵Craig Topper2018-11-101-5/+5
| | | | | | | | | | directly using X86ISD::UNPCKL/X86ISD::UNPCKH. This gives shuffle lowering the freedom to use zero_extend_vector_inreg for the unpckl shuffle. Shuffle combining usually makes this swap later, but not when AVX512 is enabled it seems. While there also use DAG.getConstant to create a 0 vector instead of using the helper the forces a specific BUILD_VECTOR. I don't think that helper is usually needed. We're basically free to create a constant build_vector anytime and it will be legalized on its own. llvm-svn: 346574
* [WebAssembly] Update bleeding-edge cpu featuresThomas Lively2018-11-101-1/+2
| | | | | | | | | | Reviewers: aheejin, dschuff Subscribers: sbc100, jgravelle-google, sunfish, jfb, llvm-commits Differential Revision: https://reviews.llvm.org/D54362 llvm-svn: 346570
* [ARM64] [Windows] Handle funcletsEli Friedman2018-11-099-20/+243
| | | | | | | | | | | | This patch adds support for funclets in frame lowering and ISel lowering. Together with D50288 and D50166, it enables C++ exception handling. Patch by Sanjin Sijaric, with some fixes by me. Differential Revision: https://reviews.llvm.org/D51524 llvm-svn: 346568
* [ARM] Add MemOperand to LDRcp to enable DCE.Eli Friedman2018-11-091-1/+6
| | | | | | | | | | | | LDRcp should be deleted when the dest register is dead in register coalescing. Without MemOp, dead LDRcp will cause dead constant pool value which references to non-existing label. Patch by Yin Ma. Differential Revision: https://reviews.llvm.org/D54173 llvm-svn: 346563
* [X86] Move the promotion of v16i16->v16i8 for avx512f but not avx512bw from ↵Craig Topper2018-11-092-8/+18
| | | | | | | | | | lowering to isel. Change to use vpmovzx instead of vpmovsx. With avx512f but not avx512bw we need to extend to v16i32 then truncate that to to v16i8. Previously we emitted both nodes during lowering, but I'm trying to switch to using target independent nodes and with that switched the extend+truncate wou This patch changes the implementation to what will be necessary with that patch which helps minimize test diffs. llvm-svn: 346552
* [AArch64] Support HiSilicon's TSV110 processorBryan Chan2018-11-093-1/+24
| | | | | | | | | | | | Reviewers: t.p.northover, SjoerdMeijer, kristof.beyls Reviewed By: kristof.beyls Subscribers: olista01, javed.absar, kristof.beyls, kristina, llvm-commits Differential Revision: https://reviews.llvm.org/D53908 llvm-svn: 346546
* [Hexagon] Fix some -Wunused-function with LLVM_DUMP_METHOD and -Wunused-variableFangrui Song2018-11-092-4/+9
| | | | llvm-svn: 346543
* [X86] Turn X86ISD::VSEXT into X86ISD::VZEXT if the upper bits aren't demanded.Craig Topper2018-11-091-0/+12
| | | | | | | | This makes X86ISD::VSEXT more similar to ISD::SIGN_EXTEND and ISD::ZERO_EXTEND. I'm hoping to replace X86ISD::VSEXT/VZEXT with target independent nodes. Making the target specific nodes similar to the target independent nodes helps minimize test diffs in that patch. llvm-svn: 346539
* [CostModel][X86] SK_ExtractSubvector is free if the subvector is at the ↵Simon Pilgrim2018-11-091-181/+187
| | | | | | start of the source vector llvm-svn: 346538
* [Hexagon] Fix unused variable warning in release buildsJordan Rupprecht2018-11-091-0/+1
| | | | llvm-svn: 346537
* [WebAssembly] Hotfix of WebAssemblyInstructionTableSize after rL346465Fangrui Song2018-11-091-0/+2
| | | | llvm-svn: 346535
* [Hexagon] Implement noreturn optimizationBrendon Cahoon2018-11-094-1/+39
| | | | | | | | | | | Eliminate the stack frame in functions with the noreturn nounwind attributes, and when the noreturn-stack-elim target feature is enabled. This reduces the code and stack space needed for noreturn functions. Differential Revision: https://reviews.llvm.org/D54210 llvm-svn: 346532
* [AMDGPU] Always pass TRI into findRegister[Use/Def]OperandIdxStanislav Mekhanoshin2018-11-094-7/+10
| | | | | | | | This only covers AMDGPU BE, hopefully all occurrences. Differential Revision: https://reviews.llvm.org/D54235 llvm-svn: 346528
* [Hexagon] Place globals with explicit .sdata section in small dataKrzysztof Parzyszek2018-11-091-5/+10
| | | | | | | | Both -fPIC and -G0 disable placement of globals in small data section, but if a global has an explicit section assigmnent placing it in small data, it should go there anyway. llvm-svn: 346523
* [Power9] Allow gpr callee saved spills in prologue to vectors registersZaara Syeda2018-11-092-23/+123
| | | | | | | | | | | | | | Currently in llvm, CalleeSavedInfo can only assign a callee saved register to stack frame index to be spilled in the prologue. We would like to enable spilling gprs to vector registers. This patch adds the capability to spill to other registers aside from just the stack. It also adds the changes for power9 to spill gprs to volatile vector registers when they are available. This happens only for leaf functions when using the option -ppc-enable-pe-vector-spills. Differential Revision: https://reviews.llvm.org/D39386 llvm-svn: 346512
* Revert "[DEBUGINFO, NVPTX]DO not emit ',debug' option if no debug info or ↵Alexey Bataev2018-11-093-30/+4
| | | | | | | | | only debug directives are requested." This reverts commit r345972. Need to update the description + possibly to update the patch itself after discussion with Eric Christofer. llvm-svn: 346508
* [SystemZ] Avoid inserting same value after replicationJonas Paulsson2018-11-091-1/+3
| | | | | | | | | | | A minor improvement of buildVector() that skips creating an INSERT_VECTOR_ELT for a Value which has already been used for the REPLICATE. Review: Ulrich Weigand https://reviews.llvm.org/D54315 llvm-svn: 346504
* [ARM] Don't promote i1 types in ARM CGPSam Parker2018-11-091-1/+3
| | | | | | | | | Now that we have mixed type sizes, i1 values need to be explicitly handled as we want to avoid promoting these values. Differential Revision: https://reviews.llvm.org/D54308 llvm-svn: 346499
* [x86] try to form broadcast before widening shuffle elementsSanjay Patel2018-11-091-0/+8
| | | | | | | | | | | | | | I noticed that we weren't generating broadcasts as much I thought we would with D54271, and this is part of the problem. Widening the shuffle elements means adding bitcasts and hiding the relationship between a splatted scalar and the vector. If we can form a broadcast, do that before going through the rest of the shuffle lowering because broadcasts should be cheap and can often be load-folded. Differential Revision: https://reviews.llvm.org/D54280 llvm-svn: 346498
* [RISCV] Avoid unnecessary XOR for seteq/setne 0Alex Bradbury2018-11-091-0/+2
| | | | | | | | Differential Revision: https://reviews.llvm.org/D53492 Patch by James Clarke. llvm-svn: 346497
* [MIPS GlobalISel] narrowScalar G_CONSTANTPetar Avramovic2018-11-091-23/+1
| | | | | | | | Legalize s64 G_CONSTANT using narrowScalar on MIPS 32. Differential Revision: https://reviews.llvm.org/D54255 llvm-svn: 346495
* [X86] Add Subtarget to more lowerVectorShuffle functions. NFCI.Simon Pilgrim2018-11-091-22/+26
| | | | | | This will be necessary for an update to D54267 llvm-svn: 346490
* [llvm-exegesis][NFC] Add a way to declare the default counter binding for ↵Clement Courbet2018-11-097-6/+62
| | | | | | | | | | | | | | | | unbound CPUs for a target. Summary: This simplifies the code and moves everything to tablegen for consistency. This also prepares the ground for adding issue counters. Reviewers: gchatelet, john.brawn, jsji Subscribers: nemanjai, mgorny, javed.absar, kbarton, tschuett, llvm-commits Differential Revision: https://reviews.llvm.org/D54297 llvm-svn: 346489
* [X86] Fix VZEROUPPER scheduling info on SNB,HSW,BDW,SXL,SKX.Clement Courbet2018-11-095-12/+19
| | | | | | | | | | | | | | | | | | | | | | | Summary: Starting from SNB, VZEROUPPER is handled by the renamer and uses no proc resources. After HSW, it also has zero latency. This fixes PR35606. To reproduce: Uops: llvm-exegesis -mode=uops -opcode-name=VZEROUPPER Latency: echo -e '#LLVM-EXEGESIS-DEFREG XMM0 1\n#LLVM-EXEGESIS-DEFREG XMM1 1\nvzeroupper' | /tmp/llvm-exegesis -mode=latency -snippets-file=- echo -e '#LLVM-EXEGESIS-DEFREG XMM0 1\n#LLVM-EXEGESIS-DEFREG XMM1 1\nvzeroupper\naddps %xmm0, %xmm1' | /tmp/llvm-exegesis -mode=latency -snippets-file=- Reviewers: RKSimon, craig.topper, andreadb Subscribers: gbedwell, llvm-commits Differential Revision: https://reviews.llvm.org/D54107 llvm-svn: 346482
* [ARM] Enable mixed types in ARM CGPSam Parker2018-11-091-61/+73
| | | | | | | | | | | | | | | | | | | | | | Previously, during the search, all values had to have the same 'TypeSize', which is equal to number of bits of the integer type of the icmp operand. All values in the tree had to match this size; meaning that, if we searched from i16, we wouldn't accept i8s. A change in type size requires zext and truncs to perform the casts so, to allow mixed narrow types, the handling of these instructions is now slightly different: - we allow casts if their result or operand is <= TypeSize. - zexts are sinks if their result > TypeSize. - truncs are still sinks if their operand == TypeSize. - truncs are still sources if their result == TypeSize. The transformation bails on finding an icmp that operates on data smaller than the current TypeSize. Differential Revision: https://reviews.llvm.org/D54108 llvm-svn: 346480
* [ARM] Small reorganisation in ARMParallelDSPSam Parker2018-11-091-114/+161
| | | | | | | | | | | | | | | | | A few code movement things: - AreSymmetrical is now a method of BinOpChain. - Created a lambda in CreateParallelMACPairs to reduce loop nesting. - A Reduction object now gets pasted in a couple of places instead, including CreateParallelMACPairs so it doesn't need to return a value. I've also added RecordSequentialLoads, which is run before the transformation begins, and caches the interesting loads. This can then be queried later instead of cross checking many load values. Differential Revision: https://reviews.llvm.org/D54254 llvm-svn: 346479
* [COFF, ARM64] Add support for MSVC buffer security checkMandeep Singh Grang2018-11-092-0/+37
| | | | | | | | | | | | Reviewers: rnk, mstorsjo, compnerd, efriedma, TomTan Reviewed By: rnk Subscribers: javed.absar, kristof.beyls, chrib, llvm-commits Differential Revision: https://reviews.llvm.org/D54248 llvm-svn: 346469
* [WebAssembly] Read prefixed opcodes as ULEB128sThomas Lively2018-11-091-11/+21
| | | | | | | | | | | | Summary: Depends on D54126. Reviewers: aheejin, dschuff, aardappel Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits Differential Revision: https://reviews.llvm.org/D54138 llvm-svn: 346465
* [WebAssembly][NFC] Reorder SIMD sectionThomas Lively2018-11-091-283/+270
| | | | | | | | | | | | | | Summary: Reorders the sections in the SIMD tablegen file to roughly match the new opcode ordering. Depends on D54126. Reviewers: aheejin, dschuff Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits Differential Revision: https://reviews.llvm.org/D54134 llvm-svn: 346464
* [WebAssembly] Renumber and LEB128-encode SIMD opcodesThomas Lively2018-11-092-139/+118
| | | | | | | | | | Reviewers: aheejin, dschuff, aardappel Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits Differential Revision: https://reviews.llvm.org/D54126 llvm-svn: 346463
* [WebAssembly] Lower select for vectorsThomas Lively2018-11-091-8/+9
| | | | | | | | | | | | Summary: Reviewers: aheejin, dschuff Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits Differential Revision: https://reviews.llvm.org/D53675 llvm-svn: 346462
* [WebAssembly] Fix LowerEmscriptenEHSjLj when there's only longjmpHeejin Ahn2018-11-081-52/+60
| | | | | | | | | | | | | | | | | | Summary: The pass incorrectly assumed if there's a longjmp declaration in the module, there is also a setjmp function declaration. Fixed it, and now the pass only converts longjmp and does not do any other transformation when there's no setjmp declaration in the module. Fixes PR39562. Reviewers: jgravelle-google, sbc100 Subscribers: dschuff, sunfish, llvm-commits Differential Revision: https://reviews.llvm.org/D54273 llvm-svn: 346445
* [x86] use shuffles for scalar insertion into high elements of a constant vectorSanjay Patel2018-11-081-4/+18
| | | | | | | | | | | | | | | As discussed in D54073, we have a potential regression from more aggressive vector narrowing here, so let's try to avoid that by changing build-vector lowering slightly. Insert-vector-element lowering always does this since there's no "pinsr" for ymm/zmm: // If the vector is wider than 128 bits, extract the 128-bit subvector, insert // into that, and then insert the subvector back into the result. ...but we can sometimes do better for insert-into-constant-vector by using shuffle lowering. Differential Revision: https://reviews.llvm.org/D54271 llvm-svn: 346433
* Revert "[MSP430] Add MC layer"Davide Italiano2018-11-0829-2684/+1092
| | | | | | | | | | | This commit broke the module buildbots. Error: lib/Target/MSP430/MSP430GenAsmMatcher.inc:1027:1: error: redundant namespace 'llvm' [-Wmodules-import-nested-redundant] ^ llvm-svn: 346410
* [SystemZ] Bugfix in shouldCoalesce()Jonas Paulsson2018-11-081-10/+15
| | | | | | | | | | | | | | | | | | | | It was discovered in randomized testing that the SystemZ implementation of shouldCoalesce() could be caused to crash when subreg liveness was enabled. This was because an undef use of the virtual register was copied outside current MBB at the point of shouldCoalesce() being called. For more details, see https://bugs.llvm.org/show_bug.cgi?id=39276. This patch changes the check for MBB locality from livein/liveout checks to do checks for all instructions of both intervals being inside MBB. This avoids the cases with dead defs / undef uses outside MBB, which are not affecting liveness in/out of MBB. The original test case included as a reduced .mir test case. Review: Ulrich Weigand https://reviews.llvm.org/D54197 llvm-svn: 346406
OpenPOWER on IntegriCloud