path: root/llvm/lib/Target
Commit message | Author | Age | Files | Lines
...
* Fix VS2017 narrowing conversion warning. NFCI | Simon Pilgrim | 2017-11-28 | 1 | -1/+1
  llvm-svn: 319240
* [X86] Remove unused variable. | Craig Topper | 2017-11-28 | 1 | -1/+0
  llvm-svn: 319239
* [X86] Remove code from combineUIntToFP that tried to favor UINT_TO_FP if legal when zero extending from vXi8/vXi16. | Craig Topper | 2017-11-28 | 1 | -3/+1
  The UINT_TO_FP is immediately converted to SINT_TO_FP when the node is re-evaluated because we'll detect that the sign bit is zero.
  llvm-svn: 319234
* [X86] Remove custom lowering for uint_to_fp from vXi8/vXi16. | Craig Topper | 2017-11-28 | 1 | -20/+1
  We have a DAG combine that uses a zero extend that should prevent this from ever occurring now.
  llvm-svn: 319233
* [Hexagon] Use stable sort for HexagonShuffler to remove non-deterministic ordering | Mandeep Singh Grang | 2017-11-28 | 1 | -5/+5
  Summary: This fixes failures in the following tests uncovered by D39245:
    LLVM :: CodeGen/Hexagon/args.ll
    LLVM :: CodeGen/Hexagon/constp-extract.ll
    LLVM :: CodeGen/Hexagon/expand-condsets-basic.ll
    LLVM :: CodeGen/Hexagon/gp-rel.ll
    LLVM :: CodeGen/Hexagon/packetize_cond_inst.ll
    LLVM :: CodeGen/Hexagon/simple_addend.ll
    LLVM :: CodeGen/Hexagon/swp-stages4.ll
    LLVM :: CodeGen/Hexagon/swp-vmult.ll
    LLVM :: CodeGen/Hexagon/swp-vsum.ll
    LLVM :: MC/Hexagon/align.s
    LLVM :: MC/Hexagon/asmMap.s
    LLVM :: MC/Hexagon/dis-duplex-p0.s
    LLVM :: MC/Hexagon/double-vector-producer.s
    LLVM :: MC/Hexagon/inst_select.ll
    LLVM :: MC/Hexagon/instructions/j.s
  Reviewers: colinl, kparzysz, adasgupt, slarin
  Reviewed By: kparzysz
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D40227
  llvm-svn: 319223
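Why a stable sort removes this kind of nondeterminism: std::sort may leave elements that compare equal in any order, whereas std::stable_sort keeps them in their input order. The block below is a minimal C++ sketch of that property only. The names `Insn`, `Weight`, and `sortByWeight` are hypothetical and are not taken from HexagonShuffler.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for an item being ordered; ties on Weight are common.
struct Insn {
  std::string Name;
  unsigned Weight;
};

// Sorts by Weight only. std::stable_sort guarantees that elements with equal
// Weight keep their original relative order, so the output does not depend on
// the standard library's sorting algorithm.
inline std::vector<Insn> sortByWeight(std::vector<Insn> Insns) {
  auto ByWeight = [](const Insn &A, const Insn &B) {
    return A.Weight < B.Weight;
  };
  std::stable_sort(Insns.begin(), Insns.end(), ByWeight);
  assert(std::is_sorted(Insns.begin(), Insns.end(), ByWeight));
  return Insns;
}
```

With input names a, b, c, d and weights 2, 1, 2, 1, the stable result is b, d, a, c on every platform. With std::sort, the order within each tie would be unspecified.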
* [PowerPC] Allow tail calls of fastcc functions from C CallingConv functions. | Sean Fertile | 2017-11-28 | 1 | -5/+10
  Allow fastcc callees to be tail-called from ccc callers.
  Differential Revision: https://reviews.llvm.org/D40355
  llvm-svn: 319218
* [aarch64][globalisel] Define G_ATOMIC_CMPXCHG and G_ATOMICRMW_* and make them legal | Daniel Sanders | 2017-11-28 | 3 | -3/+21
  The IRTranslator cannot generate these instructions at the moment so there's no issue with not having implemented ISel for them yet.
  D40092 will add G_ATOMIC_CMPXCHG_WITH_SUCCESS and G_ATOMICRMW_* to the IRTranslator and a further patch will add support for lowering G_ATOMIC_CMPXCHG_WITH_SUCCESS into G_ATOMIC_CMPXCHG with an external success check via the `Lower` action.
  The separation of G_ATOMIC_CMPXCHG_WITH_SUCCESS and G_ATOMIC_CMPXCHG is to import SelectionDAG rules while still supporting targets that prefer to custom lower the original LLVM-IR-like operation.
  llvm-svn: 319216
* [X86][SSE] Add SSE_HADDSUB/SSE_PABS/SSE_PALIGN OpndItins | Simon Pilgrim | 2017-11-28 | 1 | -45/+59
  Update multi-classes to take the scheduling OpndItins instead of hard coding it. Will be reused in the AVX512 equivalents.
  llvm-svn: 319209
* [X86] In lowerVectorShuffleAsElementInsertion, if we're able to find a scalar i8 or i16 and need to zero extend it, make sure we use a vXi32 type of the full vector width. | Craig Topper | 2017-11-28 | 1 | -1/+1
  Previously, this was hardcoded to v4i32, but if the input type is 256 bits we need to use v8i32.
  Fixes PR35443
  llvm-svn: 319208
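The lane count behind this fix is simple arithmetic: a vXi32 type that spans the full vector has one lane per 32 bits. A minimal C++ sketch, where `numI32Lanes` is a hypothetical helper and not an LLVM API:

```cpp
// Number of i32 lanes in a vector of the given total bit width.
// 128 bits gives v4i32, 256 bits gives v8i32.
constexpr unsigned numI32Lanes(unsigned VectorBits) { return VectorBits / 32; }

static_assert(numI32Lanes(128) == 4, "128-bit vectors use v4i32");
static_assert(numI32Lanes(256) == 8, "256-bit vectors use v8i32");
```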
* [Hexagon] Make sure to zero-extend bytes before building a vector | Krzysztof Parzyszek | 2017-11-28 | 1 | -10/+12
  llvm-svn: 319204
* [X86][X87] Tag FP_TO_INT_IN_MEM pseudos with hasNoSchedulingInfo | Simon Pilgrim | 2017-11-28 | 1 | -2/+2
  We don't need scheduling info for pseudos
  llvm-svn: 319197
* AMDGPU: Add num spilled s/vgprs to metadata | Konstantin Zhuravlyov | 2017-11-28 | 1 | -0/+2
  This was requested by tools.
  Differential Revision: https://reviews.llvm.org/D40321
  llvm-svn: 319192
* [CodeGen] Print register names in lowercase in both MIR and debug output | Francis Visoiu Mistrih | 2017-11-28 | 40 | -213/+213
  As part of the unification of the debug format and the MIR format, always print registers as lowercase.
  * Only debug printing is affected. It now follows MIR.
  Differential Revision: https://reviews.llvm.org/D40417
  llvm-svn: 319187
* [WebAssembly] Support bitcasted function addresses with varargs. | Dan Gohman | 2017-11-28 | 1 | -6/+5
  Generalize FixFunctionBitcasts to handle varargs functions. This in particular fixes the case where clang bitcasts away a varargs when calling a K&R-style function.
  This avoids interacting with tricky ABI details because it operates at the LLVM IR level before varargs ABI details are exposed.
  This fixes PR35385.
  llvm-svn: 319186
* [X86][X87] Tag FTST x87 instruction scheduler class | Simon Pilgrim | 2017-11-28 | 1 | -1/+2
  Looking through Agner, FTST is very similar to generic float compare behaviour, so I've added them to the existing IIC_FCOMI (WriteFAdd) tags.
  llvm-svn: 319184
* [X86][X87] Tag FABS/FCHS/FSQRT/FSIN/FCOS x87 instruction scheduler classes | Simon Pilgrim | 2017-11-28 | 3 | -16/+30
  Atom's FABS/FCHS/FSQRT latencies taken from Agner.
  Note: I just added FSIN and FCOS to the existing IIC_FSINCOS itinerary, which is actually a more costly instruction.
  llvm-svn: 319175
* Use getStoreSize() in various places instead of 'BitSize >> 3'. | Jonas Paulsson | 2017-11-28 | 2 | -12/+3
  This is needed for cases when the memory access is not as big as the width of the data type. For instance, storing i1 (1 bit) would be done in a byte (8 bits).
  Using 'BitSize >> 3' (or '/ 8') would e.g. give the memory access of an i1 a size of 0, which for instance makes alias analysis return NoAlias even when it shouldn't.
  There are no tests as this was done as a follow-up to the bugfix for the case where this was discovered (r318824). This handles more similar cases.
  Review: Björn Petterson
  https://reviews.llvm.org/D40339
  llvm-svn: 319173
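The difference between the two size computations can be shown with plain integer arithmetic. The C++ sketch below assumes that the store size is the bit width rounded up to whole bytes, which matches the i1 example in the commit message. `truncatedBytes` and `storeSizeInBytes` are hypothetical names, not the LLVM functions themselves.

```cpp
#include <cstdint>

// What 'BitSize >> 3' computes: division by 8 that discards the remainder.
constexpr uint64_t truncatedBytes(uint64_t BitSize) { return BitSize >> 3; }

// Bit width rounded up to whole bytes (assumed behaviour of a store size).
constexpr uint64_t storeSizeInBytes(uint64_t BitSize) {
  return (BitSize + 7) / 8;
}

// An i1 appears to touch zero bytes with the shift, but one byte when rounded up.
static_assert(truncatedBytes(1) == 0, "shift loses the partial byte");
static_assert(storeSizeInBytes(1) == 1, "i1 occupies one byte in memory");
// For widths that are a multiple of 8 the two agree.
static_assert(truncatedBytes(32) == storeSizeInBytes(32), "i32 is 4 bytes");
```

A zero-sized access cannot overlap anything, which is how the truncated size can lead an alias query to conclude NoAlias.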
* [CodeGen] Rename functions PrintReg* to printReg* | Francis Visoiu Mistrih | 2017-11-28 | 28 | -150/+150
  LLVM Coding Standards: Function names should be verb phrases (as they represent actions), and command-like function should be imperative. The name should be camel case, and start with a lower case letter (e.g. openFile() or isFoo()).
  Differential Revision: https://reviews.llvm.org/D40416
  llvm-svn: 319168
* [X86][3DNow] Add instruction itinerary and scheduling classes for femms/prefetch/prefetchw | Simon Pilgrim | 2017-11-28 | 1 | -6/+8
  llvm-svn: 319167
* AMDGPU: Re-organize the outer loop of SILoadStoreOptimizer | Nicolai Haehnle | 2017-11-28 | 1 | -6/+5
  Summary: The entire algorithm operates per basic-block, so for cache locality it should be better to re-optimize a basic-block immediately rather than in a separate loop.
  I don't have performance measurements.
  Change-Id: I85106570bd623c4ff277faaa50ee43258e1ddcc5
  Reviewers: arsenm, rampitec
  Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, llvm-commits, t-tye
  Differential Revision: https://reviews.llvm.org/D40344
  llvm-svn: 319156
* AMDGPU: Consistently check for immediates in SIInstrInfo::FoldImmediate | Nicolai Haehnle | 2017-11-28 | 1 | -23/+22
  Summary: The PeepholeOptimizer pass calls this function solely based on checking DefMI->isMoveImmediate(), which only checks the MoveImm bit of the instruction description. So it's up to FoldImmediate itself to properly check that DefMI *actually* moves from an immediate.
  I don't have a separate test case for this, but the next patch introduces a test case which happens to crash without this change.
  This error is caught by the assertion in MachineOperand::getImm().
  Change-Id: I88e7cdbcf54d75e1a296822e6fe5f9a5f095bbf8
  Reviewers: arsenm, rampitec
  Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits
  Differential Revision: https://reviews.llvm.org/D40342
  llvm-svn: 319155
* [WebAssembly] Handle errors better in fast-isel. | Dan Gohman | 2017-11-28 | 1 | -12/+40
  Fast-isel routines need to bail out in the case that fast-isel fails on the operands.
  This fixes https://bugs.llvm.org/show_bug.cgi?id=35064
  llvm-svn: 319144
* [X86] Remove some unused pattern fragments from td file. NFC | Craig Topper | 2017-11-28 | 1 | -10/+0
  llvm-svn: 319143
* [X86] Make zero extend from v16i1/v8i1 to v16i8/v8i16/v16i16 not scalarize under AVX512. | Craig Topper | 2017-11-28 | 1 | -0/+4
  llvm-svn: 319136
* ARM: Fix PR32578 | Matthias Braun | 2017-11-28 | 1 | -1/+1
  https://llvm.org/PR32578
  I simplified and converted the reproducer into a lit test.
  Patch by Vedant Kumar!
  llvm-svn: 319130
* [WebAssembly] Fix trapping behavior in fptosi/fptoui. | Dan Gohman | 2017-11-28 | 8 | -19/+227
  This adds code to protect WebAssembly's `trunc_s` family of opcodes from values outside their domain. Even though such conversions have full undefined behavior in C/C++, LLVM IR's `fptosi` and `fptoui` do not, and only return undef.
  This also implements the proposed non-trapping float-to-int conversion feature and uses that instead when available.
  llvm-svn: 319128
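The idea of guarding a trapping conversion can be sketched in C++ as a range check placed before the truncation. This is an illustration of the concept only, not the code the backend emits. `guardedFpToSi32` is a hypothetical name, and the fallback value is an arbitrary choice made here, since an out-of-range result is merely undef at the IR level.

```cpp
#include <cmath>
#include <cstdint>
#include <limits>

// Converts float to int32_t only when the value is inside the domain of a
// trapping truncation; otherwise returns a fixed substitute instead of
// performing the conversion.
inline int32_t guardedFpToSi32(float X) {
  // |X| < 2^31 is false for NaN and infinities, so they take the fallback too.
  // X == -2^31 also takes the fallback, which happens to be the exact result.
  if (std::fabs(X) < 2147483648.0f)
    return static_cast<int32_t>(X);
  return std::numeric_limits<int32_t>::min();
}
```

In-range inputs truncate toward zero as usual, so 1.9f yields 1 and -1.9f yields -1. An input such as 3e9f or NaN never reaches the cast.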
* [X86] Remove unnecessary fp<->int setOperationAction lines from a hasVLX block. NFCI | Craig Topper | 2017-11-28 | 1 | -7/+0
  These lines all exist identically either under SSE2, AVX2 or AVX512. Given that VLX implies all of those, these aren't providing anything new.
  llvm-svn: 319124
* [X86] Remove duplicate calls to setOperationAction. NFCI | Craig Topper | 2017-11-28 | 1 | -2/+0
  These same calls exist a few lines down.
  llvm-svn: 319122
* [X86] Teach getSetCCResultType to handle more than just SimpleVTs when looking at larger than 512-bit vectors. | Craig Topper | 2017-11-27 | 1 | -15/+12
  Which VTs are considered simple is determined by the superset of the legal types of all targets in LLVM. If we're looking at VTs that are going to be split down to 512 bits, we should allow any VT, not just simple ones, since the simple list changes over time as new targets are added.
  llvm-svn: 319110
* [X86] Remove lines that set v8f32 FP_ROUND/FP_EXTEND to Legal under AVX512. NFCI | Craig Topper | 2017-11-27 | 1 | -2/+0
  We don't do this for narrow vectors under AVX or SSE features. We also don't set them to Expand like we do for many vector ops. Nor does TargetLoweringBase.cpp. This leads me to believe these default to Legal.
  llvm-svn: 319103
* [PartiallyInlineLibCalls][x86] add TTI hook to allow sqrt inlining to depend on arg rather than result | Sanjay Patel | 2017-11-27 | 2 | -0/+5
  This should fix PR31455: https://bugs.llvm.org/show_bug.cgi?id=31455
  Differential Revision: https://reviews.llvm.org/D28314
  llvm-svn: 319094
* [PowerPC] Remove redundant TOC saves | Zaara Syeda | 2017-11-27 | 3 | -2/+87
  This patch adds a peephole optimization to remove any redundant TOC save instructions added as part of the call sequence for indirect calls. It removes any TOC saves within a function that are dominated by another TOC save.
  Differential Revision: https://reviews.llvm.org/D39736
  llvm-svn: 319087
* [X86] Remove an unused isel pattern that looked for pshufd with v4f32 type. | Craig Topper | 2017-11-27 | 1 | -12/+0
  I don't believe our current lowering/combining would ever produce such a node. We only produce integer typed pshufds.
  llvm-svn: 319068
* [X86] Teach combineX86ShuffleChain that AllowIntDomain requires at least SSE2. | Craig Topper | 2017-11-27 | 1 | -1/+1
  I don't have a good test case for this at the moment. I was playing around with a change in legalizing and triggered this code to produce a PSHUFD with sse1 only.
  llvm-svn: 319066
* [X86][AVX512] Tag AVX512 PACKSS/PACKUS/PMADDWD/PMADDUBSW instructions with SSE_PACK/SSE_PMADD schedule classes | Simon Pilgrim | 2017-11-27 | 2 | -20/+29
  llvm-svn: 319065
* [Hexagon] Implement HexagonSubtarget::isHVXVectorType | Krzysztof Parzyszek | 2017-11-27 | 2 | -27/+14
  llvm-svn: 319064
* [X86] Make getSetCCResultType return vXi1 for any vXi32/vXi64 vector over 512 bits long when AVX512 is enabled. | Craig Topper | 2017-11-27 | 1 | -1/+1
  Similar for vXi16/vXi8 with BWI.
  Any vector larger than 512 bits will be split to 512 bits during legalization. But without this we will fold sexts with them before that, making it difficult to recover and leading to scalarization.
  llvm-svn: 319059
* [X86][SSE] Fix roundpd instructions to correctly use IIC_SSE_ROUNDPD_* itineraries | Simon Pilgrim | 2017-11-27 | 1 | -2/+2
  llvm-svn: 319054
* [AMDGPU][MC][DISASSEMBLER][GFX9] Corrected decoding of GLOBAL/SCRATCH opcodes | Dmitry Preobrazhensky | 2017-11-27 | 3 | -6/+6
  See bug 35433: https://bugs.llvm.org/show_bug.cgi?id=35433
  Differential Revision: https://reviews.llvm.org/D40493
  Reviewers: artem.tamazov, SamWot, arsenm
  llvm-svn: 319050
* [Power9] Improvements to vector extract with variable index exploitation | Zaara Syeda | 2017-11-27 | 1 | -22/+174
  This patch extends rL307174 to not use the power9 vector extract with variable index instructions when extracting word element 1. For such cases, the existing selection of MFVSRWZ provides a better sequence.
  Differential Revision: https://reviews.llvm.org/D38287
  llvm-svn: 319049
* [X86][AVX512] Tag AVX512 sqrt instructions with SSE_SQRT schedule classes | Simon Pilgrim | 2017-11-27 | 1 | -29/+32
  llvm-svn: 319045
* [DAG] Do MergeConsecutiveStores again before Instruction Selection | Nirav Dave | 2017-11-27 | 1 | -2/+0
  Summary: Now that store-merge only generates type-safe stores, do a second pass just before instruction selection to allow lowered intrinsics to be merged as well.
  Reviewers: jyknight, hfinkel, RKSimon, efriedma, rnk, jmolloy
  Subscribers: javed.absar, llvm-commits
  Differential Revision: https://reviews.llvm.org/D33675
  llvm-svn: 319036
* [X86] Add INVLPGA to the existing INVLPG scheduling | Simon Pilgrim | 2017-11-27 | 1 | -3/+4
  llvm-svn: 319031
* [mips] fix asmstring of Ext and Ins instructions and mips16 JALRC/JRC | Petar Jovanovic | 2017-11-27 | 2 | -5/+5
  Make the print format consistent with other assembler instructions. Adding a tab character instead of space in asmstring of Ext and Ins instructions. Removing space around the tab character for JALRC and replacing space with tab in JRC.
  Patch by Milos Stojanovic.
  Differential Revision: https://reviews.llvm.org/D38144
  llvm-svn: 319030
* [AMDGPU] Add custom lowering for llvm.log{,10}.{f16,f32} intrinsics | Vedran Miletic | 2017-11-27 | 2 | -0/+32
  AMDGPU backend errors with "unsupported call to function" upon encountering a call to llvm.log{,10}.{f16,f32} intrinsics. This patch adds custom lowering to avoid that error on both R600 and SI.
  Reviewers: arsenm, jvesely
  Subscribers: tstellar
  Differential Revision: https://reviews.llvm.org/D29942
  llvm-svn: 319025
* [X86][FMA] Tag all FMA/FMA4 instructions with WriteFMA schedule class | Simon Pilgrim | 2017-11-27 | 10 | -52/+75
  As mentioned on PR17367, many instructions are missing scheduling tags, preventing us from setting 'CompleteModel = 1' for better instruction analysis.
  This patch deals with FMA/FMA4, which is one of the bigger offenders (along with AVX512 in general).
  Annoyingly, all scheduler models need to define WriteFMA (now that it's actually used), even for older targets without FMA/FMA4 support, but that is an existing problem shared by other schedule classes.
  Differential Revision: https://reviews.llvm.org/D40351
  llvm-svn: 319016
* [ARM] Fix an off-by-one error when restoring LR for 16-bit Thumb | Momchil Velikov | 2017-11-27 | 1 | -1/+1
  The commit https://reviews.llvm.org/rL318143 incorrectly computes the offset to restore LR from. The number of tPOP operands is 2 (condition) + 2 (implicit def and use of SP) + count of the popped registers.
  We need to load LR from just past the last register, hence the correct offset should be either getNumOperands() - 4 or getNumExplicitOperands() - 2 (multiplied by 4).
  Differential revision: https://reviews.llvm.org/D40305
  llvm-svn: 319014
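The operand arithmetic in that message can be checked with a small model. The C++ sketch below encodes only the counts stated in the commit message. `TPopModel` and `lrOffsetBytes` are hypothetical names and do not correspond to the MachineInstr API.

```cpp
// Hypothetical model of a tPOP operand list, per the commit message:
// 2 predicate operands + 2 implicit SP operands + one per popped register.
struct TPopModel {
  unsigned NumPoppedRegs;
  unsigned numOperands() const { return 2 + 2 + NumPoppedRegs; }
  unsigned numExplicitOperands() const { return 2 + NumPoppedRegs; }
};

// Byte offset of the slot just past the last popped register, where LR is
// loaded from. Each register occupies 4 bytes.
inline unsigned lrOffsetBytes(const TPopModel &Pop) {
  return (Pop.numOperands() - 4) * 4;
}
```

For a pop of three registers the model has 7 operands, 5 of them explicit, and both formulas from the message give (7 - 4) * 4 = (5 - 2) * 4 = 12 bytes.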
* Update BTVER2 sched numbers for SSE42 string instructions. | Andrew V. Tischenko | 2017-11-27 | 1 | -24/+30
  Differential Revision: https://reviews.llvm.org/D39846
  llvm-svn: 319013
* [X86] Fix an assert that was incorrectly checking for BMI instead of AVX512VBMI. | Craig Topper | 2017-11-26 | 1 | -2/+1
  The check is actually unnecessary since AVX512VBMI implies AVX512BW which is the other part of the assert.
  llvm-svn: 319006
* [X86][3DNow] Add 3DNow! instruction itinerary and scheduling classes | Simon Pilgrim | 2017-11-26 | 2 | -37/+84
  llvm-svn: 319005