path: root/llvm/lib/Target
Commit message / Author / Age / Files / Lines
...
* [X86] Use ANY_EXTEND instead of SIGN_EXTEND in the AVX2 and later path for ↵Craig Topper2018-11-161-2/+2
| | | | | | | | | | legalizing vXi8 multiply. We aren't going to use the upper bits of the multiply result that the extend would affect, so we don't need a specific type of extend. This makes some reduction test cases shorter because we were previously trying to sign_extend a truncate, which we can't eliminate. llvm-svn: 347011
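The reasoning in this message can be checked with a small model. The following is a minimal sketch, not the X86 lowering itself: it treats each lane as a 16-bit integer whose low byte holds the i8 value, and shows that the low 8 bits of the product do not depend on what the extend put in the upper byte. All function names are hypothetical.

```python
def extend(x: int, upper: int) -> int:
    """Build a 16-bit lane: x in the low byte, arbitrary bits in the high byte."""
    return ((upper & 0xFF) << 8) | (x & 0xFF)


def sign_extend(x: int) -> int:
    """Model SIGN_EXTEND from i8 to i16."""
    return extend(x, 0xFF if x & 0x80 else 0x00)


def zero_extend(x: int) -> int:
    """Model ZERO_EXTEND from i8 to i16."""
    return extend(x, 0x00)


def low8_product(a_ext: int, b_ext: int) -> int:
    """Multiply two widened lanes and keep only the low 8 bits (the truncate)."""
    return (a_ext * b_ext) & 0xFF


def all_extends_agree() -> bool:
    """Exhaustively compare sign, zero, and 'garbage' extends over all i8 pairs."""
    for a in range(256):
        for b in range(256):
            expected = (a * b) & 0xFF
            candidates = (
                low8_product(sign_extend(a), sign_extend(b)),
                low8_product(zero_extend(a), zero_extend(b)),
                low8_product(extend(a, 0xA5), extend(b, 0x3C)),
            )
            if any(c != expected for c in candidates):
                return False
    return True
```

Because the truncated result is the same for every choice of upper bits, an any-extend is sufficient in this model.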
* [X86] Update a couple comments to remove a mention of a sign extending that ↵Craig Topper2018-11-161-3/+3
| | | | | | no longer happens. NFC llvm-svn: 347010
* [AMDGPU] Add FixupVectorISel pass, currently Supports SREGs in GLOBAL LD/STRon Lieberman2018-11-168-4/+263
| | | | | | | | | Add a pass to fixup various vector ISel issues. Currently we handle converting GLOBAL_{LOAD|STORE}_* and GLOBAL_Atomic_* instructions into their _SADDR variants. This involves feeding the sreg into the saddr field of the new instruction. llvm-svn: 347008
* [WebAssembly] Split BBs after throw instructionsHeejin Ahn2018-11-161-14/+44
| Summary:
| The `throw` instruction is a terminator in wasm, but BBs were not split after `throw` instructions, causing the machine instruction verifier to fail. This patch
| - Splits BBs after `throw` instructions in WasmEHPrepare and adds an unreachable instruction after `throw`, which will be deleted in the LateEHPrepare pass
| - Refactors WasmEHPrepare into two member functions
| - Changes the semantics of `eraseBBsAndChildren` in the LateEHPrepare pass to match that of the WasmEHPrepare pass, which is newly added. Now `eraseBBsAndChildren` does not delete BBs with remaining predecessors.
| - Fixes style nits, making static function names conform to clang-tidy
| - Re-enables the test temporarily disabled by rL346840 && rL346845
|
| Reviewers: dschuff
| Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits
| Differential Revision: https://reviews.llvm.org/D54571
| llvm-svn: 347003
* [AMDGPU] NFC Test commitRon Lieberman2018-11-161-1/+1
| | | | llvm-svn: 347002
* AMDHSA: More code object v3 fixes:Konstantin Zhuravlyov2018-11-151-1/+2
| - Make sure IsaInfo::hasCodeObjectV3 returns true only for AMDHSA
| - Update assembler metadata tests to use v2 by default
|
| llvm-svn: 347001
* [X86] Remove ANY_EXTEND special case from canReduceVMulWidthCraig Topper2018-11-151-18/+2
| | | | | | | | | | Removing this code doesn't affect any lit tests so it doesn't appear to be tested anymore. I assume it was when it was added, but I guess something else changed? The code coverage report also says it's unused. I mostly didn't like that it seemed to count the sign bits as if it was a sign_extend, but then set isPositive as if it was a zero_extend. It feels like we should have picked one interpretation? Differential Revision: https://reviews.llvm.org/D54596 llvm-svn: 346995
* [X86] Minor cleanup to getExtendInVec. NFCICraig Topper2018-11-151-4/+7
| | | | | | | | | | Use unsigned to calculate the subvector index to avoid a cast. Remove an unnecessary condition and replace it with a stronger assert. Use the InVT variable we updated when we extracted instead of grabbing it from the In SDValue. llvm-svn: 346983
* [X86] Add -x86-experimental-vector-widening support to reduceVMULWidth and ↵Craig Topper2018-11-151-14/+22
| | | | | | | | | | | | combineMulToPMADDWD In reduceVMULWidth, we no longer need to worry about extending the vector to 128 bits first. Regular widening of extends, muls and shuffles will take care of that for us. In combineMulToPMADDWD, we can handle v2i32 multiplies and allow the VPMADDWD to be widened to v4i32 during type legalization by adding custom widening like we have for AVG/ADDUS/SUBUS. I had to modify that code a little to allow different input and output VTs. Differential Revision: https://reviews.llvm.org/D54512 llvm-svn: 346980
* [WebAssembly] Fix return type of nextByteThomas Lively2018-11-151-2/+2
| | | | | | | | | | | | | | Summary: The old return type did not allow for correct error reporting and was causing a compiler warning. Reviewers: aheejin Subscribers: dschuff, sbc100, jgravelle-google, sunfish, llvm-commits Differential Revision: https://reviews.llvm.org/D54586 llvm-svn: 346979
* [X86] Fix MCNullStreamer support for modules with a CodeView flag Simon Pilgrim2018-11-151-8/+8
| | | | | | | | | | | | This fixes -filetype=null support when compiling for a Win32 target and the module has a CodeView flag. The only places changed are the uses of getTargetStreamer function - this patch guards both of them with null checks. Committed on behalf of @eush (Eugene Sharygin) Differential Revision: https://reviews.llvm.org/D54008 llvm-svn: 346962
* [RISCV] Mark C.EBREAK instruction as having side effectsAlex Bradbury2018-11-151-1/+1
| | | | | | | | | | | | | | | C.EBREAK was defined with hasSideEffects = 0, which is incorrect and inconsistent with the non-compressed instruction form. This patch corrects this oversight. This wouldn't cause codegen issues, as compressed instructions are only ever generated by converting the non-compressed form as an MCInst. But having correct flags is still worthwhile. Differential Revision: https://reviews.llvm.org/D54256 Patch by Luís Marques. llvm-svn: 346959
* [RISCV] Mark FREM as ExpandAlex Bradbury2018-11-151-1/+1
| | | | | | | | | | | | Mark the FREM SelectionDAG node as Expand, which is necessary in order to support the frem IR instruction on RISC-V. This is expanded into a library call. Adds the corresponding test. Previously, this would have triggered an assertion at instruction selection time. Differential Revision: https://reviews.llvm.org/D54159 Patch by Luís Marques. llvm-svn: 346958
* Add missed files from prev. commitAnton Korobeynikov2018-11-1511-0/+1572
| | | | llvm-svn: 346949
* [MSP430] Add MC layerAnton Korobeynikov2018-11-1518-1092/+1113
| | | | | | | | | | | | | | Reapply r346374 with the fixes for modules build. Original summary: This change implements assembler parser, code emitter, ELF object writer and disassembler for the MSP430 ISA. Also, more instruction forms are added to the target description. Patch by Michael Skvortsov! llvm-svn: 346948
* [RISCV] Introduce the RISCVMatInt::generateInstSeq helperAlex Bradbury2018-11-154-72/+132
| Logic to load 32-bit and 64-bit immediates is currently present in RISCVAsmParser::emitLoadImm in order to support the li pseudoinstruction. With the introduction of RV64 codegen, there is a greater benefit in sharing immediate materialisation logic between the MC layer and codegen.
|
| The generateInstSeq helper allows this by producing a vector of simple structs representing the chosen instructions. This can then be consumed in the MC layer to produce MCInsts or at instruction selection time to produce appropriate SelectionDAG nodes. Sharing this logic means that both the li pseudoinstruction and codegen can benefit from future optimisations, and that this logic can be used for materialising constants during RV64 codegen.
|
| This patch does contain a behaviour change: addi will now be produced on RV64 when no lui is necessary to materialise the constant. In that case addiw takes x0 as the source register, so is semantically identical to addi.
|
| Differential Revision: https://reviews.llvm.org/D52961
| llvm-svn: 346937
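To illustrate the idea of a shared helper that returns a list of simple instruction records, here is a minimal sketch restricted to 32-bit signed immediates. It is not the RISCVMatInt code: the function names, the tuple representation, and the tiny interpreter are assumptions made for illustration. It uses the standard lui/addi split, where the upper 20 bits are rounded to compensate for the sign-extended low 12 bits.

```python
def sext(value: int, bits: int) -> int:
    """Interpret the low `bits` bits of value as a two's-complement integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def generate_inst_seq_32(val: int) -> list:
    """Return (opcode, immediate) pairs that materialise a 32-bit signed value.

    Hypothetical sketch: a single addi when the value fits in 12 bits,
    otherwise lui followed by an optional addi.
    """
    assert -(1 << 31) <= val < (1 << 31)
    if -2048 <= val < 2048:
        return [("addi", val)]  # source register is x0, so no lui is needed
    hi20 = ((val + 0x800) >> 12) & 0xFFFFF  # rounded to absorb the signed low part
    lo12 = sext(val, 12)
    seq = [("lui", hi20)]
    if lo12 != 0:
        seq.append(("addi", lo12))
    return seq


def execute(seq: list) -> int:
    """Run the sequence on a single 32-bit register that starts as x0 (zero)."""
    reg = 0
    for opcode, imm in seq:
        if opcode == "lui":
            reg = sext(imm << 12, 32)
        elif opcode == "addi":
            reg = sext(reg + imm, 32)
        else:
            raise ValueError(opcode)
    return reg
```

A consumer in an assembler or an instruction selector would walk the returned list and emit its own representation for each record, which is the sharing the message describes.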
* [X86] Add some custom type legalization rules for truncate with ↵Craig Topper2018-11-151-0/+64
| | | | | | | | -x86-experimental-vector-widening-legalization. This avoids some nasty shuffles when we have avx512. It will also prevent using zmm truncate instructions when a ymm instruction that zeroes part of an xmm register will do. Also avoid using avx512 truncate instructions when the input is 128 bits or less. These instructions are 2 uops on skx so we can probably find a better single uop shuffle like pshufb. llvm-svn: 346936
* [WebAssembly] Renumber SIMD bitwise instructionsThomas Lively2018-11-151-7/+7
| | | | | | | | | | | | Summary: Changed to match https://github.com/WebAssembly/simd/pull/54. Reviewers: aheejin Subscribers: dschuff, sbc100, jgravelle-google, sunfish, llvm-commits Differential Revision: https://reviews.llvm.org/D54561 llvm-svn: 346931
* AMDGPU: Enable code object v3 for AMDHSA onlyKonstantin Zhuravlyov2018-11-152-17/+34
| | | | | | Differential Revision: https://reviews.llvm.org/D54186 llvm-svn: 346923
* [X86] Don't mark SEXTLOADS with narrow types as Custom with ↵Craig Topper2018-11-151-8/+27
| | | | | | -x86-experimental-vector-widening-legalization. The narrow types end up requesting widening, but generic legalization will end up scalarizing and using a build_vector to do the widening. llvm-svn: 346916
* [X86] Remove unused variableBenjamin Kramer2018-11-141-1/+0
| | | | llvm-svn: 346909
* [X86] Support v2i32/v4i16/v8i8 load/store using f64 on 32-bit targets under ↵Craig Topper2018-11-141-15/+38
| | | | | | | | -x86-experimental-vector-widening-legalization. On 64-bit targets the type legalizer will use i64 to legalize these. But when i64 isn't legal, the type legalizer won't try an FP type. So do it manually instead. There are a few regressions in here due to some v2i32 operations like mul and div now being reassembled into a full vector just to store instead of storing the pieces. But this was already occurring in 64-bit mode so it's not a new issue. llvm-svn: 346908
* [MachineOutliner][NFC] Don't compute liveness if X16/X17/NZCV are unusedJessica Paquette2018-11-141-16/+32
| | | | | | | | | | | Using the MBB flags, we can tell if X16/X17/NZCV are unused in a block, and also not live out. If this holds for all MBBs, then we can avoid checking for liveness on that candidate. Furthermore, if it holds for an individual candidate's MBB, then we can avoid checking for liveness on that candidate. llvm-svn: 346901
* Bias physical register immediate assignmentsNirav Dave2018-11-142-4/+4
| The machine scheduler currently biases register copies to/from physical registers to be closer to their point of use / def to minimize their live ranges. This change extends this to also cover physical register assignments from immediate values.
|
| This causes a reduction in overall register pressure and a minor reduction in spills, and indirectly fixes an out-of-registers assertion (PR39391).
|
| Most test changes are from minor instruction reorderings and register name selection changes and direct consequences of that.
|
| Reviewers: MatzeB, qcolombet, myatsina, pcc
| Subscribers: nemanjai, jvesely, nhaehnle, eraman, hiraditya, javed.absar, arphaman, jfb, jsji, llvm-commits
| Differential Revision: https://reviews.llvm.org/D54218
| llvm-svn: 346894
* AMDGPU: Additional pattern for i16 median3 matchingAakanksha Patil2018-11-141-4/+17
| | | | | | | | min(max(a, b), max(min(a, b), c)) Differential Revision: https://reviews.llvm.org/D54494 llvm-svn: 346886
* [X86] Allow pmulh to be formed from narrow vXi16 vectors under ↵Craig Topper2018-11-141-2/+4
| | | | | | | | | | -x86-experimental-vector-widening-legalization Narrower vectors will be widened to 128 bits without changing the element size. And generic type legalization can already handle widening mulhu/mulhs. Differential Revision: https://reviews.llvm.org/D54513 llvm-svn: 346879
* [CostModel] Add generic expansion funnel shift cost supportSimon Pilgrim2018-11-141-13/+11
| | | | | | | | Add support for the expansion of funnelshift/rotates to getIntrinsicInstrCost. This also required us to move the X86 fshl/fshr costs to the same place as the rotates to avoid expansion and get correct scalarization vs vectorization costs. llvm-svn: 346854
* [X86][AVX512] Remove constant pool shuffle decoding from SelectionDAGSimon Pilgrim2018-11-141-3/+3
| | | | | | | | This patch removes the last use of the constant pool shuffle decode helper and consistently uses the 'getTargetShuffleMaskIndices' versions instead. The constant pool versions are now purely used for assembly comments. The avx512vbmi intrinsic upgrades had to be altered as they were being decoded as broadcasts, similar to what I fixed in rL346032. I don't think the change is critical, although it's annoying that we lose the {k}{z} instruction test coverage as they are tricky to generate. Differential Revision: https://reviews.llvm.org/D54083 llvm-svn: 346850
* [WebAssembly] Add support for the event sectionHeejin Ahn2018-11-1413-31/+150
| Summary:
| This adds support for the 'event section' specified in the exception handling proposal. (This was named 'exception section' first, but later renamed to 'event section' to take possibilities of other kinds of events into consideration. But currently we only store exception info in this section.)
|
| The event section is added between the global section and the export section. This is for ease of validation per request of the V8 team.
|
| This patch:
| - Creates the event symbol type, which is a weak symbol
| - Makes 'throw' instruction take the event symbol '__cpp_exception'
| - Adds relocation support for events
| - Adds WasmObjectWriter / WasmObjectFile (Reader) support
| - Adds obj2yaml / yaml2obj support
| - Adds '.eventtype' printing support
|
| Reviewers: dschuff, sbc100, aardappel
| Subscribers: jgravelle-google, sunfish, llvm-commits
| Differential Revision: https://reviews.llvm.org/D54096
| llvm-svn: 346825
* [PowerPC] Enhance the selection(ISD::VSELECT) of vector typeZi Xuan Wu2018-11-144-15/+35
| | | | | | | | Make ISD::VSELECT legal whenever Altivec instructions are available; otherwise its default behavior is Expand, which is legalized in the type-legalization phase. Use xxsel to match vselect if VSX is enabled, otherwise use vsel. Differential Revision: https://reviews.llvm.org/D49531 llvm-svn: 346824
* [MachineOutliner][NFC] Use flags set in all candidates to check for callsJessica Paquette2018-11-131-6/+11
| | | | | | | | | | | If we keep track of if the ContainsCalls bit is set in the MBB flags for each candidate, then we have a better chance of not checking the candidate for calls at all. This saves quite a few checks in some CTMark tests (~200 in Bullet, for example.) llvm-svn: 346816
* [MachineOutliner][NFC] Use MBB flags to avoid call checks in getOutliningInfoJessica Paquette2018-11-131-21/+25
| | | | | | | | | | | | | We already determine a bunch of information about an MBB in getMachineOutlinerMBBFlags. We can reuse that information to avoid calculating things that must be false/true. The first thing we can easily check is if an outlined sequence could ever contain calls. There's no reason to walk over the outlined range, checking for calls, if we already know that there are no calls in the block containing the sequence. llvm-svn: 346809
* [MachineOutliner][NFC] Exit getOutliningType if there are < 2 candidatesJessica Paquette2018-11-131-2/+2
| | | | | | | Since we never outline anything with fewer than 2 occurrences, there's no reason to compute cost model information if there's less than that. llvm-svn: 346803
* [AMDGPU] combine extractelement into several selectsStanislav Mekhanoshin2018-11-131-4/+26
| | | | | | | | | | An extractelement with non-constant index will be lowered either to scratch or movrel loop in most cases. This patch converts such instruction into a set of selects if vector size is not too big. Differential Revision: https://reviews.llvm.org/D54351 llvm-svn: 346800
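The transformation described here can be modelled as a chain of compare-and-select operations. This is a minimal sketch of the idea, not the AMDGPU combine itself; the function names are hypothetical, and an out-of-range index simply yields element 0 in this model, whereas the IR result for such an index is not defined.

```python
def select(cond: bool, if_true, if_false):
    """Model a scalar select: pick if_true when cond holds, else if_false."""
    return if_true if cond else if_false


def extract_via_selects(vec: list, idx: int):
    """Extract vec[idx] without indexed addressing, using one select per element."""
    result = vec[0]
    for i in range(1, len(vec)):
        # Each step is a compare against a constant plus a select.
        result = select(idx == i, vec[i], result)
    return result
```

The chain grows linearly with the vector length, which is consistent with the message limiting the rewrite to vectors that are not too big.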
* [SelectionDAG][X86] Relax restriction on the width of an input to ↵Craig Topper2018-11-135-210/+245
| *_EXTEND_VECTOR_INREG. Use them and regular *_EXTEND to replace the X86 specific VSEXT/VZEXT opcodes
|
| Previously, the extend_vector_inreg opcodes required their input register to be the same total width as their output. But this doesn't match up with how the X86 instructions are defined. For X86 the input just needs to be a legal type with at least enough elements to cover the output.
|
| This patch weakens the check on these nodes and allows them to be used as long as they have more input elements than output elements. I haven't changed type legalization behavior so it will still create them with matching input and output sizes.
|
| X86 will custom legalize these nodes by shrinking the input to be a 128 bit vector and once we've done that we treat them as legal operations. We still have one case during type legalization where we must custom handle v64i8 on avx512f targets without avx512bw where v64i8 isn't a legal type. In this case we will custom type legalize to a *extend_vector_inreg with a v16i8 input. After that the input is a legal type so type legalization should ignore the node and doesn't need to know about the relaxed restriction.
|
| We are no longer allowed to use the default expansion for these nodes during vector op legalization since the default expansion uses a shuffle which requires the widths to match. Custom legalization for all types will prevent us from reaching the default expansion code.
|
| I believe DAG combine works correctly with the relaxed restriction because it doesn't check the number of input elements.
|
| The rest of the patch is changing X86 to use either the vector_inreg nodes or the regular zero_extend/sign_extend nodes. I had to add additional isel patterns to handle any_extend during isel since simplifydemandedbits can create them at any time so we can't legalize to zero_extend before isel. We don't yet create any_extend_vector_inreg in simplifydemandedbits.
|
| Differential Revision: https://reviews.llvm.org/D54346
| llvm-svn: 346784
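As a rough model of what these nodes compute under the relaxed rule, the sketch below extends only the low elements of the input and ignores the rest, so the input needs enough elements to cover the output but not the same total width. The function name and the list-of-integers representation are assumptions for illustration only.

```python
def extend_vector_inreg(inp: list, in_bits: int, out_count: int, signed: bool) -> list:
    """Extend the low out_count elements of inp; higher input elements are unused."""
    assert len(inp) >= out_count, "input must cover every output element"
    mask = (1 << in_bits) - 1

    def extend_lane(x: int) -> int:
        x &= mask
        if signed and x >> (in_bits - 1):
            return x - (1 << in_bits)  # sign bit set: negative value
        return x                       # zero-extend, or non-negative value

    return [extend_lane(x) for x in inp[:out_count]]
```

For example, a 16-element i8 input can feed a 4-element or an 8-element result in this model; only the first 4 or 8 input bytes matter.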
* [WebAssembly] Fix broken assumption that all bitcasts are to function typesSam Clegg2018-11-131-26/+43
| | | | | | | | | | Specifically, we can bitcast to void. Fixes PR39591 Differential Revision: https://reviews.llvm.org/D54447 llvm-svn: 346778
* [CostModel][X86] Fix constant vector XOP right shiftsSimon Pilgrim2018-11-131-2/+11
| | | | | | | | We'll constant fold these cases so they are as cheap as vector left shift cases. Noticed while improving funnel shift costs. llvm-svn: 346760
* Fix comment for XOP rotates. NFCI.Simon Pilgrim2018-11-131-1/+1
| | | | llvm-svn: 346753
* Fix modules build of AVRAsmParser.cppAlexander Richardson2018-11-131-3/+4
| | | | | | | | | | | | | | | | | Summary: Without this change I get the following error: lib/Target/AVR/AVRGenAsmMatcher.inc:1135:1: error: redundant #include of module 'LLVM_Utils.Support.Format' appears within namespace 'llvm' [-Wmodules-import-nested-redundant] Reviewers: dylanmckay Reviewed By: dylanmckay Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D53425 llvm-svn: 346750
* [SystemZ] Increase the number of VLREPsJonas Paulsson2018-11-132-0/+43
| | | | | | | | | | | | | | | If a loaded value is replicated it is best to combine these two operations into a VLREP (load and replicate), but isel will not produce this if the load has other users as well. This patch handles this by putting the other users of the load to use the REPLICATE 0-element instead of the load. This way the load has only the REPLICATE node as user, and we get a VLREP. Review: Ulrich Weigand https://reviews.llvm.org/D54264 llvm-svn: 346746
* [MachineOutliner][NFC] Simplify isMBBSafeToOutlineFrom check in AArch64 outlinerJessica Paquette2018-11-131-20/+19
| | | | | | | | | Turns out it's way simpler to do this check with one LRU. Instead of maintaining two, just keep one. Check if each of the registers is available, and then check if it's a live out from the block. If it's a live out, but available in the block, we know we're in an unsafe case. llvm-svn: 346721
* [MachineOutliner][NFC] Change getMachineOutlinerMBBFlags to ↵Jessica Paquette2018-11-122-17/+33
| | | | | | | | | | | | isMBBSafeToOutlineFrom Instead of returning Flags, return true if the MBB is safe to outline from. This lets us check for unsafe situations, like say, in AArch64, X17 is live across a MBB without being defined in that MBB. In that case, there's no point in performing an instruction mapping. llvm-svn: 346718
* [X86][SSE] Add lowerVectorShuffleAsByteRotateAndPermute (PR39387)Simon Pilgrim2018-11-121-8/+115
| | | | | | | | This patch adds the ability to use a PALIGNR to rotate a pair of inputs to select a range containing all the referenced elements, followed by a single input permute to put them in the right location. Differential Revision: https://reviews.llvm.org/D54267 llvm-svn: 346706
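A simplified model of the two-step lowering follows. It is a sketch under strong assumptions: vectors are flat Python lists rather than 128-bit lanes of bytes, the rotate is modelled loosely on PALIGNR as a window over the concatenated inputs, and the window is chosen naively from the smallest referenced index. The function names are hypothetical and the real lowering has more constraints.

```python
def byte_rotate(hi: list, lo: list, imm: int) -> list:
    """Take len(lo) elements of the concatenation lo+hi, starting at offset imm."""
    concat = lo + hi
    return concat[imm:imm + len(lo)]


def shuffle_two_inputs(a: list, b: list, mask: list) -> list:
    """Reference two-input shuffle: index i of the mask selects from a+b."""
    concat = a + b
    return [concat[m] for m in mask]


def lower_as_rotate_and_permute(a: list, b: list, mask: list):
    """Return the shuffle result via rotate + single-input permute, or None."""
    n = len(a)
    first, last = min(mask), max(mask)
    if last - first >= n:
        return None  # referenced elements do not fit in one rotated window
    rotated = byte_rotate(b, a, first)        # step 1: rotate the input pair
    return [rotated[m - first] for m in mask]  # step 2: permute one input
```

When every referenced element falls inside one window of the input width, the single-input permute sees all of them, which is the property the message relies on.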
* AMDGPU: Adding more median3 patternsAakanksha Patil2018-11-122-9/+22
| | | | | | | | min(max(a, b), max(min(a, b), c)) -> med3 a, b, c Differential Revision: https://reviews.llvm.org/D54331 llvm-svn: 346704
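The identity behind this pattern can be checked exhaustively on a small range. The sketch below is plain integer arithmetic, not the AMDGPU pattern definition, and the function names are hypothetical.

```python
def med3_pattern(a: int, b: int, c: int) -> int:
    """The min/max expression from the commit message."""
    return min(max(a, b), max(min(a, b), c))


def median3(a: int, b: int, c: int) -> int:
    """Reference median of three values."""
    return sorted([a, b, c])[1]


def pattern_matches_median(lo: int = -3, hi: int = 4) -> bool:
    """Compare the pattern with the reference for every triple in [lo, hi)."""
    values = range(lo, hi)
    return all(
        med3_pattern(a, b, c) == median3(a, b, c)
        for a in values for b in values for c in values
    )
```

The three cases are c below both, c above both, and c in between; in each the expression reduces to the middle value.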
* [WebAssembly] Added WasmAsmParser.Wouter van Oortmerssen2018-11-121-74/+31
| | | | | | | | | | | | | | | | | | | | | | | Summary: This is to replace the ELFAsmParser that WebAssembly was using, which so far was a stub that didn't do anything, and couldn't work correctly with wasm. This new class is there to implement generic directives related to wasm as a binary format. Wasm target specific directives are still parsed in WebAssemblyAsmParser as before. The two classes now cooperate more correctly too. Also implemented .result which was missing. Any unknown directives will now result in errors. Reviewers: dschuff, sbc100 Subscribers: mgorny, jgravelle-google, eraman, aheejin, sunfish, llvm-commits Differential Revision: https://reviews.llvm.org/D54360 llvm-svn: 346700
* [X86] In LowerMULH, use generic truncate and vector shuffle nodes instead of ↵Craig Topper2018-11-121-13/+18
| | | | | | | | | | directly emitting PACKUS. Truncate and shuffle lowering are already capable of matching to PACKUS using known bits analysis. This features one test change where we now prefer to extend v16i16->v16i32 then trunc v16i32->v16i8 over extract_subvector+packus when avx512f is available, but avx512bw is not. llvm-svn: 346697
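The reason a plain truncate can be matched to PACKUS when known-bits analysis succeeds is that unsigned saturation is a no-op once the upper byte is known to be zero. The sketch below models one 16-bit lane; the function names are hypothetical and this is not the X86 lowering code.

```python
def packus_lane(x: int) -> int:
    """Model one lane of a signed-16 to unsigned-8 saturating pack."""
    return max(0, min(255, x))


def truncate_lane(x: int) -> int:
    """Model one lane of a plain i16 to i8 truncate."""
    return x & 0xFF


def pack_equals_truncate_when_upper_bits_zero() -> bool:
    """Check every lane value whose upper 8 bits are known to be zero."""
    return all(packus_lane(x) == truncate_lane(x) for x in range(256))
```

Outside that range the two differ (for instance 300 saturates to 255 but truncates to 44), which is why the known-bits check is needed.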
* [AMDGPU] Optimize S_CBRANCH_VCC[N]Z -> S_CBRANCH_EXEC[N]ZStanislav Mekhanoshin2018-11-121-0/+97
| Sometimes after basic block placement we end up with code like:
|
|   sreg = s_mov_b64 -1
|   vcc = s_and_b64 exec, sreg
|   s_cbranch_vccz
|
| This happens as a join of a block assigning -1 to a saved mask and another block which consumes that saved mask with s_and_b64 and a branch.
|
| This is essentially a single s_cbranch_execz instruction when moved into a single new basic block.
|
| Differential Revision: https://reviews.llvm.org/D54164
| llvm-svn: 346690
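The equivalence claimed here follows from ANDing with an all-ones mask. The sketch below models the 64-bit masks as Python integers; the function names are hypothetical and the model ignores everything about the pass except this one identity.

```python
MASK64 = (1 << 64) - 1


def vccz_after_and(exec_mask: int) -> bool:
    """Model the three-instruction sequence and return the vccz branch condition."""
    sreg = MASK64                      # sreg = s_mov_b64 -1
    vcc = (exec_mask & sreg) & MASK64  # vcc = s_and_b64 exec, sreg
    return vcc == 0                    # s_cbranch_vccz is taken when vcc is zero


def execz(exec_mask: int) -> bool:
    """Branch condition of s_cbranch_execz: taken when exec is zero."""
    return (exec_mask & MASK64) == 0
```

Since exec AND -1 is exec, both conditions agree for every mask value in this model.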
* [CostModel][X86] Add funnel shift rotation special case costsSimon Pilgrim2018-11-121-1/+82
| | | | | | When we repeat the 2 shifting operands then this is a bit rotation - annoyingly this has to be done in the other getIntrinsicInstrCost than most intrinsics as we need to check the operands are the same. llvm-svn: 346688
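The observation that a funnel shift with both shifted operands equal is a rotate can be checked directly. This sketch follows the usual definition of a funnel shift left (concatenate the two operands, shift left by the amount modulo the width, keep the high half); the function names and the default 32-bit width are assumptions for illustration.

```python
def fshl(a: int, b: int, s: int, width: int = 32) -> int:
    """Funnel shift left: high `width` bits of (a:b) shifted left by s mod width."""
    mask = (1 << width) - 1
    s %= width
    wide = ((a & mask) << width) | (b & mask)
    return ((wide << s) >> width) & mask


def rotl(x: int, s: int, width: int = 32) -> int:
    """Rotate x left by s mod width bits."""
    mask = (1 << width) - 1
    s %= width
    x &= mask
    return ((x << s) | (x >> (width - s))) & mask
```

A cost model has to compare the two operands to detect this case, which matches the message's remark about needing to check that the operands are the same.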
* [CostModel][X86] Add SHLD/SHRD scalar funnel shift costsSimon Pilgrim2018-11-121-2/+11
| | | | | | The costs match the typical reg-reg cases - the RMW case can be a lot slower but we don't model that at this level llvm-svn: 346683
* [CostModel][X86] SK_ExtractSubvector is cheap if the (legal) subvector is ↵Simon Pilgrim2018-11-121-5/+13
| | | | | | aligned within the source vector llvm-svn: 346664