path: root/llvm/lib
Commit message (Author, Age; Files, Lines)
...
* [X86] Don't pass ParitySrc array into isAddSubOrSubAddMask. Instead use a bool output parameter to get the real piece of info we care about. NFC (Craig Topper, 2018-06-04; 1 file, -8/+10)
  The ParitySrc array is more of an implementation detail. A single bool to get the final parity is sufficient.
  llvm-svn: 333935
* [AMDGPU] Small refactoring in the scheduler (Stanislav Mekhanoshin, 2018-06-04; 1 file, -18/+3)
  After the latest changes, some code can be simplified.
  Differential Revision: https://reviews.llvm.org/D47661
  llvm-svn: 333934
* [AMDGPU] Factored out common part of GCNRPTracker::reset() (Stanislav Mekhanoshin, 2018-06-04; 2 files, -11/+17)
  Differential Revision: https://reviews.llvm.org/D47664
  llvm-svn: 333931
* [MachO] Add out-of-bounds check to MachOObjectFile.cpp (Sam Clegg, 2018-06-04; 1 file, -0/+1)
  This is a followup to rL333496.
  Differential Revision: https://reviews.llvm.org/D47544
  llvm-svn: 333929
* [WebAssembly] Fix .td files after rL333900 (Sam Clegg, 2018-06-04; 3 files, -33/+33)
  Differential Revision: https://reviews.llvm.org/D47727
  llvm-svn: 333928
* [ValueTracking] Match select abs pattern when there's an sext involved (John Brawn, 2018-06-04; 1 file, -6/+18)
  When checking a select to see if it matches an abs, allow the true/false values to be a sign-extension of the comparison value instead of requiring that they're directly the comparison value, as all the comparison cares about is the sign of the value.
  This fixes a regression due to r333702, where we were no longer generating ctlz due to isKnownNonNegative failing to match such a pattern.
  Differential Revision: https://reviews.llvm.org/D47631
  llvm-svn: 333927
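As a rough source-level illustration of the pattern described above (an assumption for illustration only, not code from the patch or its tests), the function below compares the narrow value while both arms of the conditional use its sign-extension. The name `absViaSext` is hypothetical.

```cpp
#include <cassert>  // only needed by callers that want runtime checks
#include <cstdint>

// Hypothetical example, not taken from the patch: the comparison inspects the
// 32-bit value X, while both arms of the select use X sign-extended to 64
// bits. Only the sign of X matters to the comparison, so this is still an abs.
constexpr std::int64_t absViaSext(std::int32_t X) {
  const std::int64_t Wide = X;  // implicit sign-extension (sext i32 to i64)
  return X < 0 ? -Wide : Wide;  // select (X < 0), -sext(X), sext(X)
}
```

Because the negation happens in 64 bits, the result is non-negative for every 32-bit input, including the most negative one; that is the kind of fact a known-non-negative query can exploit once the select is recognized as an abs.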
* [AMDGPU][Waitcnt] Fix handling of flat instrs (Mark Searles, 2018-06-04; 2 files, -6/+14)
  On GFX9 and earlier, flat memory ops may decrement VMCNT out-of-order as well as LGKMCNT out-of-order.
  Differential Revision: https://reviews.llvm.org/D46616
  llvm-svn: 333926
* [X86] Only accept const SelectionDAG to resolveTargetShuffleInputs/getFauxShuffleMask (Simon Pilgrim, 2018-06-04; 1 file, -2/+2)
  These methods should only be using SelectionDAG for analysis (known/sign bits etc.), not node creation.
  llvm-svn: 333925
* [NVPTX] Delete dead code from the AsmPrinter. (Benjamin Kramer, 2018-06-04; 2 files, -142/+0)
  llvm-svn: 333924
* [RFC][patch 3/3] Add support for variant scheduling classes in llvm-mca. (Andrea Di Biagio, 2018-06-04; 2 files, -1/+37)
  This patch is the last of a sequence of three patches related to LLVM-dev RFC "MC support for variant scheduling classes".
  http://lists.llvm.org/pipermail/llvm-dev/2018-May/123181.html
  This fixes PR36672.
  The main goal of this patch is to teach llvm-mca how to resolve variant scheduling classes. This patch does that, plus it adds new variant scheduling classes to the BtVer2 scheduling model to identify so-called zero-idioms (i.e. dependency-breaking instructions that are known to generate zero, and that are optimized out in hardware at the register renaming stage).
  Without the BtVer2 change, this patch would not have had any meaningful tests.
  This patch is effectively the union of two changes:
  1) a change that teaches llvm-mca how to resolve variant scheduling classes.
  2) a change to the BtVer2 scheduling model that allows us to special-case packed XOR zero-idioms (this partially fixes PR36671).
  Differential Revision: https://reviews.llvm.org/D47374
  llvm-svn: 333909
* [SelectionDAG] Add missing closing parentheses in comments, NFC (Krzysztof Parzyszek, 2018-06-04; 1 file, -6/+6)
  llvm-svn: 333907
* AMDGPU: Make various NamedOperands upper case (Nicolai Haehnle, 2018-06-04; 4 files, -43/+43)
  Summary: Avoid name clashes with the corresponding bit fields in the instruction encoding.
  Change-Id: Id1644e703e976e78f7af93788d9f44cb48c3251f
  Reviewers: arsenm, rampitec, kzhuravl
  Subscribers: wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits
  Differential Revision: https://reviews.llvm.org/D47433
  llvm-svn: 333905
* TableGen: Streamline the semantics of NAME (Nicolai Haehnle, 2018-06-04; 8 files, -446/+417)
  Summary: The new rules are straightforward. The main rules to keep in mind are:
  1. NAME is an implicit template argument of class and multiclass, and will be substituted by the name of the instantiating def/defm.
  2. The name of a def/defm in a multiclass must contain a reference to NAME. If such a reference is not present, it is automatically prepended.
  And for some additional subtleties, consider these:
  3. defm with no name generates a unique name but has no special behavior otherwise.
  4. def with no name generates an anonymous record, whose name is unique but undefined. In particular, the name won't contain a reference to NAME.
  Keeping rules 1&2 in mind should allow a predictable behavior of name resolution that is simple to follow.
  The old "rules" were rather surprising: sometimes (but not always), NAME would correspond to the name of the toplevel defm. They were also plain bonkers when you pushed them to their limits, as the old version of the TableGen test case shows.
  Having NAME correspond to the name of the toplevel defm introduces "spooky action at a distance" and breaks composability: refactoring the upper layers of a hierarchy of nested multiclass instantiations can cause unexpected breakage by changing the value of NAME at a lower level of the hierarchy. The new rules don't suffer from this problem.
  Some existing .td files have to be adjusted because they ended up depending on the details of the old implementation.
  Change-Id: I694095231565b30f563e6fd0417b41ee01a12589
  Reviewers: tra, simon_tatham, craig.topper, MartinO, arsenm, javed.absar
  Subscribers: wdng, llvm-commits
  Differential Revision: https://reviews.llvm.org/D47430
  llvm-svn: 333900
* [mips] Restore the availability of trap for microMIPS (Simon Dardis, 2018-06-04; 1 file, -0/+1)
  Reviewers: smaksimovic, atanasyan, abeserminji
  Differential Revision: https://reviews.llvm.org/D47584
  llvm-svn: 333895
* [AArch64] Audit on rL333634 to fix FP16 Disasm BitPatterns (Luke Geeson, 2018-06-04; 2 files, -2/+2)
  llvm-svn: 333879
* [AArch64][SVE] Fix range for DUP immediates (16bit elts) (Sander de Smalen, 2018-06-04; 2 files, -3/+11)
  For immediates used in DUP instructions that have the range -128 to 127, or a multiple of 256 in the range -32768 to 32512, one could argue that when the result element size is 16 bits (.h), the value can be considered both signed and unsigned.
  Reviewers: rengolin, fhahn, SjoerdMeijer, samparker, javed.absar
  Reviewed By: fhahn
  Differential Revision: https://reviews.llvm.org/D47619
  llvm-svn: 333873
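The immediate range quoted above can be written as a small predicate. This is a minimal sketch under stated assumptions: both function names are hypothetical, and the 16-bit variant reflects one reading of the commit message (an unsigned spelling such as 65280 names the same 16-bit pattern as -256), not the assembler's actual implementation.

```cpp
#include <cassert>  // only needed by callers that want runtime checks
#include <cstdint>

// Hypothetical helper: true if Imm lies in -128..127, or is a multiple of 256
// in -32768..32512 (the range quoted in the commit message).
constexpr bool isDupImmediateInRange(std::int64_t Imm) {
  if (Imm >= -128 && Imm <= 127)
    return true;
  return Imm >= -32768 && Imm <= 32512 && Imm % 256 == 0;
}

// Hypothetical variant for 16-bit elements (an assumption about the intent):
// a value in 32768..65535 is reinterpreted as the signed 16-bit value with the
// same bit pattern before the range check is applied.
constexpr bool isDupImmediateFor16BitElts(std::int64_t Imm) {
  if (Imm > 32767 && Imm <= 65535)
    Imm -= 65536;
  return isDupImmediateInRange(Imm);
}
```

Under this reading, 65280 (0xff00) is accepted for .h elements because it has the same bit pattern as -256, while 65281 is still rejected.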
* [AArch64][SVE] Asm: Print indexed element 0 as FPR. (Sander de Smalen, 2018-06-04; 5 files, -0/+67)
  Print the first indexed element as a FP register, for example:
    mov z0.d, z1.d[0]
  Is now printed as:
    mov z0.d, d1
  Next to printing, this patch also adds aliases to parse 'mov z0.d, d1'.
  Reviewers: rengolin, fhahn, samparker, SjoerdMeijer, javed.absar
  Reviewed By: fhahn
  Differential Revision: https://reviews.llvm.org/D47571
  llvm-svn: 333872
* [AArch64][SVE] Asm: Support for indexed DUP instructions. (Sander de Smalen, 2018-06-04; 4 files, -71/+127)
  Unpredicated copy of indexed SVE element to SVE vector, along with MOV-aliases. For example:
    dup z0.h, z1.h[0]
  duplicates the first 16-bit element from z1 to all elements in the result vector z0.
  Reviewers: rengolin, fhahn, samparker, SjoerdMeijer, javed.absar
  Reviewed By: SjoerdMeijer
  Differential Revision: https://reviews.llvm.org/D47570
  llvm-svn: 333871
* [AArch64][SVE] Asm: Support for FCPY immediate instructions. (Sander de Smalen, 2018-06-04; 2 files, -2/+43)
  Predicated copy of floating-point immediate value to SVE vector, along with MOV-aliases.
  Reviewers: rengolin, fhahn, samparker, SjoerdMeijer, javed.absar
  Reviewed By: javed.absar
  Differential Revision: https://reviews.llvm.org/D47518
  llvm-svn: 333869
* [AArch64][SVE] Asm: Support for CPY immediate instructions (Sander de Smalen, 2018-06-04; 2 files, -0/+62)
  Predicated copy of possibly shifted immediate value into SVE vector, along with MOV-aliases.
  Reviewers: rengolin, fhahn, samparker, SjoerdMeijer, javed.absar
  Reviewed By: fhahn
  Differential Revision: https://reviews.llvm.org/D47517
  llvm-svn: 333868
* [InstCombine] Fix div handling (Serguei Katkov, 2018-06-04; 1 file, -2/+2)
  When we optimize a select based on the fact that division by 0 is undefined, we should not traverse instructions that are not guaranteed to transfer execution to the next instruction. The guard intrinsic is an example.
  Reviewers: spatel, craig.topper
  Reviewed By: spatel
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D47576
  llvm-svn: 333864
* [Debugify] Don't apply DI before the bitcode writer pass (Vedant Kumar, 2018-06-04; 1 file, -0/+4)
  Applying synthetic debug info before the bitcode writer pass has no testing-related purpose. This commit prevents that from happening.
  It also adds tests which check that IR produced with/without -debugify-each enabled is identical after stripping. This makes it possible to check that individual passes (or full pipelines) are invariant to debug info.
  llvm-svn: 333861
* [X86] Remove and autoupgrade masked avx512vnni intrinsics using the unmasked intrinsics and select instructions. (Craig Topper, 2018-06-03; 2 files, -26/+68)
  llvm-svn: 333857
* [ORC] Add a constructor to create an IRMaterializationUnit from a module and (Lang Hames, 2018-06-03; 1 file, -0/+6)
  pre-existing SymbolFlags and SymbolToDefinition maps.
  This constructor is useful when delegating work from an existing IRMaterializationUnit to a new one, as it avoids the cost of re-computing these maps.
  llvm-svn: 333852
* [InstCombine] improve sub with bool folds (Sanjay Patel, 2018-06-03; 1 file, -13/+14)
  There's a patchwork of existing transforms trying to handle these cases, but as seen in the changed test, we weren't catching them all.
  llvm-svn: 333845
* Remove SETCCE use from Lanai's backend (Amaury Sechet, 2018-06-03; 2 files, -17/+0)
  Summary: This creates a small perf regression, but after talking with Jacques Pienaar, he was fine with it in order to get things moving toward removing SETCCE.
  Reviewers: jpienaar, bryant
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D47626
  llvm-svn: 333838
* [NEON] Support VLD1xN intrinsics in AArch32 mode (LLVM part) (Ivan A. Kosarev, 2018-06-02; 5 files, -7/+202)
  We currently support them only in AArch64. The NEON Reference, however, says they are 'ARMv7, ARMv8' intrinsics.
  Differential Revision: https://reviews.llvm.org/D47120
  llvm-svn: 333825
* Revert r333819 "[NEON] Support VLD1xN intrinsics in AArch32 mode (Clang part)" (Ivan A. Kosarev, 2018-06-02; 5 files, -202/+7)
  The LLVM part was committed instead of the Clang part.
  Differential Revision: https://reviews.llvm.org/D47121
  llvm-svn: 333824
* [MC] Add assembler support for .cg_profile. (Michael J. Spencer, 2018-06-02; 8 files, -0/+125)
  Object File Representation
  At codegen time this is emitted into the ELF file as a pair of symbol indices and a weight. In assembly it looks like:
    .cg_profile a, b, 32
    .cg_profile freq, a, 11
    .cg_profile freq, b, 20
  When writing an ELF file these are put into a SHT_LLVM_CALL_GRAPH_PROFILE (0x6fff4c02) section as (uint32_t, uint32_t, uint64_t) tuples of (from symbol index, to symbol index, weight).
  Differential Revision: https://reviews.llvm.org/D44965
  llvm-svn: 333823
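The tuple layout described above can be mirrored by a plain struct. The following is only a sketch: `CGProfileEntry`, `exampleCGProfile`, and `sectionSizeInBytes` are hypothetical names, the symbol indices are invented for the three directives shown in the commit message, and byte order and the actual writer code are not covered.

```cpp
#include <cassert>  // only needed by callers that want runtime checks
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical in-memory mirror of one (uint32_t, uint32_t, uint64_t) record.
struct CGProfileEntry {
  std::uint32_t FromSymbolIndex;
  std::uint32_t ToSymbolIndex;
  std::uint64_t Weight;
};
static_assert(sizeof(CGProfileEntry) == 16, "4 + 4 + 8 bytes, no padding");

// The three example directives, using made-up symbol table indices
// (a = 1, b = 2, freq = 3); real indices depend on the symbol table.
inline std::vector<CGProfileEntry> exampleCGProfile() {
  const std::uint32_t A = 1, B = 2, Freq = 3;
  return {{A, B, 32}, {Freq, A, 11}, {Freq, B, 20}};
}

// Size of the section payload if the records are stored back to back.
inline std::size_t sectionSizeInBytes(const std::vector<CGProfileEntry> &Entries) {
  return Entries.size() * sizeof(CGProfileEntry);
}
```

With three records of 16 bytes each, the payload for the example comes to 48 bytes.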
* [X86] Add tied source operand to AVX5124FMAPS and AVX5124VNNIW instructions. (Craig Topper, 2018-06-02; 1 file, -20/+33)
  This doesn't affect the assembly or disassembly, but is more accurate.
  llvm-svn: 333822
* [X86] Fix warning message for AVX5124FMAPS and AVX5124VNNIW instructions in the assembly parser. (Craig Topper, 2018-06-02; 1 file, -2/+2)
  The caret was positioned on the wrong operand. It's too hard to get right so just put the caret at the beginning of the instruction.
  llvm-svn: 333821
* [InstCombine] call simplify before trying vector folds (Sanjay Patel, 2018-06-02; 6 files, -76/+58)
  As noted in the review thread for rL333782, we could have made a bug harder to hit if we were simplifying instructions before trying other folds.
  The shuffle transform in question isn't ever a simplification; it's just a canonicalization. So I've renamed that to make that clearer.
  This is NFCI at this point, but I've regenerated the test file to show the cosmetic value naming difference of using instcombine's RAUW vs. the builder.
  Possible follow-ups:
  1. Move reassociation folds after simplifies too.
  2. Refactor common code; we shouldn't have so much repetition.
  llvm-svn: 333820
* [NEON] Support VLD1xN intrinsics in AArch32 mode (Clang part) (Ivan A. Kosarev, 2018-06-02; 5 files, -7/+202)
  We currently support them only in AArch64. The NEON Reference, however, says they are 'ARMv7, ARMv8' intrinsics.
  Differential Revision: https://reviews.llvm.org/D47121
  llvm-svn: 333819
* [Support] Remove unused raw_ostream::handle whose anchor role was superseded by anchor() (Fangrui Song, 2018-06-02; 2 files, -4/+1)
  llvm-svn: 333817
* [X86] Add encoding information for the AVX5124FMAPS and AVX5124VNNIW instructions so they can be assembled and disassembled. (Craig Topper, 2018-06-02; 3 files, -1/+78)
  These instructions are unusual in that they operate on 4 consecutive registers so supporting them in codegen will be more difficult than normal.
  Includes an assembler check to warn if the source register is not the first register of a 4 register group.
  llvm-svn: 333812
* [PM/LoopUnswitch] Fix how the cloned loops are handled when updating analyses. (Chandler Carruth, 2018-06-02; 1 file, -44/+31)
  Summary: I noticed this issue because we didn't put the primary cloned loop into the `NonChildClonedLoops` vector and so never iterated on it. Once I fixed that, it made it clear why I had to do a really complicated and unnecessary dance when updating the loops to remain in canonical form -- I was unwittingly working around the fact that the primary cloned loop wasn't in the expected list of cloned loops. Doh!
  Now that we include it in this vector, we don't need to return it and we can consolidate the update logic as we correctly have a single place where it can be handled.
  I've just added a test for the iteration order aspect as every time I changed the update logic partially or incorrectly here, an existing test failed and caught it so that seems well covered (which is also evidenced by the extensive working around of this missing update).
  Reviewers: asbirlea, sanjoy
  Subscribers: mcrosier, hiraditya, llvm-commits
  Differential Revision: https://reviews.llvm.org/D47647
  llvm-svn: 333811
* [DebugInfo] Refactoring DIType::setFlags to DIType::cloneWithFlags, NFC (Roman Tereshin, 2018-06-01; 1 file, -5/+9)
  and using the latter in DIBuilder::createArtificialType and DIBuilder::createObjectPointerType methods as well as introducing mirroring DISubprogram::cloneWithFlags and DIBuilder::createArtificialSubprogram methods.
  The primary goal here is to add createArtificialSubprogram to support a pass downstream while keeping the method consistent with the existing ones and making sure we don't encourage changing already created DI-nodes.
  Reviewed By: aprantl
  Differential Revision: https://reviews.llvm.org/D47615
  llvm-svn: 333806
* [X86] Do something sensible when an expand load intrinsic is passed a 0 mask. (Craig Topper, 2018-06-01; 1 file, -1/+1)
  Previously we just returned undef, but really we should be returning the pass thru input. We also need to make sure we preserve the chain output that the original intrinsic node had to maintain connectivity in the DAG. So we should just return the incoming chain as the output chain.
  llvm-svn: 333804
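To see why the pass-through input is the sensible result for an all-zero mask, a scalar model of an expand load helps. This is a sketch only: `expandLoad` is a hypothetical function that models the general expand-load semantics (consecutive memory elements go to the enabled lanes, disabled lanes keep the pass-through value); it is not the X86 lowering code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical scalar model of an expand load with up to 32 lanes.
// Bit i of Mask enables lane i. Enabled lanes receive consecutive elements
// from Mem; disabled lanes keep the corresponding PassThru element.
inline std::vector<int> expandLoad(const int *Mem, std::uint32_t Mask,
                                   const std::vector<int> &PassThru) {
  assert(PassThru.size() <= 32 && "model supports at most 32 lanes");
  std::vector<int> Result = PassThru;
  std::size_t Next = 0;  // index of the next memory element to consume
  for (std::size_t Lane = 0; Lane < Result.size(); ++Lane)
    if (Mask & (std::uint32_t{1} << Lane))
      Result[Lane] = Mem[Next++];
  return Result;
}
```

In this model a zero mask never reads memory and returns `PassThru` unchanged, which matches the behavior the commit message describes as correct.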
* Add a debug dump for DbgValueHistoryMap (Vedant Kumar, 2018-06-01; 3 files, -0/+37)
  This makes it easier to inspect the results of DbgValueHistoryCalculator.
  Differential Revision: https://reviews.llvm.org/D47663
  llvm-svn: 333801
* [X86] Add isel patterns to use vexpand with zero masking when the passthru value is a zero vector. (Craig Topper, 2018-06-01; 1 file, -0/+4)
  llvm-svn: 333800
* Move some function declarations out of WindowsSupport.h (Zachary Turner, 2018-06-01; 6 files, -14/+5)
  The idea behind WindowsSupport.h is that it's in the source directory so that windows.h'isms don't leak out into the larger LLVM project. To that end, any symbol that references a symbol from windows.h must be in this private header, and not in a public header.
  However, we had some useful utility functions in WindowsSupport.h which have no dependency on the Windows API, but still only make sense on Windows. Those functions should be usable outside of Support since there is no risk of causing a windows.h leak. Although this introduces some preprocessor logic in some header files, it's not too egregious and it's better than the alternative of duplicating a ton of code.
  Differential Revision: https://reviews.llvm.org/D47662
  llvm-svn: 333798
* [ConstantFold] Disallow folding vector geps into bitcasts (Karl-Johan Karlsson, 2018-06-01; 1 file, -1/+5)
  Summary: Getelementptr returns a vector of pointers, instead of a single address, when one or more of its arguments is a vector. In such a case it is not possible to simplify the expression by inserting a bitcast of operand(0) into the destination type, as it will create a bitcast between different sizes.
  Reviewers: majnemer, mkuper, mssimpso, spatel
  Reviewed By: spatel
  Subscribers: lebedev.ri, llvm-commits
  Differential Revision: https://reviews.llvm.org/D46379
  llvm-svn: 333783
* [InstCombine] fix vector shuffle transform to replace undef elements (PR37648) (Sanjay Patel, 2018-06-01; 1 file, -0/+16)
  This bug:
  https://bugs.llvm.org/show_bug.cgi?id=37648
  ...was created with the enhancement to this transform with rL332479.
  The urem test shows the disaster potential: any undef divisor lane makes the whole op undef.
  The test diffs show that vector demanded elements turns some of the potential, but not all, unused binop operands back into undef already.
  llvm-svn: 333782
* [mips] Support 64-bit offsets for lb/sb/ld/sd/lld ... instructions (Simon Atanasyan, 2018-06-01; 1 file, -53/+30)
  The `MipsAsmParser::loadImmediate` can load immediates of various sizes into a register. The idea of this change is to use `loadImmediate` in the `MipsAsmParser::expandMemInst` method to load the offset into a register and then call the required load/store instruction.
  The patch removes the separate `expandLoadInst` and `expandStoreInst` methods and does everything in the `expandMemInst` method to avoid code duplication.
  Differential Revision: https://reviews.llvm.org/D47316
  llvm-svn: 333774
* [mips] Extend list of relocations supported by the `.reloc` directive (Simon Atanasyan, 2018-06-01; 3 files, -1/+80)
  Supporting GOT- and TLS-related relocations in the `.reloc` directive is useful for testing various tools, such as a linker.
  llvm-svn: 333773
* [Hexagon] Avoid UB when shifting unsigned integer left by 32 (Krzysztof Parzyszek, 2018-06-01; 1 file, -3/+4)
  llvm-svn: 333771
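The log does not show the code that was changed, so the following is only a generic illustration of the hazard named in the subject line, with a hypothetical helper `lowBitMask`. In C++, shifting a 32-bit unsigned value left by 32 is undefined behavior because the shift count equals the width of the type; one common way around it is to do the shift in a wider type.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper, not the code touched by the commit: returns a mask with
// the low N bits set, for N in 0..32.
inline std::uint32_t lowBitMask(unsigned N) {
  assert(N <= 32 && "mask width must fit in 32 bits");
  // std::uint32_t{1} << 32 would be undefined behavior. Performing the shift
  // in 64 bits keeps the N == 32 case well defined.
  return static_cast<std::uint32_t>((std::uint64_t{1} << N) - 1);
}
```

For N equal to 32 the 64-bit intermediate is 0x100000000, so the subtraction and truncation yield 0xFFFFFFFF as intended.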
* [ThinLTOBitcodeWriter] Emit summaries for regular LTO modules (Vlad Tsyrklevich, 2018-06-01; 1 file, -4/+10)
  Summary: Emit summaries for bitcode modules that are only destined for the regular LTO portion of the build so they can participate in summary-based dead stripping.
  This change reduces the size of a nacl_helper build with cfi-icall enabled by 7%, removing the majority of the overhead due to enabling cfi-icall. The cfi-icall size increase was caused by compiling in lots of unused code and cfi-icall generating jumptable references to unused symbols that could no longer be removed by -Wl,-gc-sections. Increasing the visibility of summary-based dead stripping prevented jumptable entries being created for unused symbols from the regular LTO portion of the build.
  Reviewers: pcc
  Reviewed By: pcc
  Subscribers: dschuff, mehdi_amini, inglorion, eraman, llvm-commits, kcc
  Differential Revision: https://reviews.llvm.org/D47594
  llvm-svn: 333768
* [DAG] Avoid checking for consecutive stores in store merge. NFCI. (Nirav Dave, 2018-06-01; 1 file, -319/+340)
  llvm-svn: 333766
* [DAG] Simplify Expression. NFC. (Nirav Dave, 2018-06-01; 1 file, -9/+3)
  llvm-svn: 333765
* [DAG] Remove untriggerable check. NFCI. (Nirav Dave, 2018-06-01; 1 file, -10/+0)
  Candidate check precludes this check.
  llvm-svn: 333764