summaryrefslogtreecommitdiffstats
path: root/llvm/lib
Commit message (Collapse)AuthorAgeFilesLines
* [InstCombine] refine UB-handling in shuffle-binop transformSanjay Patel2018-06-041-14/+14
| | | | | | | | | | | | | | | | | | As noted in rL333782, we can be both better for optimization and safer with this transform: BinOp (shuffle V1, Mask), C --> shuffle (BinOp V1, NewC), Mask The only potentially unsafe-to-speculate binops are integer div/rem. All other binops are always safe (although I don't see a way to assert that in code here). For opcodes like shifts that can produce poison, it can't matter here because we know the lanes with undef are dropped by the subsequent shuffle. Differential Revision: https://reviews.llvm.org/D47686 llvm-svn: 333962
* Move Analysis/Utils/Local.h back to TransformsDavid Blaikie2018-06-0483-83/+83
| | | | | | | | | | Review feedback from r328165. Split out just the one function from the file that's used by Analysis. (As chandlerc pointed out, the original change only moved the header and not the implementation anyway - which was fine for the one function that was used (since it's a template/inlined in the header) but not in general) llvm-svn: 333954
* [MachineOutliner] NFC - Move intermediate data structures to MachineOutliner.hJessica Paquette2018-06-045-279/+129
| | | | | | | | | | | | | | | | | | | | | This is setting up to fix bug 37573 cleanly. This moves data structures that are technically both used in some way by the target and the general-purpose outlining algorithm into MachineOutliner.h. In particular, the `Candidate` class is of importance. Before, the outliner passed the locations of `Candidates` to the target, which would then make some decisions about the prospective outlined function. This change allows us to just pass `Candidates` along to the target. This will allow the target to discard `Candidates` that would be considered unsafe before cost calculation. Thus, we will be able to remove the unsafe candidates described in the bug without resorting to torching the entire prospective function. Also, as a side-effect, it makes the outliner a bit cleaner. https://bugs.llvm.org/show_bug.cgi?id=37573 llvm-svn: 333952
* [X86][ELF][CET] Adding the .note.gnu.property ELF section in X86Alexander Ivchenko2018-06-041-0/+38
| | | | | | | | | | | | | | In preparation for the proposed linker ABI changes (https://github.com/hjl-tools/linux-abi/wiki/linux-abi-draft.pdf, https://github.com/hjl-tools/x86-psABI/wiki/x86-64-psABI-cet.pdf), this patch enables emission of the .note.gnu.property section to ELF object files when building CET-enabled modules. patch by mike.dvoretsky Differential Revision: https://reviews.llvm.org/D47145 llvm-svn: 333951
* [CodeGen] Always update divergence in SelectionDAG::UpdateNodeOperandsScott Linder2018-06-041-0/+2
| | | | | | | | Some overloads failed to update divergence. Differential Revision: https://reviews.llvm.org/D47148 llvm-svn: 333947
* [Support] Add functions that operate on native file handles on Windows.Zachary Turner2018-06-042-30/+75
| | | | | | | | | | | | | | | | | | | | Windows' CRT has a limit of 512 open file descriptors, and fds which are generated by converting a HANDLE via _get_osfhandle count towards this limit as well. Regardless, often you find yourself marshalling back and forth between native HANDLE objects and fds anyway. If we know from the getgo that we're going to need to work directly with the handle, we can cut out the marshalling layer while also not contributing to filling up the CRT's very limited handle table. On Unix these functions just delegate directly to the existing set of functions since an fd *is* the native file type. It would be nice, very long term, if we could convert most uses of fds to file_t. Differential Revision: https://reviews.llvm.org/D47688 llvm-svn: 333945
* [DAGcombine] Teach the combiner about -a = ~a + 1Amaury Sechet2018-06-041-1/+60
| | | | | | | | | | | | Summary: This include variant for add, uaddo and addcarry. usubo and subcarry require the carry to be flipped to preserve semantic, but we chose to do the transform anyway in that case as to push the transform down the carry chain. Reviewers: efriedma, spatel, RKSimon, zvi, bkramer Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D46505 llvm-svn: 333943
* Get rid of SETCCEAmaury Sechet2018-06-045-52/+12
| | | | | | | | | | | | Summary: It has been deprecated in favor of SETCCCARRY for a year now and isn't used by any in tree backend. Reviewers: efriedma, craig.topper, dblaikie, bkramer Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D47685 llvm-svn: 333939
* In thin and full LTO + CFI, direct function calls may go through jump tableDmitry Mikulin2018-06-042-45/+97
| | | | | | | | | | entries to reach the target. Since these calls don't require type checks, we can short-circuit them to their real targets, except in cases when they can be pre-empted. Differential Revision: https://reviews.llvm.org/D46326 llvm-svn: 333937
* [X86] Don't pass ParitySrc array into isAddSubOrSubAddMask. Instead use a ↵Craig Topper2018-06-041-8/+10
| | | | | | | | bool output parameter to get the real piece of info we care about. NFC The ParitySrc array is more of an implementation detail. A single bool to get the final parity is sufficient. llvm-svn: 333935
* [AMDGPU] Small refactoring in the schedulerStanislav Mekhanoshin2018-06-041-18/+3
| | | | | | | | After last changes some code can be simplified. Differential Revision: https://reviews.llvm.org/D47661 llvm-svn: 333934
* [AMDGPU] Factored out common part of GCNRPTracker::reset()Stanislav Mekhanoshin2018-06-042-11/+17
| | | | | | Differential Revision: https://reviews.llvm.org/D47664 llvm-svn: 333931
* [MachO] Add out-of-bounds check to MachOObjectFile.cppSam Clegg2018-06-041-0/+1
| | | | | | | | This is a followup to rL333496. Differential Revision: https://reviews.llvm.org/D47544 llvm-svn: 333929
* [WebAssembly] Fix .td files after rL333900Sam Clegg2018-06-043-33/+33
| | | | | | Differential Revision: https://reviews.llvm.org/D47727 llvm-svn: 333928
* [ValueTracking] Match select abs pattern when there's an sext involvedJohn Brawn2018-06-041-6/+18
| | | | | | | | | | | | | | When checking a select to see if it matches an abs, allow the true/false values to be a sign-extension of the comparison value instead of requiring that they're directly the comparison value, as all the comparison cares about is the sign of the value. This fixes a regression due to r333702, where we were no longer generating ctlz due to isKnownNonNegative failing to match such a pattern. Differential Revision: https://reviews.llvm.org/D47631 llvm-svn: 333927
* [AMDGPU][Waitcnt] Fix handling of flat instrsMark Searles2018-06-042-6/+14
| | | | | | | | On GFX9 and earlier, flat memory ops may decrement VMCNT out-of-order as well as LGKMCNT out-of-order. Differential Revision: https://reviews.llvm.org/D46616 llvm-svn: 333926
* [X86] Only accept const SelectionDAG to ↵Simon Pilgrim2018-06-041-2/+2
| | | | | | | | resolveTargetShuffleInputs/getFauxShuffleMask These methods should only be using SelectionDAG for analysis (known/sign bits etc), not node creation. llvm-svn: 333925
* [NVPTX] Delete dead code from the AsmPrinter.Benjamin Kramer2018-06-042-142/+0
| | | | llvm-svn: 333924
* [RFC][patch 3/3] Add support for variant scheduling classes in llvm-mca.Andrea Di Biagio2018-06-042-1/+37
| | | | | | | | | | | | | | | | | | | | | | | | This patch is the last of a sequence of three patches related to LLVM-dev RFC "MC support for variant scheduling classes". http://lists.llvm.org/pipermail/llvm-dev/2018-May/123181.html This fixes PR36672. The main goal of this patch is to teach llvm-mca how to solve variant scheduling classes. This patch does that, plus it adds new variant scheduling classes to the BtVer2 scheduling model to identify so-called zero-idioms (i.e. so-called dependency breaking instructions that are known to generate zero, and that are optimized out in hardware at register renaming stage). Without the BtVer2 change, this patch would not have had any meaningful tests. This patch is effectively the union of two changes: 1) a change that teaches llvm-mca how to resolve variant scheduling classes. 2) a change to the BtVer2 scheduling model that allows us to special-case packed XOR zero-idioms (this partially fixes PR36671). Differential Revision: https://reviews.llvm.org/D47374 llvm-svn: 333909
* [SelectionDAG] Add missing closing parentheses in comments, NFCKrzysztof Parzyszek2018-06-041-6/+6
| | | | llvm-svn: 333907
* AMDGPU: Make various NamedOperands upper caseNicolai Haehnle2018-06-044-43/+43
| | | | | | | | | | | | | | | | Summary: Avoid name clashes with the corresponding bit fields in the instruction encoding. Change-Id: Id1644e703e976e78f7af93788d9f44cb48c3251f Reviewers: arsenm, rampitec, kzhuravl Subscribers: wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D47433 llvm-svn: 333905
* TableGen: Streamline the semantics of NAMENicolai Haehnle2018-06-048-446/+417
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: The new rules are straightforward. The main rules to keep in mind are: 1. NAME is an implicit template argument of class and multiclass, and will be substituted by the name of the instantiating def/defm. 2. The name of a def/defm in a multiclass must contain a reference to NAME. If such a reference is not present, it is automatically prepended. And for some additional subtleties, consider these: 3. defm with no name generates a unique name but has no special behavior otherwise. 4. def with no name generates an anonymous record, whose name is unique but undefined. In particular, the name won't contain a reference to NAME. Keeping rules 1&2 in mind should allow a predictable behavior of name resolution that is simple to follow. The old "rules" were rather surprising: sometimes (but not always), NAME would correspond to the name of the toplevel defm. They were also plain bonkers when you pushed them to their limits, as the old version of the TableGen test case shows. Having NAME correspond to the name of the toplevel defm introduces "spooky action at a distance" and breaks composability: refactoring the upper layers of a hierarchy of nested multiclass instantiations can cause unexpected breakage by changing the value of NAME at a lower level of the hierarchy. The new rules don't suffer from this problem. Some existing .td files have to be adjusted because they ended up depending on the details of the old implementation. Change-Id: I694095231565b30f563e6fd0417b41ee01a12589 Reviewers: tra, simon_tatham, craig.topper, MartinO, arsenm, javed.absar Subscribers: wdng, llvm-commits Differential Revision: https://reviews.llvm.org/D47430 llvm-svn: 333900
* [mips] Restore the availablity of trap for microMIPSSimon Dardis2018-06-041-0/+1
| | | | | | | | Reviewers: smaksimovic, atanasyan, abeserminji Differential Revision: https://reviews.llvm.org/D47584 llvm-svn: 333895
* [AArch64] Audit on rL333634 to fix FP16 Disasm BitPatternsLuke Geeson2018-06-042-2/+2
| | | | llvm-svn: 333879
* [AArch64][SVE] Fix range for DUP immediates (16bit elts)Sander de Smalen2018-06-042-3/+11
| | | | | | | | | | | | | | | For immediates used in DUP instructions that have the range -128 to 127, or a multiple of 256 in the range -32768 to 32512, one could argue that when the result element size is 16bits (.h), the value can be considered both signed and unsigned. Reviewers: rengolin, fhahn, SjoerdMeijer, samparker, javed.absar Reviewed By: fhahn Differential Revision: https://reviews.llvm.org/D47619 llvm-svn: 333873
* [AArch64][SVE] Asm: Print indexed element 0 as FPR.Sander de Smalen2018-06-045-0/+67
| | | | | | | | | | | | | | | | | | | | Print the first indexed element as a FP register, for example: mov z0.d, z1.d[0] Is now printed as: mov z0.d, d1 Next to printing, this patch also adds aliases to parse 'mov z0.d, d1'. Reviewers: rengolin, fhahn, samparker, SjoerdMeijer, javed.absar Reviewed By: fhahn Differential Revision: https://reviews.llvm.org/D47571 llvm-svn: 333872
* [AArch64][SVE] Asm: Support for indexed DUP instructions.Sander de Smalen2018-06-044-71/+127
| | | | | | | | | | | | | | | | | | | | Unpredicated copy of indexed SVE element to SVE vector, along with MOV-aliases. For example: dup z0.h, z1.h[0] duplicates the first 16-bit element from z1 to all elements in the result vector z0. Reviewers: rengolin, fhahn, samparker, SjoerdMeijer, javed.absar Reviewed By: SjoerdMeijer Differential Revision: https://reviews.llvm.org/D47570 llvm-svn: 333871
* [AArch64][SVE] Asm: Support for FCPY immediate instructions.Sander de Smalen2018-06-042-2/+43
| | | | | | | | | | | | | Predicated copy of floating-point immediate value to SVE vector, along with MOV-aliases. Reviewers: rengolin, fhahn, samparker, SjoerdMeijer, javed.absar Reviewed By: javed.absar Differential Revision: https://reviews.llvm.org/D47518 llvm-svn: 333869
* [AArch64][SVE] Asm: Support for CPY immediate instructionsSander de Smalen2018-06-042-0/+62
| | | | | | | | | | | | | Predicated copy of possibly shifted immediate value into SVE vector, along with MOV-aliases. Reviewers: rengolin, fhahn, samparker, SjoerdMeijer, javed.absar Reviewed By: fhahn Differential Revision: https://reviews.llvm.org/D47517 llvm-svn: 333868
* [InstCombine] Fix div handlingSerguei Katkov2018-06-041-2/+2
| | | | | | | | | | | | | When we optimize select basing on fact that div by 0 is undef we should not traverse the instruction which are not guaranteed to transfer execution to next instruction. Guard intrinsic is an example. Reviewers: spatel, craig.topper Reviewed By: spatel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D47576 llvm-svn: 333864
* [Debugify] Don't apply DI before the bitcode writer passVedant Kumar2018-06-041-0/+4
| | | | | | | | | | | | Applying synthetic debug info before the bitcode writer pass has no testing-related purpose. This commit prevents that from happening. It also adds tests which check that IR produced with/without -debugify-each enabled is identical after stripping. This makes it possible to check that individual passes (or full pipelines) are invariant to debug info. llvm-svn: 333861
* [X86] Remove and autoupgrade masked avx512vnni intrinsics using the unmasked ↵Craig Topper2018-06-032-26/+68
| | | | | | intrinsics and select instructions. llvm-svn: 333857
* [ORC] Add a constructor to create an IRMaterializationUnit from a module andLang Hames2018-06-031-0/+6
| | | | | | | | | | pre-existing SymbolFlags and SymbolToDefinition maps. This constructor is useful when delegating work from an existing IRMaterialiaztionUnit to a new one, as it avoids the cost of re-computing these maps. llvm-svn: 333852
* [InstCombine] improve sub with bool foldsSanjay Patel2018-06-031-13/+14
| | | | | | | | There's a patchwork of existing transforms trying to handle these cases, but as seen in the changed test, we weren't catching them all. llvm-svn: 333845
* Remove SETCCE use from Lanai's backendAmaury Sechet2018-06-032-17/+0
| | | | | | | | | | | | Summary: This creates a small perf regression, but after talking with Jacques Pienaar, he was good with it to get things moving toward removng SETCCE. Reviewers: jpienaar, bryant Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D47626 llvm-svn: 333838
* [NEON] Support VLD1xN intrinsics in AArch32 mode (LLVM part)Ivan A. Kosarev2018-06-025-7/+202
| | | | | | | | | We currently support them only in AArch64. The NEON Reference, however, says they are 'ARMv7, ARMv8' intrinsics. Differential Revision: https://reviews.llvm.org/D47120 llvm-svn: 333825
* Revert r333819 "[NEON] Support VLD1xN intrinsics in AArch32 mode (Clang part)"Ivan A. Kosarev2018-06-025-202/+7
| | | | | | | | The LLVM part was committed instead of the Clang part. Differential Revision: https://reviews.llvm.org/D47121 llvm-svn: 333824
* [MC] Add assembler support for .cg_profile.Michael J. Spencer2018-06-028-0/+125
| | | | | | | | | | | | | | | Object FIle Representation At codegen time this is emitted into the ELF file a pair of symbol indices and a weight. In assembly it looks like: .cg_profile a, b, 32 .cg_profile freq, a, 11 .cg_profile freq, b, 20 When writing an ELF file these are put into a SHT_LLVM_CALL_GRAPH_PROFILE (0x6fff4c02) section as (uint32_t, uint32_t, uint64_t) tuples as (from symbol index, to symbol index, weight). Differential Revision: https://reviews.llvm.org/D44965 llvm-svn: 333823
* [X86] Add tied source operand to AVX5124FMAPS and AVX5124VNNIW instructions.Craig Topper2018-06-021-20/+33
| | | | | | This doesn't affect the assembly or disassembly, but is more accurate. llvm-svn: 333822
* [X86] Fix warning message for AVX5124FMAPS and AVX5124VNNIW instructions in ↵Craig Topper2018-06-021-2/+2
| | | | | | | | the assembly parser. The caret was positioned on the wrong operand. It's too hard to get right so just put the caret at the beginning of the instruction. llvm-svn: 333821
* [InstCombine] call simplify before trying vector foldsSanjay Patel2018-06-026-76/+58
| | | | | | | | | | | | | | | | | | | | As noted in the review thread for rL333782, we could have made a bug harder to hit if we were simplifying instructions before trying other folds. The shuffle transform in question isn't ever a simplification; it's just a canonicalization. So I've renamed that to make that clearer. This is NFCI at this point, but I've regenerated the test file to show the cosmetic value naming difference of using instcombine's RAUW vs. the builder. Possible follow-ups: 1. Move reassociation folds after simplifies too. 2. Refactor common code; we shouldn't have so much repetition. llvm-svn: 333820
* [NEON] Support VLD1xN intrinsics in AArch32 mode (Clang part)Ivan A. Kosarev2018-06-025-7/+202
| | | | | | | | | We currently support them only in AArch64. The NEON Reference, however, says they are 'ARMv7, ARMv8' intrinsics. Differential Revision: https://reviews.llvm.org/D47121 llvm-svn: 333819
* [Support] Remove unused raw_ostream::handle whose anchor role was superseded ↵Fangrui Song2018-06-022-4/+1
| | | | | | by anchor() llvm-svn: 333817
* [X86] Add encoding information for the AVX5124FMAPS and AVX5124VNNIW ↵Craig Topper2018-06-023-1/+78
| | | | | | | | | | instructions so they can be assembled and disassembled. These instructions are unusual in that they operate on 4 consecutive registers so supporting them in codegen will be more difficult than normal. Includes an assembler check to warn if the source register is not the first register of a 4 register group. llvm-svn: 333812
* [PM/LoopUnswitch] Fix how the cloned loops are handled when updating analyses.Chandler Carruth2018-06-021-44/+31
| | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: I noticed this issue because we didn't put the primary cloned loop into the `NonChildClonedLoops` vector and so never iterated on it. Once I fixed that, it made it clear why I had to do a really complicated and unnecesasry dance when updating the loops to remain in canonical form -- I was unwittingly working around the fact that the primary cloned loop wasn't in the expected list of cloned loops. Doh! Now that we include it in this vector, we don't need to return it and we can consolidate the update logic as we correctly have a single place where it can be handled. I've just added a test for the iteration order aspect as every time I changed the update logic partially or incorrectly here, an existing test failed and caught it so that seems well covered (which is also evidenced by the extensive working around of this missing update). Reviewers: asbirlea, sanjoy Subscribers: mcrosier, hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D47647 llvm-svn: 333811
* [DebugInfo] Refactoring DIType::setFlags to DIType::cloneWithFlags, NFCRoman Tereshin2018-06-011-5/+9
| | | | | | | | | | | | | | | | | | and using the latter in DIBuilder::createArtificialType and DIBuilder::createObjectPointerType methods as well as introducing mirroring DISubprogram::cloneWithFlags and DIBuilder::createArtificialSubprogram methods. The primary goal here is to add createArtificialSubprogram to support a pass downstream while keeping the method consistent with the existing ones and making sure we don't encourage changing already created DI-nodes. Reviewed By: aprantl Differential Revision: https://reviews.llvm.org/D47615 llvm-svn: 333806
* [X86] Do something sensible when an expand load intrinsic is passed a 0 mask.Craig Topper2018-06-011-1/+1
| | | | | | Previously we just returned undef, but really we should be returning the pass thru input. We also need to make sure we preserve the chain output that the original intrinsic node had to maintain connectivity in the DAG. So we should just return the incoming chain as the output chain. llvm-svn: 333804
* Add a debug dump for DbgValueHistoryMapVedant Kumar2018-06-013-0/+37
| | | | | | | | | This makes it easier to inspect the results of DbgValueHistoryCalculator. Differential Revision: https://reviews.llvm.org/D47663 llvm-svn: 333801
* [X86] Add isel patterns to use vexpand with zero masking when the passthru ↵Craig Topper2018-06-011-0/+4
| | | | | | value is a zero vector. llvm-svn: 333800
* Move some function declarations out of WindowsSupport.hZachary Turner2018-06-016-14/+5
| | | | | | | | | | | | | | | | | | The idea behind WindowsSupport.h is that it's in the source directory so that windows.h'isms don't leak out into the larger LLVM project. To that end, any symbol that references a symbol from windows.h must be in this private header, and not in a public header. However, we had some useful utility functions in WindowsSupport.h which have no dependency on the Windows API, but still only make sense on Windows. Those functions should be usable outside of Support since there is no risk of causing a windows.h leak. Although this introduces some preprocessor logic in some header files, It's not too egregious and it's better than the alternative of duplicating a ton of code. Differential Revision: https://reviews.llvm.org/D47662 llvm-svn: 333798
OpenPOWER on IntegriCloud