summaryrefslogtreecommitdiffstats
path: root/llvm/lib
Commit message (Collapse)AuthorAgeFilesLines
* [DAGCombiner] allow undef elts in vector fabs/fneg matchingSanjay Patel2018-10-071-1/+1
| | | | | | | | This change is proposed as a part of D44548, but we need this independently to avoid regressions from improved undef propagation in SimplifyDemandedVectorElts(). llvm-svn: 343940
* [DAGCombiner] shorten code for bitcast+fabs fold; NFCSanjay Patel2018-10-071-5/+2
| | | | llvm-svn: 343939
* [SelectionDAG] Respect multiple uses in SimplifyDemandedBits to ↵Simon Pilgrim2018-10-071-1/+1
| | | | | | | | | | | | SimplifyDemandedVectorElts simplification rL343913 was using SimplifyDemandedBits's original demanded mask instead of the adjusted 'NewMask' that accounts for multiple uses of the op (those variable names really need improving....). Annoyingly many of the test changes (back to pre-rL343913 state) are actually safe - but only because their multiple uses are all by PMULDQ/PMULUDQ. Thanks to Jan Vesely (@jvesely) for bisecting the bug. llvm-svn: 343935
* [LegalizeVectorOps] Make ExpandStrictFPOp return the result corresponding to ↵Craig Topper2018-10-071-1/+1
| | | | | | | | the result number of the SDValue passed in. It was always returning the chain which seems to be the result number of the SDValue in the lit tests we have. But I don't know if that's guaranteed. llvm-svn: 343933
* [IAI,LV] Avoid creating interleave-groups for predicated accesseDorit Nuzman2018-10-071-1/+3
| | | | | | | | | | | | | | | | | | | | | | This patch fixes PR39099. When strided loads are predicated, each of them will form an interleaved-group (with gaps). However, subsequent stages of vectorization (planning and transformation) assume that if a load is part of an Interleave-Group it is not predicated, resulting in wrong code - unmasked wide loads are created. The Interleaving Analysis does take care not to have conditional interleave groups of size > 1, but until we extend the planning and transformation stages to support masked-interleave-groups we should also avoid having them for size == 1. Reviewers: Ayal, hsaito, dcaballe, fhahn Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D52682 llvm-svn: 343931
* [ORC] Add a 'remove' method to JITDylib to remove symbols.Lang Hames2018-10-061-0/+68
| | | | | | | | Symbols can be removed provided that all are present in the JITDylib and none are currently in the materializing state. On success all requested symbols are removed. On failure an error is returned and no symbols are removed. llvm-svn: 343928
* [ORC] Pass symbol name to discard by const reference.Lang Hames2018-10-065-7/+8
| | | | | | This saves some unnecessary atomic ref-counting operations. llvm-svn: 343927
* [X86] getFauxShuffleMask - Handle undef + sentinel values in subvector insertionSimon Pilgrim2018-10-061-1/+3
| | | | llvm-svn: 343926
* [X86][AVX] Ensure resolveTargetShuffleInputs shuffle masks are the correct widthSimon Pilgrim2018-10-061-1/+2
| | | | | | Don't handle ZERO_EXTEND style shuffles until we support bitcasts. Found by inspection. llvm-svn: 343924
* [X86] combinePMULDQ - add op back to worklist if SimplifyDemandedBits ↵Simon Pilgrim2018-10-061-2/+6
| | | | | | | | succeeds on either operand Prevents missing other simplifications that may occur deep in the operand chain where CommitTargetLoweringOpt won't add the PMULDQ back to the worklist itself llvm-svn: 343922
* [X86][SSE] SimplifyDemandedVectorEltsForTargetNode - simplify PSHUFB masksSimon Pilgrim2018-10-061-0/+9
| | | | | | Attempt to simplify PSHUFB masks (even non-constant ones) - we should probably be able to simplify other variable shuffles as well as the need arises. llvm-svn: 343919
* [X86] Use the SimplifyDemandedBits wrappers where possible. NFCI.Simon Pilgrim2018-10-061-23/+4
| | | | | | Leave the wrapper to handle TargetLowering::TargetLoweringOpt and CommitTargetLoweringOpt. llvm-svn: 343918
* [SelectionDAG] Add SimplifyDemandedBits to SimplifyDemandedVectorElts ↵Simon Pilgrim2018-10-061-10/+46
| | | | | | | | | | | | | | | | simplification This patch enables SimplifyDemandedBits to call SimplifyDemandedVectorElts in cases where the demanded bits mask covers entire elements of a bitcasted source vector. There are a couple of cases here where simplification at a deeper level (such as through bitcasts) prevents further simplification - CommitTargetLoweringOpt only adds immediate uses/users back to the worklist when we might want to combine the original caller again to see what else it can simplify. As well as that I had to disable handling of bool vector until SimplifyDemandedVectorElts better supports some of their opcodes (SETCC, shifts etc.). Fixes PR39178 Differential Revision: https://reviews.llvm.org/D52935 llvm-svn: 343913
* [RISCV] Compress addiw rd, x0, simm6 to c.li rd, simm6Alex Bradbury2018-10-061-0/+2
| | | | | | | | A pattern was present for addi rd, x0, simm6 but not addiw which is semantically identical when the source register is x0. This patch addresses that, and the benefit can be seen in rv64c-aliases-valid.s. llvm-svn: 343911
* AMDGPU: Consolidate SMRD TableGen patternsTom Stellard2018-10-061-100/+80
| | | | | | | | | | | | | | | | | Summary: Merge the SMRD patterns for CI into the same multiclass as the patterns for other sub-targets. This removes some duplicate code and will make it easier for some future GlobalISel changes I would like to do. Reviewers: arsenm Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D52557 llvm-svn: 343909
* [New PM][PassTiming] implement -time-passes for the new pass managerFedor Sergeev2018-10-052-1/+109
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Enable time-passes functionality through PassInstrumentation callbacks for passes and analyses. TimePassesHandler class keeps all the callbacks, the timing data as it is being collected as well as the stack of currently active timers. Parts of the fix that might be somewhat unobvious: - mapping of passes into Timer (TimingData) can not be done per-instance. PassID name provided into the callback is common for all the pass invocations. Thus the only way to get a timing with reasonable granularity is to collect timing data per pass invocation, getting a new timer for each BeforePass. Hence the key for TimingData uses a pair of <StringRef/unsigned count> to uniquely identify a pass invocation. - consequently, this new-pass-manager implementation performs no aggregation of timing data, reporting timings for each pass invocation separately. In that it differs from legacy-pass-manager time-passes implementation that reports timing data aggregated per pass instance. - pass managers and adaptors are not tracked, similar to how pass managers are not tracked in legacy time-passes. - TimerStack tracks timers that are active, each BeforePass pushes the new timer on stack, each AfterPass pops active timer from stack and stops it. Reviewers: chandlerc, philip.pfaffe Differential Revision: https://reviews.llvm.org/D51276 llvm-svn: 343898
* [AArch64] -mcpu=native CPU detection for Cavium processorsJoel Jones2018-10-051-0/+15
| | | | | | | | | | This small patch updates the CPU detection for Cavium processors when -mcpu=native is passed on compile-line. Patch by Stefan Teleman Differential Revision: https://reviews.llvm.org/D51939 llvm-svn: 343897
* X86, AArch64, ARM: Do not attach debug location to spill/reload instructionsMatthias Braun2018-10-053-27/+19
| | | | | | | | | | | | | | This rebases and recommits r343520. hwasan should be fixed now and this shouldn't break the tests anymore. Spill/reload instructions are artificially generated by the compiler and have no relation to the original source code. So the best thing to do is not attach any debug location to them (instead of just taking the next debug location we find on following instructions). Differential Revision: https://reviews.llvm.org/D52125 llvm-svn: 343895
* [X86][AVX] Limit getFauxShuffleMask INSERT_SUBVECTOR support to 2 inputsSimon Pilgrim2018-10-051-1/+2
| | | | | | | | rL343853 didn't limit the number of subinputs, but we don't currently support faux shuffles with more than 2 total inputs, so put a limiter in place until this is fixed. Found by Artem Dergachev. llvm-svn: 343891
* [LiveDebugValues] Extend var ranges through artificial blocksVedant Kumar2018-10-051-12/+32
| | | | | | | | | | | | | | | | | | | | | | | | | ASan often introduces basic blocks consisting exclusively of instructions without debug locations, or with line 0 debug locations. LiveDebugValues needs to extend variable ranges through these artificial blocks. Otherwise, a lot of variables disappear -- even at -O0. Typically, LiveDebugValues does not extend a variable's range into a block unless the block is essentially "part of" the variable's scope (for a precise definition, see LexicalScopes::dominates). This patch relaxes the lexical dominance check for artificial blocks. This makes the following Swift program debuggable at -O0: ``` 1| var x = 100 2| print("x = \(x)") ``` rdar://39127144 Differential Revision: https://reviews.llvm.org/D52921 llvm-svn: 343890
* Clarify debug output in LiveDebugValuesVedant Kumar2018-10-051-7/+27
| | | | | | | MachineBasicBlocks often do not have names, so it helps to refer to them by block number when printing debug messages. llvm-svn: 343889
* [GlobalIsel] Add llvm.invariant.start and llvm.invariant.endJessica Paquette2018-10-051-0/+8
| | | | | | | | | | | | Port over the implementation in SelectionDAGBuilder.cpp into the IRTranslator and update the arm64-irtranslator test. These were causing fallbacks in CTMark/Bullet (-Rpass-missed=gisel-select), and this patch fixes that. https://reviews.llvm.org/D52945 llvm-svn: 343885
* dwarfdump: Avoid parsing units unnecessarilyDavid Blaikie2018-10-051-15/+14
| | | | | | | | | NFC-ish (the parsing of the units is not a functional change - no errors/warnings are emitted during the shallow parsing - though without parsing them here, the "max version" would be wrong (still zero) later on, so in those cases the units do need to be parsed) llvm-svn: 343884
* [DebugInfo] Add support for DWARF5 call site-related attributesVedant Kumar2018-10-056-3/+147
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | DWARF v5 introduces DW_AT_call_all_calls, a subprogram attribute which indicates that all calls (both regular and tail) within the subprogram have call site entries. The information within these call site entries can be used by a debugger to populate backtraces with synthetic tail call frames. Tail calling frames go missing in backtraces because the frame of the caller is reused by the callee. Call site entries allow a debugger to reconstruct a sequence of (tail) calls which led from one function to another. This improves backtrace quality. There are limitations: tail recursion isn't handled, variables within synthetic frames may not survive to be inspected, etc. This approach is not novel, see: https://gcc.gnu.org/wiki/summit2010?action=AttachFile&do=get&target=jelinek.pdf This patch adds an IR-level flag (DIFlagAllCallsDescribed) which lowers to DW_AT_call_all_calls. It adds the minimal amount of DWARF generation support needed to emit standards-compliant call site entries. For easier deployment, when the debugger tuning is LLDB, the DWARF requirement is adjusted to v4. Testing: Apart from check-{llvm, clang}, I built a stage2 RelWithDebInfo clang binary. Its dSYM passed verification and grew by 1.4% compared to the baseline. 151,879 call site entries were added. rdar://42001377 Differential Revision: https://reviews.llvm.org/D49887 llvm-svn: 343883
* DwarfDebug: Pick next location in case of missing location at block beginMatthias Braun2018-10-052-41/+74
| | | | | | | | | | | | | | | | | Context: Compiler generated instructions do not have a debug location assigned to them. However emitting 0-line records for all of them bloats the line tables for very little benefit so we usually avoid doing that. Not emitting anything will lead to the previous debug location getting applied to the locationless instructions. This is not desirable for block begin and after labels. Previously we would emit simply emit line-0 records in this case, this patch changes the behavior to do a forward search for a debug location in these cases before emitting a line-0 record to further reduce line table bloat. Inspired by the discussion in https://reviews.llvm.org/D52862 llvm-svn: 343874
* [X86] Don't promote i16 compares to i32 if the immediate will fit in 8 bits.Craig Topper2018-10-051-2/+5
| | | | | | | | The comments in this code say we were trying to avoid 16-bit immediates, but if the immediate fits in 8-bits this isn't an issue. This avoids creating a zero extend that probably won't go away. The movmskb related changes are interesting. The movmskb instruction writes a 32-bit result, but fills the upper bits with 0. So the zero_extend we were previously emitting was free, but we turned a -1 immediate that would fit in 8-bits into a 32-bit immediate so it was still bad. llvm-svn: 343871
* [X86] Move ReadAfterLd functionality into X86FoldableSchedWrite (PR36957)Simon Pilgrim2018-10-0520-477/+534
| | | | | | | | | | | | Currently we hardcode instructions with ReadAfterLd if the register operands don't need to be available until the folded load has completed. This doesn't take into account the different load latencies of different memory operands (PR36957). This patch adds a ReadAfterFold def into X86FoldableSchedWrite to replace ReadAfterLd, allowing us to specify the load latency at a scheduler class level. I've added ReadAfterVec*Ld classes that match the XMM/Scl, XMM and YMM/ZMM WriteVecLoad classes that we currently use, we can tweak these values in future patches once this infrastructure is in place. Differential Revision: https://reviews.llvm.org/D52886 llvm-svn: 343868
* [SelectionDAG] allow undefs when matching splat constantsSanjay Patel2018-10-052-11/+9
| | | | | | | | | And use that to transform fsub with zero constant operands. The integer part isn't used yet, but it is proposed for use in D44548, so adding both enhancements here makes that patch simpler. llvm-svn: 343865
* [X86][AVX] getFauxShuffleMask - add support for INSERT_SUBVECTOR subvector ↵Simon Pilgrim2018-10-051-0/+36
| | | | | | | | | | shuffles Decode subvector shuffles from INSERT_SUBVECTOR(SRC0, SHUFFLE(EXTRACT_SUBVECTOR(SRC1)) This was found necessary while investigating PR39161 llvm-svn: 343853
* [LoopVectorizer] Use TTI.getOperandInfo()Jonas Paulsson2018-10-052-71/+51
| | | | | | | | | | | | | | | | Call getOperandInfo() instead of using (near) duplicated code in LoopVectorizationCostModel::getInstructionCost(). This gets the OperandValueKind and OperandValueProperties values for a Value passed as operand to an arithmetic instruction. getOperandInfo() used to be a static method in TargetTransformInfo.cpp, but is now instead a public member. Review: Florian Hahn https://reviews.llvm.org/D52883 llvm-svn: 343852
* [TargetRegisterInfo] Remove temporary hook enableMultipleCopyHints()Jonas Paulsson2018-10-0512-51/+5
| | | | | | | | | | | | Finally all targets are enabling multiple regalloc hints, so the hook to disable this can now be removed. NFC. Review: Simon Pilgrim https://reviews.llvm.org/D52316 llvm-svn: 343851
* Add missing period to comment to match style of file.Neil Henning2018-10-051-1/+1
| | | | | | This is a test commit to show that my commit access is working. llvm-svn: 343842
* AMDGPU/GlobalISel: Add support for G_INTTOPTRTom Stellard2018-10-053-0/+6
| | | | | | | | | | | | | | Summary: This is a no-op. Reviewers: arsenm Reviewed By: arsenm Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D52916 llvm-svn: 343839
* [WebAssembly] Saturating arithmetic intrinsicsThomas Lively2018-10-053-0/+45
| | | | | | | | | | | | Summary: Depends on D52805. Reviewers: aheejin, dschuff Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits Differential Revision: https://reviews.llvm.org/D52813 llvm-svn: 343833
* [WebAssembly] Fixed missing "global" symbol type in AsmParser.Wouter van Oortmerssen2018-10-041-0/+1
| | | | | | | | | | | | | | | Summary: These are emitted by the wasm backend for e.g. __stack_pointer@GLOBAL which previously wasn't accepted by the assembler. Reviewers: aheejin, dschuff Subscribers: sbc100, jgravelle-google, llvm-commits, sunfish Differential Revision: https://reviews.llvm.org/D52911 llvm-svn: 343830
* [globalisel][combine] When placing truncates, handle the case when the BB is ↵Daniel Sanders2018-10-041-14/+28
| | | | | | | | | empty GlobalISel uses MIR with implicit fallthrough on each basic block. As a result, getFirstNonPhi() can return end(). llvm-svn: 343829
* [SimplifyCFG] Pass AggressiveInsts to DominatesMergePoint by reference. ↵Craig Topper2018-10-041-11/+6
| | | | | | | | | | | | | | | | | | | Remove null check. Summary: At some point in the past the recursion in DominatesMergePoint used to pass null for AggressiveInsts as part of the recursion. It no longer does this. So there is no way for AggressiveInsts to be null. This passes it by reference and removes the null check to make this explicit. Reviewers: efriedma, reames Reviewed By: efriedma Subscribers: xbolva00, llvm-commits Differential Revision: https://reviews.llvm.org/D52575 llvm-svn: 343828
* [WebAssembly] Ignore DBG_VALUE in WebAssemblyCFGStackify pass when looking ↵Yury Delendik2018-10-041-0/+4
| | | | | | | | | | | | | | | | | | | for block start Summary: Fixes https://bugs.llvm.org/show_bug.cgi?id=39158 and regression caused by D49034. Though it is possible the problem was existed before and was exposed by additional DBG_VALUEs. Reviewers: sunfish, dschuff, aheejin Reviewed By: aheejin Subscribers: sbc100, aheejin, llvm-commits, alexcrichton, jgravelle-google Differential Revision: https://reviews.llvm.org/D52837 llvm-svn: 343827
* [RISCV] Support named operands for CSR instructions.Ana Pazos2018-10-0417-88/+637
| | | | | | | | | | | | Reviewers: asb, mgrang Reviewed By: asb Subscribers: jocewei, mgorny, jfb, PkmX, MartinMosbeck, brucehoult, the_o, rkruppe, rogfer01, rbar, johnrusso, simoncook, jordy.potman.lists, sabuasal, niosHD, kito-cheng, shiva0217, zzheng, edward-jones Differential Revision: https://reviews.llvm.org/D46759 llvm-svn: 343822
* [globalisel][combine] Fix a rare crash when encountering an instruction ↵Daniel Sanders2018-10-041-7/+5
| | | | | | | | | | | | whose op0 isn't a reg The simplest instance of this is an intrinsic with no results which will have the intrinsic ID as operand 0. Also fix some benign incorrectness when op0 is a reg but isn't a def that was guarded against by checking for the extension opcodes. llvm-svn: 343821
* [InstCombine] drop poison flags in SimplifyVectorDemandedEltsSanjay Patel2018-10-041-2/+5
| | | | | | | | | | | | | | | We established the (unfortunately complicated) rules for UB/poison propagation with vector ops in: D48893 D48987 D49047 It's clear from the affected tests that we are potentially creating poison where none existed before the transforms. For add/sub/mul, the answer is simple: just drop the flags because the extra undef vector lanes are generally more valuable for analysis and codegen. llvm-svn: 343819
* [X86][LegalizeVectorOps] Use MERGE_VALUES to return two results from ↵Craig Topper2018-10-042-22/+9
| | | | | | | | | | LowerLoad. Remove special case code in LegalizeVectorOps that allowed us to only return one result. Previously we replaced the chain use ourself and return the data result. LegalizeVectorOps then detected that we'd done this and assumed the chain had already been handled. This commit instead returns a MERGE_VALUES node with two results joined from nodes. This allows LegalizeVectorOps to do all the replacements for us without any special casing. The MERGE_VALUES will be removed by DAG combine. llvm-svn: 343817
* [SimplifyCFG] Change recursive calls to llvm::SimplifyCFG to instead use an ↵Craig Topper2018-10-041-29/+54
| | | | | | | | | | | | | | | | | | | outer while loop to revisit. Summary: The llvm::SimplifyCFG function creates a SimplifyCFGOpt object and calls run on it. There were numerous places reached from this run function that called back out llvm::SimplifyCFG which would create another SimplifyCFGOpt object. This is an inefficient use of stack space at minimum. We are also not passing along the LoopHeaders pointer passed into the outer llvm::SimplifyCFG call. So if its not null we lose it on the first recursion and get nullptr from there on. This patch adds an outer loop around the main BasicBlock simplifying code and adds a flag to the SimplifyCFGOpt class that can be set by to request another iteration. I don't think we can iterate based just on the change flag alone since some of the simplifications delete a basic block entirely leaving nothing to iterate on. Reviewers: bogner, eli.friedman, reames Reviewed By: reames Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D52760 llvm-svn: 343816
* [WebAssembly] Don't modify preds/succs iterators while erasing from themHeejin Ahn2018-10-041-4/+7
| | | | | | | | | | | | | | Summary: This caused out-of-bound bugs. Found by `-DLLVM_ENABLE_EXPENSIVE_CHECKS=ON`. Reviewers: dschuff Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits Differential Revision: https://reviews.llvm.org/D52902 llvm-svn: 343814
* AMDGPU: Rename isAmdCodeObjectV2 -> isAmdHsaOrMesaKonstantin Zhuravlyov2018-10-045-18/+16
| | | | | | | | | | | | The isAmdCodeObjectV2 is a misleading name which actually checks whether the os is amdhsa or mesa. Also add a test to make sure we do not generate old kernel header for code object v3. Differential Revision: https://reviews.llvm.org/D52897 llvm-svn: 343813
* [COFF] [X86] Don't use llvm_unreachable for unsupported relocation typesMartin Storsjo2018-10-041-2/+4
| | | | | | | | | | | | | | | | This can happen if assembling a reference to _GLOBAL_OFFSET_TABLE_. While it doesn't make sense to try to assemble that for COFF, the fact that we previously used llvm_unreachable meant that the code had undefined behaviour if something tried to assemble that. The configure script of libgmp would try to assemble such a snippet (which should signal a failure). If llvm is built without assertions, the undefined behaviour meant a (near) infinite loop. Differential Revision: https://reviews.llvm.org/D52903 llvm-svn: 343811
* [InstCombine] reduce code duplication in SimplifyDemandedVectorElts; NFCISanjay Patel2018-10-041-91/+42
| | | | llvm-svn: 343806
* Give same-named members unique timestamps on Darwin in llvm-ar.James Y Knight2018-10-041-7/+70
| | | | | | | | | | | | | This change ensures that the (membername,timestamp) tuple uniquely identifies an entry in an archive for format=darwin, in deterministic mode (which is the default). That, then, enables lldb and dsymutil to locate the appropriate object within the archive. Differential Revision: https://reviews.llvm.org/D47659 llvm-svn: 343805
* [globalisel][combine] Improve the truncate placement for the extending-loads ↵Daniel Sanders2018-10-041-28/+64
| | | | | | | | | | | | | | | | | | combine This brings the extending loads patch back to the original intent but minus the PHI bug and with another small improvement to de-dupe truncates that are inserted into the same block. The truncates are sunk to their uses unless this would require inserting before a phi in which case it sinks to the _beginning_ of the predecessor block for that path (but no earlier than the def). The reason for choosing the beginning of the predecessor is that it makes de-duping multiple truncates in the same block simple, and optimized code is going to run a scheduler at some point which will likely change the position anyway. llvm-svn: 343804
* AArch64: Fix XSeqPairs/WSeqPairs problemsMatthias Braun2018-10-041-18/+68
| | | | | | | | | | - Fix spill/reloads of XSeqPairs failing with vregs (only physregs worked correctly) - Add missing spill/reload code for WSeqPairs class Differential Revision: https://reviews.llvm.org/D52761 llvm-svn: 343799
OpenPOWER on IntegriCloud