path: root/llvm/lib
Commit message (Author, Age, Files, Lines)
...
* [nios2] Add missing Nios2CodeGen -> Nios2AsmPrinter linkage (Michal Gorny, 2018-11-21, 1 file, -0/+1)
| | | | | | | | | | | | | | | | Add missing linkage from Nios2CodeGen library to Nios2AsmPrinter library. The missing dependency causes shared-lib build to fail with the following reason: lib/Target/Nios2/CMakeFiles/LLVMNios2CodeGen.dir/Nios2AsmPrinter.cpp.o: In function `(anonymous namespace)::Nios2AsmPrinter::PrintAsmMemoryOperand(llvm::MachineInstr const*, unsigned int, unsigned int, char const*, llvm::raw_ostream&)': Nios2AsmPrinter.cpp:(.text._ZN12_GLOBAL__N_115Nios2AsmPrinter21PrintAsmMemoryOperandEPKN4llvm12MachineInstrEjjPKcRNS1_11raw_ostreamE+0x2b): undefined reference to `llvm::Nios2InstPrinter::getRegisterName(unsigned int)' lib/Target/Nios2/CMakeFiles/LLVMNios2CodeGen.dir/Nios2AsmPrinter.cpp.o: In function `(anonymous namespace)::Nios2AsmPrinter::PrintAsmOperand(llvm::MachineInstr const*, unsigned int, unsigned int, char const*, llvm::raw_ostream&)': Nios2AsmPrinter.cpp:(.text._ZN12_GLOBAL__N_115Nios2AsmPrinter15PrintAsmOperandEPKN4llvm12MachineInstrEjjPKcRNS1_11raw_ostreamE+0x97): undefined reference to `llvm::Nios2InstPrinter::getRegisterName(unsigned int)' collect2: error: ld returned 1 exit status Differential Revision: https://reviews.llvm.org/D47810 llvm-svn: 347387
* [X86][AVX] Remove BROADCAST if we only need the 0'th element (Simon Pilgrim, 2018-11-21, 1 file, -0/+7)
| | | | | | We don't catch this with target shuffle simplification if the src/dst types are different. llvm-svn: 347386
* Test commit: Delete trailing space in comment (Nikita Popov, 2018-11-21, 1 file, -1/+1)
| | | | llvm-svn: 347385
* [X86] In getScalarMaskingNode, replace scalar_to_vector with a bitcast to v8i1 and an extract_subvector to convert i8 to v1i1. (Craig Topper, 2018-11-21, 1 file, -1/+3)
| | | | | | | | The bitcast can be nicely merged with any i8 loads that exist for argument passing in 32-bit mode, for example. llvm-svn: 347380
* [LVI] run transfer function for binary operator even when the RHS isn't a constant (John Regehr, 2018-11-21, 1 file, -36/+39)
| | | | | | | | | | | | | LVI was symbolically executing binary operators only when the RHS was constant, missing the case where we have a ConstantRange for the RHS, but not an actual constant. Tested using check-all and by bootstrapping. Compile time is not impacted measurably. Differential Revision: https://reviews.llvm.org/D19859 llvm-svn: 347379
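The idea in the LVI commit above can be illustrated with a toy interval type. This is only a sketch, not LLVM's ConstantRange: the `Range` struct and both helpers are hypothetical names, and unlike ConstantRange the sketch assumes the sums do not overflow. A constant RHS is just the degenerate range [C, C], so one transfer function covers both cases.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical inclusive interval [Lo, Hi]; stands in for a value range.
struct Range {
  int64_t Lo;
  int64_t Hi;
};

// A constant is the degenerate range containing exactly one value.
inline Range fromConstant(int64_t C) { return Range{C, C}; }

// Transfer function for 'add': the result range is bounded by the sums of
// the lower bounds and of the upper bounds. Assumes no overflow occurs.
inline Range addRanges(Range L, Range R) {
  assert(L.Lo <= L.Hi && R.Lo <= R.Hi && "malformed range");
  return Range{L.Lo + R.Lo, L.Hi + R.Hi};
}
```

With this shape, a RHS known only as the range [5, 7] still narrows the result, where a constant-only rule would have to give up.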
* [PowerPC] Do not use vectors to codegen bswap with Altivec turned off (Nemanja Ivanovic, 2018-11-21, 1 file, -2/+4)
| | | | | | | | | | | | | | We have efficient codegen on P9 for lowering bswap that involves moving the value into a vector reg and moving it back. However, the check under which we custom lowered it did not adequately reflect the actual requirements. It required only that the subtarget be an implementation of ISA 3.0 since all compliant implementations have to provide the vector instructions. However, the kernel builds have a valid use case for -mno-altivec -mcpu=pwr9 (i.e. don't emit vector code, don't have to save vector regs for context switch). So we should require the correct features for this lowering. Fixes https://bugs.llvm.org/show_bug.cgi?id=39334 llvm-svn: 347376
* [X86] Correct 256 vpmovzx/vpmovsx isel patterns to check HasAVX2 instead of HasAVX to prevent fast-isel from using them incorrectly. (Craig Topper, 2018-11-21, 1 file, -8/+8)
| | | | | | | | | | These are AVX2 instructions, but have been incorrectly marked in tablegen for a while. This wasn't a problem until r346784 switched the patterns to use target independent ISD opcodes. This made the patterns visible to fast isel. Fixes PR39733 llvm-svn: 347375
* [X86] Emit a PACKUS instead of a VECTOR_SHUFFLE from LowerTRUNCATE for v16i16->v16i8. (Craig Topper, 2018-11-20, 1 file, -7/+2)
| | | | | | | | | | We can't guarantee that demanded bits passing through the vector shuffle won't cause the AND in front of this to be removed. This would prevent the PACKUS from being matched during shuffle lowering. Unfortunately, this adds a packuswb to one of the vector-reduce-mul.ll tests since we were removing the shuffle via SimplifyDemandedVectorElts. We appear to have similar issues with vpmovwb on the same test case on other targets. llvm-svn: 347361
* [DAGCombiner] look through bitcasts when trying to narrow vector binops (Sanjay Patel, 2018-11-20, 1 file, -13/+14)
| | | | | | | | | | | | | | | | | | | This is another step in vector narrowing - a follow-up to D53784 (and hoping to eventually squash potential regressions seen in D51553). The x86 test diffs are wins, but the AArch64 diff is probably not. That problem already exists independent of this patch (see PR39722), but it went unnoticed in the previous patch because there were no regression tests that showed the possibility. The x86 diff in i64-mem-copy.ll is close. Given the frequency throttling concerns with using wider vector ops, an extra extract to reduce vector width is the right trade-off at this level of codegen. Differential Revision: https://reviews.llvm.org/D54392 llvm-svn: 347356
* [CodeView] Add support for ref-qualified member functions. (Zachary Turner, 2018-11-20, 3 files, -21/+51)
| | | | | | | | | | | | | | | | | | | | | | When you have a member function with a ref-qualifier, for example: struct Foo { void Func() &; void Func2() &&; }; clang-cl was not emitting this information. Doing so is a bit awkward, because it's not a property of the LF_MFUNCTION type, which is what you'd expect. Instead, it's a property of the this pointer which is actually an LF_POINTER. This record has an attributes bitmask on it, and our handling of this bitmask was all wrong. We had some parts of the bitmask defined incorrectly, but importantly for this bug, we didn't know about these extra 2 bits that represent the ref qualifier at all. Differential Revision: https://reviews.llvm.org/D54667 llvm-svn: 347354
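For readers unfamiliar with ref-qualifiers, the sketch below shows what the two qualifiers in the commit message mean at the language level. It reuses the `Foo`/`Func` names from the message but, as an assumption for illustration, overloads a single name so that the choice made by overload resolution is observable; `callOnLvalue` and `callOnRvalue` are hypothetical helpers. It says nothing about how the CodeView record itself is encoded.

```cpp
#include <cassert>
#include <string>
#include <utility>

// One name, two ref-qualified overloads: the qualifier constrains the value
// category of the implicit object ('*this').
struct Foo {
  std::string Func() & { return "lvalue"; }   // selected when *this is an lvalue
  std::string Func() && { return "rvalue"; }  // selected when *this is an rvalue
};

// Calls Func on a named object, which is an lvalue.
inline std::string callOnLvalue() {
  Foo F;
  return F.Func();
}

// Calls Func on an object cast to an rvalue with std::move.
inline std::string callOnRvalue() {
  Foo F;
  return std::move(F).Func();
}
```

Because the qualifier describes the implicit object rather than the function's parameter list, it is plausible that the debug info attaches it to the `this` pointer record, as the commit describes.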
* [CodeView] Mark this pointers as const. (Zachary Turner, 2018-11-20, 1 file, -0/+3)
| | | | | | | | | | | This is for compatibility with MSVC, which also marks this pointers as being const-qualified. Fixes llvm.org/pr36526 Differential Revision: https://reviews.llvm.org/D54736 llvm-svn: 347353
* [X86] Emit a single shuffle for the v16i8->v4i32 step of a SIGN_EXTEND_VECTOR_INREG lowering on pre-sse4.1 targets. (Craig Topper, 2018-11-20, 1 file, -54/+28)
| | | | | | | | | | Previously we emitted two separate shuffles, one for unpcklbw and one for unpcklwd. Instead emit a single shuffle equivalent to both of the original shuffles. Shuffle lowering seems able to handle it. This avoids a bitcast between the two shuffles which seems helpful to DAG combine. Remove the custom type legalization for v8i8->v8i32. I had put that in to avoid some almost duplicate punpcklbw instructions I was seeing, but this lowering change seems to fix that. It also fixes some duplicate shuffles seen in vector-sext.ll llvm-svn: 347348
* [WebAssembly] WebAssemblyLowerEmscriptenEHSjLj: use getter/setter for accessing tempRet0 (Sam Clegg, 2018-11-20, 1 file, -40/+45)
| | | | | | | | | | | | | | | | | | | | | | | | | | | Rather than assuming that `tempRet0` exists in linear memory, only assume the getter/setter functions exist. This avoids conflicting with binaryen, which declares a wasm global for this purpose and defines its own getter and setter for it. The other advantage of doing things this way is that it leaves it up to the linker/finalizer to decide how to actually store this temporary. As it happens binaryen uses a wasm global which is more appropriate since it is thread safe. This also allows us to change the way this is stored in the future (memory, TLS memory, wasm global) without modifying LLVM. This is part of a 4 part change: LLVM: https://reviews.llvm.org/D53240 fastcomp: https://github.com/kripken/emscripten-fastcomp/pull/237 emscripten: https://github.com/kripken/emscripten/pull/7358 binaryen: https://github.com/WebAssembly/binaryen/pull/1709 Differential Revision: https://reviews.llvm.org/D53240 llvm-svn: 347340
* [InstSimplify] fold funnel shifts with undef operands (Sanjay Patel, 2018-11-20, 1 file, -1/+10)
| | | | | | Splitting these off from D54666. Patch by: nikic (Nikita Popov) llvm-svn: 347332
* [InstructionSimplify] Add support for saturating add/sub (Sanjay Patel, 2018-11-20, 1 file, -0/+34)
| | | | | | | | | | | | | | | | | | | | | | Add support for saturating add/sub in InstructionSimplify. In particular, the following simplifications are supported: sat(X + 0) -> X; sat(X + undef) -> -1; sat(X uadd MAX) -> MAX (and commutative variants); sat(X - 0) -> X; sat(X - X) -> 0; sat(X - undef) -> 0; sat(undef - X) -> 0; sat(0 usub X) -> 0; sat(X usub MAX) -> 0. Patch by: @nikic (Nikita Popov) Differential Revision: https://reviews.llvm.org/D54532 llvm-svn: 347330
* [ConstantFolding] Add support for saturating add/sub (Sanjay Patel, 2018-11-20, 1 file, -0/+12)
| | | | | | | | | | Support saturating add/sub in constant folding, based on the APInt methods introduced in D54332. Patch by: @nikic (Nikita Popov) Differential Revision: https://reviews.llvm.org/D54531 llvm-svn: 347328
* [LoopSink] Add preheader to alias set (Guozhi Wei, 2018-11-20, 1 file, -0/+1)
| | | | | | | | | | This patch fixes PR39695. The original LoopSink only considers memory aliasing in the loop body, but PR39695 shows that instructions following the sink candidate in the preheader should also be checked. This is a conservative patch: it simply adds the whole preheader block to the alias set. It may lose some optimization opportunities, but I think that is very rare because: 1. in the most common case, a store/load to the same address, the load should already be optimized away; 2. the preheader is usually not very large. Differential Revision: https://reviews.llvm.org/D54659 llvm-svn: 347325
* [APInt] Add methods for saturated add and sub (Sanjay Patel, 2018-11-20, 1 file, -0/+36)
| | | | | | | | | | | | This adds the sadd_sat, uadd_sat, ssub_sat, usub_sat methods for performing saturating additions and subtractions to APInt. Split out from D54237. Patch by: nikic (Nikita Popov) Differential Revision: https://reviews.llvm.org/D54332 llvm-svn: 347324
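To make the semantics of the four operations concrete, here is a minimal sketch on fixed 8-bit integers. It is not the APInt implementation: the function names are hypothetical, and the code widens to a larger type and clamps, which is an approach chosen here for clarity rather than anything the commit states.

```cpp
#include <cassert>
#include <cstdint>

// Unsigned saturating add: results above 255 clamp to 255.
inline uint8_t uaddSat8(uint8_t A, uint8_t B) {
  unsigned Sum = unsigned(A) + unsigned(B);
  return Sum > 0xFFu ? uint8_t(0xFF) : uint8_t(Sum);
}

// Unsigned saturating sub: results below 0 clamp to 0.
inline uint8_t usubSat8(uint8_t A, uint8_t B) {
  return A > B ? uint8_t(A - B) : uint8_t(0);
}

// Clamp a wide signed value into the int8_t range [-128, 127].
inline int8_t clampToI8(int V) {
  if (V > 127)
    return 127;
  if (V < -128)
    return -128;
  return int8_t(V);
}

// Signed saturating add/sub: compute exactly in 'int', then clamp.
inline int8_t saddSat8(int8_t A, int8_t B) { return clampToI8(int(A) + int(B)); }
inline int8_t ssubSat8(int8_t A, int8_t B) { return clampToI8(int(A) - int(B)); }
```

Several of the InstructionSimplify folds listed in r347330 can be checked against this model, for example that an unsigned add with MAX always yields MAX and that an unsigned subtract from 0 always yields 0.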
* [DAGCombine] Add calls to SimplifyDemandedVectorElts from visitINSERT_SUBVECTOR (PR37989) (Simon Pilgrim, 2018-11-20, 2 files, -1/+5)
| | | | | | | | This uncovered an off-by-one typo in SimplifyDemandedVectorElts's INSERT_SUBVECTOR handling as its bounds check was bailing on safe indices. llvm-svn: 347313
* [PowerPC] Add Itineraries for STWU/STWUX etc (Jinsong Ji, 2018-11-20, 15 files, -54/+50)
| | | | | | | | | | | | | | | | | | | | | | | | | | When doing some instruction scheduling work, we noticed some missing itineraries. Before we switch to the machine scheduler, those missing itineraries might not have an impact on actual scheduling, because we can still get the same latency due to default values. With the machine scheduler, however, itineraries will have an impact on scheduling. E.g. NumMicroOps will default to 0 if there are no itineraries for a specific instruction class, while most instruction classes with itineraries have NumMicroOps default to 1. This affects the count of RetiredMOps and the Pending/Available queues, which in turn causes different or suboptimal scheduling. This patch is for STWU/STWUX (IIC_LdStStoreUpd) for P8. Since there are already multiple IICs for store update, this patch also merges IIC_LdStSTDU/IIC_LdStStoreUpd into IIC_LdStSTU and IIC_LdStSTDUX into IIC_LdStSTUX, and we add a new testcase in https://reviews.llvm.org/D54699 to show the difference. Differential Revision: https://reviews.llvm.org/D54700 llvm-svn: 347311
* Fix MSVC 'truncation of constant value' warning. NFCI. (Simon Pilgrim, 2018-11-20, 1 file, -1/+1)
| | | | llvm-svn: 347308
* [X86][SSE] Add computeKnownBits/ComputeNumSignBits support for PACKSS/PACKUS instructions. (Simon Pilgrim, 2018-11-20, 1 file, -26/+53)
| | | | | | | | Pull out getPackDemandedElts demanded elts remapping helper from computeKnownBitsForTargetNode and use in computeKnownBits/ComputeNumSignBits. llvm-svn: 347303
* [X86][SSE] XFormVExtractWithShuffleIntoLoad - getVectorShuffle won't accept SM_SentinelZero (Simon Pilgrim, 2018-11-20, 1 file, -2/+6)
| | | | | | | | Noticed while working on improving demanded elts target shuffle combining llvm-svn: 347302
* [TargetLowering] Improve SimplifyDemandedVectorElts/SimplifyDemandedBits support (Simon Pilgrim, 2018-11-20, 1 file, -0/+17)
| | | | | | | | | | For bitcast nodes from larger element types, add the ability for SimplifyDemandedVectorElts to call SimplifyDemandedBits by merging the elts mask to a bits mask. I've raised https://bugs.llvm.org/show_bug.cgi?id=39689 to deal with the few places where SimplifyDemandedBits's lack of vector handling is a problem. Differential Revision: https://reviews.llvm.org/D54679 llvm-svn: 347301
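One way to picture "merging the elts mask to a bits mask" is sketched below. This is an illustration under stated assumptions, not the code from D54679: the function name and parameters are hypothetical, the masks are plain integers rather than APInt, and element `I` of the narrow vector is assumed to occupy sub-element slot `I % Scale` of its wide source element, counted from the low bits (a little-endian layout).

```cpp
#include <cassert>
#include <cstdint>

// Given a mask of demanded narrow elements (bit I set = element I demanded)
// for a bitcast such as v2i64 -> v4i32, compute which bits of a wide source
// element are demanded by at least one narrow element.
//   NumElts: number of narrow result elements
//   EltBits: width of one narrow element
//   Scale:   narrow elements per wide source element
inline uint64_t demandedSrcBits(uint32_t DemandedElts, unsigned NumElts,
                                unsigned EltBits, unsigned Scale) {
  assert(NumElts <= 32 && EltBits >= 1 && Scale >= 1 &&
         EltBits * Scale <= 64 && "sketch only handles masks up to 64 bits");
  uint64_t EltMask = EltBits == 64 ? ~0ULL : ((1ULL << EltBits) - 1);
  uint64_t Bits = 0;
  for (unsigned I = 0; I != NumElts; ++I)
    if (DemandedElts & (1u << I))
      Bits |= EltMask << ((I % Scale) * EltBits); // slot within the wide element
  return Bits;
}
```

For v2i64 -> v4i32, demanding only elements 0 and 2 means only the low 32 bits of each i64 source element matter, which is the kind of fact a bit-level simplification can then exploit.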
* [X86][SSE] Lower immediately to PACKUS instead of VECTOR_SHUFFLE. (Simon Pilgrim, 2018-11-20, 1 file, -12/+4)
| | | | | | As discussed on rL347240, this avoids some regressions on D54679 and also helps some combines to kick in a bit earlier. llvm-svn: 347300
* [X86][SSE] Add SimplifyDemandedVectorElts support for PACKSS/PACKUS instructions. (Simon Pilgrim, 2018-11-20, 1 file, -0/+30)
| | | | | | | | As discussed on rL347240. llvm-svn: 347299
* [X86] Preserve undef information when creating a punpckl/hbw from a v16i8 where all the even or odd elements are undef. (Craig Topper, 2018-11-20, 1 file, -0/+4)
| | | | | | | | | | | | Previously if V2 was unused we ended up using V1 for both inputs as part of the code that follows the new code. By using lowerVectorShuffleWithUNPCK we keep the undef nature of V2 in the output. As near as I can tell this makes v16i8 behavior consistent with every other VT now. This does mean that we give the register allocator freedom to fill in random registers now and create false dependencies. But like I said we're already doing that for other types. llvm-svn: 347296
* [X86] Add custom type legalization for v8i8->v8i32 sign extend pre-SSE4.1 (Craig Topper, 2018-11-20, 1 file, -0/+33)
| | | | | | This helps with a future patch and makes us less reliant on DAG combine merging shuffles. llvm-svn: 347295
* [X86] Replace more calls to getZeroVector with regular getConstant. (Craig Topper, 2018-11-20, 1 file, -23/+20)
| | | | | | | | getZeroVector produces a specifically canonicalized zero vector, but we can just let DAG legalization take care of it. The test changes are because MULH lowering happens later than it should and this change gave us the opportunity to constant fold away a multiply during a DAG combine before the build_vector got legalized with a bitcast. llvm-svn: 347290
* Recommit "[LoopSimplifyCFG] Teach LoopSimplifyCFG to constant-fold branches and switches" (Max Kazantsev, 2018-11-20, 1 file, -0/+315)
| | | | | | | | | | | The initial version of the patch lacked Phi node updates in destinations of removed edges. This version contains this update and tests for this situation. Differential Revision: https://reviews.llvm.org/D54021 llvm-svn: 347289
* [PowerPC] Don't combine to bswap store on 1-byte truncating store (Nemanja Ivanovic, 2018-11-20, 1 file, -2/+3)
| | | | | | | | | | Turns out that there was no check for a store that truncates down to a single byte when combining a (store (bswap...)) into a byte-swapping store. This patch just adds that check. Fixes https://bugs.llvm.org/show_bug.cgi?id=39478. llvm-svn: 347288
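The arithmetic behind this case can be sketched in plain C++. This only models the value semantics of `(store (bswap x))` truncated to one byte; it does not model the PowerPC byte-reversed store instructions, and both function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Portable 32-bit byte swap.
constexpr uint32_t bswap32(uint32_t V) {
  return (V >> 24) | ((V >> 8) & 0x0000FF00u) | ((V << 8) & 0x00FF0000u) |
         (V << 24);
}

// The byte written by a store of bswap(V) that truncates to 8 bits: the low
// byte of the swapped value, i.e. the HIGH byte of the original V.
constexpr uint8_t truncStoreOfBswap(uint32_t V) { return uint8_t(bswap32(V)); }
```

Since reversing a single byte is a no-op, a one-byte store has no byte-reversed form that could produce the original value's high byte, which is a plausible reading of why the combine must be skipped here.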
* [SelectionDAG] Compute known bits and num sign bits for live out vector registers. Use it to add AssertZExt/AssertSExt in the live in basic blocks (Craig Topper, 2018-11-20, 2 files, -4/+4)
| | | | | | | | | | | | | | | | | | | Summary: We already support this for scalars, but it was explicitly disabled for vectors. In the updated test cases this allows us to see the upper bits are zero to use fewer multiply instructions to emulate a 64 bit multiply. This should help with this ispc issue that a coworker pointed me to https://github.com/ispc/ispc/issues/1362 Reviewers: spatel, efriedma, RKSimon, arsenm Reviewed By: spatel Subscribers: wdng, llvm-commits Differential Revision: https://reviews.llvm.org/D54725 llvm-svn: 347287
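Why known-zero upper bits save multiplies can be seen in a scalar analogue. The sketch below is an assumption-laden illustration, not the vector lowering the commit affects: it emulates a 64-bit multiply from 32-bit halves, and the function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// General 64x64 -> 64 multiply from 32-bit halves: three partial products
// (the AHi*BHi term only affects bits at or above 2^64 and is dropped).
inline uint64_t mul64Generic(uint64_t A, uint64_t B) {
  uint64_t ALo = uint32_t(A), AHi = A >> 32;
  uint64_t BLo = uint32_t(B), BHi = B >> 32;
  uint64_t Cross = (ALo * BHi + AHi * BLo) << 32; // wraps modulo 2^64
  return ALo * BLo + Cross;
}

// If the upper 32 bits of both operands are known to be zero, both cross
// terms vanish and a single 32x32 -> 64 multiply is enough.
inline uint64_t mul64UpperZero(uint64_t A, uint64_t B) {
  return uint64_t(uint32_t(A)) * uint32_t(B);
}
```

An AssertZExt on a live-in value is the sort of information that lets the compiler pick the second form instead of the first.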
* [ExecutionEngine][Interpreter] Fix out-of-bounds array access. (Lang Hames, 2018-11-20, 1 file, -1/+2)
| | | | | | | | | | If args is empty then accessing element 0 is illegal. https://reviews.llvm.org/D53556 Patch by Eugene Sharygin. Thanks Eugene! llvm-svn: 347281
* [DAGCombiner] reduce code duplication in visitXOR; NFC (Sanjay Patel, 2018-11-20, 1 file, -32/+29)
| | | | llvm-svn: 347278
* [WebAssembly] Remove unused function return types (NFC) (Heejin Ahn, 2018-11-20, 1 file, -6/+4)
| | | | | | | | | | Reviewers: sbc100 Subscribers: dschuff, jgravelle-google, sunfish, llvm-commits Differential Revision: https://reviews.llvm.org/D54734 llvm-svn: 347277
* [CodeView] Don't print PointerAttributes when dumping. (Zachary Turner, 2018-11-20, 1 file, -1/+0)
| | | | | | | | PointerAttributes is a bitwise-or of several other fields, each of which is already printed on its own line with a better explanation. So this doesn't really help much. llvm-svn: 347275
* Implement computeKnownBits for scalar_to_vector (Stanislav Mekhanoshin, 2018-11-19, 1 file, -0/+13)
| | | | | | Differential Revision: https://reviews.llvm.org/D54728 llvm-svn: 347274
* [Transforms] Prefer static and avoid namespaces, NFC (Reid Kleckner, 2018-11-19, 1 file, -10/+6)
| | | | | | | | | | | | | Put 'static' on three functions in an anonymous namespace as per our coding style. Remove the 'namespace llvm {}' around the .cpp file and explicitly declare the free function 'llvm::optimizeGlobalCtorsList' in 'llvm::'. I prefer this style for free functions because the compiler will error out if the .h and .cpp files don't agree on the function name or prototype. llvm-svn: 347269
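The style described in this commit looks roughly like the sketch below. The `demo` namespace and both function names are hypothetical stand-ins; only the pattern (a `static` file-local helper plus a namespace-qualified definition of the public function) is taken from the commit message.

```cpp
#include <cassert>

// What the header would contain: a declaration inside the namespace.
namespace demo {
int clampedCount(int N);
} // namespace demo

// File-local helper: 'static' gives it internal linkage, so no anonymous
// namespace is needed.
static int clampNonNegative(int N) { return N < 0 ? 0 : N; }

// Qualified definition outside any 'namespace demo {}' block. A qualified
// name must refer to an existing declaration, so a mismatch in name or
// signature is rejected by the compiler instead of silently declaring a new
// overload.
int demo::clampedCount(int N) { return clampNonNegative(N); }
```

Had the definition been written inside `namespace demo { ... }` with a mistyped signature, it would compile as a separate function and the error would surface only at link time, if at all.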
* [X86] Rename combineVSZext->combineExtendVectorInreg. NFC (Craig Topper, 2018-11-19, 1 file, -4/+4)
| | | | | | Now that we no longer have target specific vector extend nodes let's make the function name match the nodes we do use. llvm-svn: 347268
* AMDGPU: Fix V_FMA_F16 selection on GFX9 (Konstantin Zhuravlyov, 2018-11-19, 1 file, -2/+8)
| | | | | | | | GFX9 should select opsel version. Differential Revision: https://reviews.llvm.org/D54545 llvm-svn: 347265
* Revert "[LoopSimplifyCFG] Teach LoopSimplifyCFG to constant-fold branches and switches" (Benjamin Kramer, 2018-11-19, 1 file, -313/+0)
| | | | | | | | This reverts commits r347183 & r347184. Crashes while building libxml. llvm-svn: 347260
* [AMDGPU] Restored selection of scalar_to_vector (v2x16) (Stanislav Mekhanoshin, 2018-11-19, 1 file, -9/+9)
| | | | | | | | | This works if DAG combiner is enabled, but without combining we cannot select scalar_to_vector of <2 x half> and <2 x i16>. Differential Revision: https://reviews.llvm.org/D54718 llvm-svn: 347259
* [InstCombine] Set debug loc on `mergeStoreIntoSuccessor` phi (Vedant Kumar, 2018-11-19, 1 file, -2/+6)
| | | | | | | | | Assigning a merged debug location to the `mergeStoreIntoSuccessor` phi improves backtrace quality. Fixes llvm.org/PR38083. llvm-svn: 347257
* [IR] Add hasNPredecessors, hasNPredecessorsOrMore to BasicBlock (Vedant Kumar, 2018-11-19, 8 files, -22/+19)
| | | | | | | | | | | | | | | | | | | | | | | Add methods to BasicBlock which make it easier to efficiently check whether a block has N (or more) predecessors. This can be more efficient than using pred_size(), which is a linear time operation. We might consider adding similar methods for successors. I haven't done so in this patch because succ_size() is already O(1). With this patch applied, I measured a 0.065% compile-time reduction in user time for running `opt -O3` on the sqlite3 amalgamation (30 trials). The change in mergeStoreIntoSuccessor alone saves 45 million linked list iterations in a stage2 Release build of llc. See llvm.org/PR39702 for a harder but more general way of achieving similar results. Differential Revision: https://reviews.llvm.org/D54686 llvm-svn: 347256
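The efficiency argument is that answering "are there exactly N?" or "at least N?" needs at most about N steps, while computing the full size walks the whole list. A generic sketch over iterators follows; the helper names are hypothetical and this is not the BasicBlock code from the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <forward_list>

// True iff [Begin, End) holds exactly N elements. Advances at most N times,
// then checks whether the range is exhausted.
template <typename It> bool hasExactlyN(It Begin, It End, std::size_t N) {
  for (; N != 0; --N, ++Begin)
    if (Begin == End)
      return false; // ran out before reaching N
  return Begin == End;
}

// True iff [Begin, End) holds at least N elements. Stops as soon as N
// elements have been seen, regardless of how long the range is.
template <typename It> bool hasAtLeastN(It Begin, It End, std::size_t N) {
  for (; N != 0; --N, ++Begin)
    if (Begin == End)
      return false;
  return true;
}
```

For a check such as "does this block have a single predecessor?", the walk stops after at most two steps even when the underlying list is long.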
* [DAGCombine] SimplifyNodeWithTwoResults - ensure same legalization for LO/HI operands (PR21207) (Simon Pilgrim, 2018-11-19, 1 file, -8/+6)
| | | | | | | | | | Consistently use (!LegalOperations || isOperationLegalOrCustom) for all node pairs. Differential Revision: https://reviews.llvm.org/D53478 llvm-svn: 347255
* Fix Wdocumentation warning. NFCI. (Simon Pilgrim, 2018-11-19, 1 file, -1/+1)
| | | | llvm-svn: 347253
* Fix unused function warning. (Simon Pilgrim, 2018-11-19, 1 file, -6/+0)
| | | | llvm-svn: 347252
* [TargetLowering] expandFP_TO_UINT - improve fp16 support (Simon Pilgrim, 2018-11-19, 1 file, -10/+18)
| | | | | | | | | | As discussed on D53794, for float types with ranges smaller than the destination integer type, then we should be able to just use a regular FP_TO_SINT opcode. I thought we'd need to provide MSA test cases for very small integer types as well (fp16 -> i8 etc.), but it turns out that promotion will kick in so they're unnecessary. Differential Revision: https://reviews.llvm.org/D54703 llvm-svn: 347251
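The range argument can be checked with a few lines of arithmetic. The sketch below is an illustration rather than the TargetLowering logic: the function name is hypothetical, and the only external fact used is that the largest finite IEEE 754 binary16 value is 65504.

```cpp
#include <cassert>

// Largest finite IEEE 754 binary16 (fp16) value.
const double HalfMax = 65504.0;

// Returns true when every finite source value lies below 2^(DstBits - 1).
// In that case a signed conversion to a DstBits-wide integer already covers
// the whole non-negative range an unsigned conversion would need.
inline bool signedConvertSuffices(double SrcMax, unsigned DstBits) {
  assert(DstBits >= 1 && DstBits <= 64 && "unsupported integer width");
  double SignedLimit = 1.0;
  for (unsigned I = 1; I < DstBits; ++I)
    SignedLimit *= 2.0; // exact: powers of two are representable in double
  return SrcMax < SignedLimit;
}
```

So fp16 to a 32-bit unsigned integer fits the condition (65504 is below 2^31), while fp16 to a 16-bit integer or single-precision float to a 32-bit integer does not.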
* Add missing stream operator for Polynomial class to fix debug builds. (Simon Pilgrim, 2018-11-19, 1 file, -0/+7)
| | | | llvm-svn: 347249
* [X86][CostModel] Don't lookup intrinsic cost tables if the intrinsic isn't one we care about (Craig Topper, 2018-11-19, 1 file, -60/+64)
| | | | | | | | | | | | We're seeing some issues internally where we sent some intrinsics into the cost model that the getTypeLegalizationCost call fails on, but X86 specific tables don't care about. Our base class implementation takes care of them. We'd just like X86 backend to ignore them. This patch makes sure the switch returned something X86 cares about and skips the table lookups and type legalization call if not. Probably more efficient too since we don't go scanning the tables for every intrinsic we could possibly see. Differential Revision: https://reviews.llvm.org/D54711 llvm-svn: 347248