path: root/llvm/lib
Commit message | Author | Age | Files | Lines
...
* [WebAssembly] WebAssemblyLowerEmscriptenEHSjLj: use getter/setter for accessing tempRet0 | Sam Clegg | 2018-11-20 | 1 | -40/+45
  Rather than assuming that `tempRet0` exists in linear memory, only assume the getter/setter functions exist. This avoids conflicting with binaryen, which declares a wasm global for this purpose and defines its own getter and setter for it.
  The other advantage of doing things this way is that it leaves it up to the linker/finalizer to decide how to actually store this temporary. As it happens, binaryen uses a wasm global, which is more appropriate since it is thread safe.
  This also allows us to change the way this is stored in the future (memory, TLS memory, wasm global) without modifying LLVM.
  This is part of a 4 part change:
  LLVM: https://reviews.llvm.org/D53240
  fastcomp: https://github.com/kripken/emscripten-fastcomp/pull/237
  emscripten: https://github.com/kripken/emscripten/pull/7358
  binaryen: https://github.com/WebAssembly/binaryen/pull/1709
  Differential Revision: https://reviews.llvm.org/D53240
  llvm-svn: 347340
* [InstSimplify] fold funnel shifts with undef operands | Sanjay Patel | 2018-11-20 | 1 | -1/+10
  Splitting these off from D54666.
  Patch by: nikic (Nikita Popov)
  llvm-svn: 347332
* [InstructionSimplify] Add support for saturating add/sub | Sanjay Patel | 2018-11-20 | 1 | -0/+34
  Add support for saturating add/sub in InstructionSimplify. In particular, the following simplifications are supported:
    sat(X + 0) -> X
    sat(X + undef) -> -1
    sat(X uadd MAX) -> MAX
    (and commutative variants)
    sat(X - 0) -> X
    sat(X - X) -> 0
    sat(X - undef) -> 0
    sat(undef - X) -> 0
    sat(0 usub X) -> 0
    sat(X usub MAX) -> 0
  Patch by: @nikic (Nikita Popov)
  Differential Revision: https://reviews.llvm.org/D54532
  llvm-svn: 347330
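The unsigned identities listed above can be checked exhaustively on a small type. The following is a minimal sketch on 8-bit values, not the InstructionSimplify code itself; `uaddSat`, `usubSat`, and `identitiesHold` are hypothetical names introduced only for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Unsigned saturating add: clamps to the maximum value instead of wrapping.
uint8_t uaddSat(uint8_t a, uint8_t b) {
  unsigned sum = static_cast<unsigned>(a) + static_cast<unsigned>(b);
  return sum > 0xFF ? static_cast<uint8_t>(0xFF) : static_cast<uint8_t>(sum);
}

// Unsigned saturating subtract: clamps to zero instead of wrapping.
uint8_t usubSat(uint8_t a, uint8_t b) {
  return a < b ? static_cast<uint8_t>(0) : static_cast<uint8_t>(a - b);
}

// Checks the non-undef unsigned identities for every 8-bit X.
bool identitiesHold() {
  const uint8_t MAX = 0xFF;
  for (unsigned x = 0; x <= 0xFF; ++x) {
    const uint8_t X = static_cast<uint8_t>(x);
    if (uaddSat(X, 0) != X) return false;      // sat(X + 0) -> X
    if (uaddSat(X, MAX) != MAX) return false;  // sat(X uadd MAX) -> MAX
    if (uaddSat(MAX, X) != MAX) return false;  // commutative variant
    if (usubSat(X, 0) != X) return false;      // sat(X - 0) -> X
    if (usubSat(X, X) != 0) return false;      // sat(X - X) -> 0
    if (usubSat(0, X) != 0) return false;      // sat(0 usub X) -> 0
    if (usubSat(X, MAX) != 0) return false;    // sat(X usub MAX) -> 0
  }
  return true;
}
```

The undef cases are not modelled here; they rely on the optimizer being free to pick a convenient value for undef (for example MAX in the unsigned add case).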
* [ConstantFolding] Add support for saturating add/sub | Sanjay Patel | 2018-11-20 | 1 | -0/+12
  Support saturating add/sub in constant folding, based on the APInt methods introduced in D54332.
  Patch by: @nikic (Nikita Popov)
  Differential Revision: https://reviews.llvm.org/D54531
  llvm-svn: 347328
* [LoopSink] Add preheader to alias set | Guozhi Wei | 2018-11-20 | 1 | -0/+1
  This patch fixes PR39695.
  The original LoopSink only considers memory aliasing in the loop body, but PR39695 shows that instructions following the sink candidate in the preheader should also be checked. This is a conservative patch: it simply adds the whole preheader block to the alias set. It may lose some optimization opportunities, but I think that is very rare because:
  1. In the most common case, st/ld to the same address, the load should already be optimized away.
  2. Usually the preheader is not very large.
  Differential Revision: https://reviews.llvm.org/D54659
  llvm-svn: 347325
* [APInt] Add methods for saturated add and sub | Sanjay Patel | 2018-11-20 | 1 | -0/+36
  This adds the sadd_sat, uadd_sat, ssub_sat, usub_sat methods for performing saturating additions and subtractions to APInt. Split out from D54237.
  Patch by: nikic (Nikita Popov)
  Differential Revision: https://reviews.llvm.org/D54332
  llvm-svn: 347324
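For the signed variants, saturation clamps to both ends of the range. The sketch below shows the semantics on `int8_t` by computing in a wider type; it is an illustration only and does not reproduce how APInt implements these methods. The names `clampToI8`, `saddSat`, and `ssubSat` are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Clamp a wide intermediate result into the int8_t range.
static int8_t clampToI8(int wide) {
  if (wide > INT8_MAX) return INT8_MAX;
  if (wide < INT8_MIN) return INT8_MIN;
  return static_cast<int8_t>(wide);
}

// Signed saturating add: the sum is computed in `int`, so it cannot overflow.
int8_t saddSat(int8_t a, int8_t b) {
  return clampToI8(static_cast<int>(a) + static_cast<int>(b));
}

// Signed saturating subtract, using the same widening approach.
int8_t ssubSat(int8_t a, int8_t b) {
  return clampToI8(static_cast<int>(a) - static_cast<int>(b));
}
```

Widening is only practical for small fixed widths; an arbitrary-precision type has to detect overflow in some other way.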
* [DAGCombine] Add calls to SimplifyDemandedVectorElts from visitINSERT_SUBVECTOR (PR37989) | Simon Pilgrim | 2018-11-20 | 2 | -1/+5
  This uncovered an off-by-one typo in SimplifyDemandedVectorElts's INSERT_SUBVECTOR handling as its bounds check was bailing on safe indices.
  llvm-svn: 347313
* [PowerPC] Add Itineraries for STWU/STWUX etc | Jinsong Ji | 2018-11-20 | 15 | -54/+50
  When doing some instruction scheduling work, we noticed some missing itineraries. Before we switch to the machine scheduler, those missing itineraries might not affect actual scheduling, because we still get the same latency from the default values.
  With the machine scheduler, however, itineraries do affect scheduling. For example, NumMicroOps defaults to 0 if there are no itineraries for a specific instruction class, while most instruction classes with itineraries have NumMicroOps default to 1.
  This affects the count of RetiredMOps and the Pending/Available queues, which in turn causes different or suboptimal scheduling.
  This patch is for STWU/STWUX (IIC_LdStStoreUpd) for P8.
  Since there are already multiple IICs for store update, this patch also merges IIC_LdStSTDU/IIC_LdStStoreUpd into IIC_LdStSTU and IIC_LdStSTDUX into IIC_LdStSTUX, and we add a new testcase in https://reviews.llvm.org/D54699 to show the difference.
  Differential Revision: https://reviews.llvm.org/D54700
  llvm-svn: 347311
* Fix MSVC 'truncation of constant value' warning. NFCI. | Simon Pilgrim | 2018-11-20 | 1 | -1/+1
  llvm-svn: 347308
* [X86][SSE] Add computeKnownBits/ComputeNumSignBits support for PACKSS/PACKUS instructions. | Simon Pilgrim | 2018-11-20 | 1 | -26/+53
  Pull out getPackDemandedElts demanded elts remapping helper from computeKnownBitsForTargetNode and use in computeKnownBits/ComputeNumSignBits.
  llvm-svn: 347303
* [X86][SSE] XFormVExtractWithShuffleIntoLoad - getVectorShuffle won't accept SM_SentinelZero | Simon Pilgrim | 2018-11-20 | 1 | -2/+6
  Noticed while working on improving demanded elts target shuffle combining.
  llvm-svn: 347302
* [TargetLowering] Improve SimplifyDemandedVectorElts/SimplifyDemandedBits support | Simon Pilgrim | 2018-11-20 | 1 | -0/+17
  For bitcast nodes from larger element types, add the ability for SimplifyDemandedVectorElts to call SimplifyDemandedBits by merging the elts mask to a bits mask.
  I've raised https://bugs.llvm.org/show_bug.cgi?id=39689 to deal with the few places where SimplifyDemandedBits's lack of vector handling is a problem.
  Differential Revision: https://reviews.llvm.org/D54679
  llvm-svn: 347301
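One way to read "merging the elts mask to a bits mask" is sketched below for a bitcast such as v2i64 to v4i32: each demanded narrow element maps to a bit range inside the wider source element it came from. This is an illustration under stated assumptions, not the TargetLowering code: it assumes a little-endian lane layout, at most 32 result elements, and source elements of at most 64 bits, and `demandedSrcBits` is a hypothetical helper.

```cpp
#include <cassert>
#include <cstdint>

// Merge a demanded-elements mask (one bit per narrow result element) into a
// demanded-bits mask for the wider source element type.
// `scale` is the number of narrow elements per source element.
// Assumes little-endian layout and scale * eltBits <= 64.
uint64_t demandedSrcBits(uint32_t demandedElts, unsigned numElts,
                         unsigned eltBits, unsigned scale) {
  const uint64_t eltMask =
      eltBits >= 64 ? ~0ULL : ((1ULL << eltBits) - 1);
  uint64_t bits = 0;
  for (unsigned i = 0; i < numElts; ++i) {
    if (!(demandedElts & (1u << i)))
      continue;  // this narrow element is not demanded
    // Position of narrow element i inside its wide source element.
    const unsigned offset = (i % scale) * eltBits;
    bits |= eltMask << offset;
  }
  return bits;
}
```

With four i32 result elements over i64 sources, demanding only elements 0 and 2 yields the low 32 bits, so a bit-level simplification could treat the upper half of every source element as unused.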
* [X86][SSE] Lower immediately to PACKUS instead of VECTOR_SHUFFLE. | Simon Pilgrim | 2018-11-20 | 1 | -12/+4
  As discussed on rL347240, this avoids some regressions on D54679 and also helps some combines to kick in a bit earlier.
  llvm-svn: 347300
* [X86][SSE] Add SimplifyDemandedVectorElts support for PACKSS/PACKUS instructions. | Simon Pilgrim | 2018-11-20 | 1 | -0/+30
  As discussed on rL347240.
  llvm-svn: 347299
* [X86] Preserve undef information when creating a punpckl/hbw from a v16i8 where all the even or odd elements are undef. | Craig Topper | 2018-11-20 | 1 | -0/+4
  Previously if V2 was unused we ended up using V1 for both inputs as part of the code that follows the new code. By using lowerVectorShuffleWithUNPCK we keep the undef nature of V2 in the output. As near as I can tell this makes v16i8 behavior consistent with every other VT now.
  This does mean that we give the register allocator freedom to fill in random registers now and create false dependencies. But like I said we're already doing that for other types.
  llvm-svn: 347296
* [X86] Add custom type legalization for v8i8->v8i32 sign extend pre-SSE4.1 | Craig Topper | 2018-11-20 | 1 | -0/+33
  This helps with a future patch and makes us less reliant on DAG combine merging shuffles.
  llvm-svn: 347295
* [X86] Replace more calls to getZeroVector with regular getConstant. | Craig Topper | 2018-11-20 | 1 | -23/+20
  getZeroVector produces a specifically canonicalized zero vector, but we can just let DAG legalization take care of it.
  The test changes are because MULH lowering happens later than it should and this change gave us the opportunity to constant fold away a multiply during a DAG combine before the build_vector got legalized with a bitcast.
  llvm-svn: 347290
* Recommit "[LoopSimplifyCFG] Teach LoopSimplifyCFG to constant-fold branches and switches" | Max Kazantsev | 2018-11-20 | 1 | -0/+315
  The initial version of the patch lacked Phi node updates in the destinations of removed edges. This version contains this update and tests for this situation.
  Differential Revision: https://reviews.llvm.org/D54021
  llvm-svn: 347289
* [PowerPC] Don't combine to bswap store on 1-byte truncating store | Nemanja Ivanovic | 2018-11-20 | 1 | -2/+3
  Turns out that there was no check for a store that truncates down to a single byte when combining a (store (bswap...)) into a byte-swapping store. This patch just adds that check.
  Fixes https://bugs.llvm.org/show_bug.cgi?id=39478.
  llvm-svn: 347288
* [SelectionDAG] Compute known bits and num sign bits for live out vector registers. Use it to add AssertZExt/AssertSExt in the live in basic blocks | Craig Topper | 2018-11-20 | 2 | -4/+4
  Summary: We already support this for scalars, but it was explicitly disabled for vectors. In the updated test cases this allows us to see that the upper bits are zero, so we can use fewer multiply instructions to emulate a 64 bit multiply.
  This should help with this ispc issue that a coworker pointed me to: https://github.com/ispc/ispc/issues/1362
  Reviewers: spatel, efriedma, RKSimon, arsenm
  Reviewed By: spatel
  Subscribers: wdng, llvm-commits
  Differential Revision: https://reviews.llvm.org/D54725
  llvm-svn: 347287
* [ExecutionEngine][Interpreter] Fix out-of-bounds array access. | Lang Hames | 2018-11-20 | 1 | -1/+2
  If args is empty then accessing element 0 is illegal.
  https://reviews.llvm.org/D53556
  Patch by Eugene Sharygin. Thanks Eugene!
  llvm-svn: 347281
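The class of bug described here is taking the address of element 0 of a container that may be empty. A minimal sketch of a guard follows; it is not the Interpreter change itself, and `argsPointer` is a hypothetical helper.

```cpp
#include <cassert>
#include <vector>

// Returns a pointer to the first argument, or nullptr when there are none.
// Writing `&args[0]` unconditionally would index an empty vector, which is
// undefined behaviour.
const int *argsPointer(const std::vector<int> &args) {
  return args.empty() ? nullptr : &args[0];
}
```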
* [DAGCombiner] reduce code duplication in visitXOR; NFC | Sanjay Patel | 2018-11-20 | 1 | -32/+29
  llvm-svn: 347278
* [WebAssembly] Remove unused function return types (NFC) | Heejin Ahn | 2018-11-20 | 1 | -6/+4
  Reviewers: sbc100
  Subscribers: dschuff, jgravelle-google, sunfish, llvm-commits
  Differential Revision: https://reviews.llvm.org/D54734
  llvm-svn: 347277
* [CodeView] Don't print PointerAttributes when dumping. | Zachary Turner | 2018-11-20 | 1 | -1/+0
  PointerAttributes is a bitwise-or of several other fields, each of which is already printed on its own line with a better explanation. So this doesn't really help much.
  llvm-svn: 347275
* Implement computeKnownBits for scalar_to_vector | Stanislav Mekhanoshin | 2018-11-19 | 1 | -0/+13
  Differential Revision: https://reviews.llvm.org/D54728
  llvm-svn: 347274
* [Transforms] Prefer static and avoid namespaces, NFC | Reid Kleckner | 2018-11-19 | 1 | -10/+6
  Put 'static' on three functions in an anonymous namespace as per our coding style.
  Remove the 'namespace llvm {}' around the .cpp file and explicitly declare the free function 'llvm::optimizeGlobalCtorsList' in 'llvm::'. I prefer this style for free functions because the compiler will error out if the .h and .cpp files don't agree on the function name or prototype.
  llvm-svn: 347269
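The style described in this commit can be sketched in a single translation unit. The namespace `sketch` and the functions `countPositive` and `isPositive` are hypothetical stand-ins, not names from the patch.

```cpp
#include <cassert>

// What a header would contain: the declaration inside the namespace.
namespace sketch {
int countPositive(const int *data, int n);
}

// File-local helper: `static` rather than an anonymous namespace.
static bool isPositive(int v) { return v > 0; }

// Definition with an explicit qualifier instead of reopening the namespace.
// If the name or prototype did not match the declaration above, this line
// would fail to compile instead of silently defining a new function.
int sketch::countPositive(const int *data, int n) {
  int count = 0;
  for (int i = 0; i < n; ++i)
    if (isPositive(data[i]))
      ++count;
  return count;
}
```

Had the definition been written inside a reopened `namespace sketch { ... }` block, a mismatched signature would declare a second overload and the mistake would only surface at link time.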
* [X86] Rename combineVSZext->combineExtendVectorInreg. NFC | Craig Topper | 2018-11-19 | 1 | -4/+4
  Now that we no longer have target specific vector extend nodes let's make the function name match the nodes we do use.
  llvm-svn: 347268
* AMDGPU: Fix V_FMA_F16 selection on GFX9 | Konstantin Zhuravlyov | 2018-11-19 | 1 | -2/+8
  GFX9 should select opsel version.
  Differential Revision: https://reviews.llvm.org/D54545
  llvm-svn: 347265
* Revert "[LoopSimplifyCFG] Teach LoopSimplifyCFG to constant-fold branches and switches" | Benjamin Kramer | 2018-11-19 | 1 | -313/+0
  This reverts commits r347183 & r347184. Crashes while building libxml.
  llvm-svn: 347260
* [AMDGPU] Restored selection of scalar_to_vector (v2x16) | Stanislav Mekhanoshin | 2018-11-19 | 1 | -9/+9
  This works if DAG combiner is enabled, but without combining we cannot select scalar_to_vector of <2 x half> and <2 x i16>.
  Differential Revision: https://reviews.llvm.org/D54718
  llvm-svn: 347259
* [InstCombine] Set debug loc on `mergeStoreIntoSuccessor` phi | Vedant Kumar | 2018-11-19 | 1 | -2/+6
  Assigning a merged debug location to the `mergeStoreIntoSuccessor` phi improves backtrace quality.
  Fixes llvm.org/PR38083.
  llvm-svn: 347257
* [IR] Add hasNPredecessors, hasNPredecessorsOrMore to BasicBlock | Vedant Kumar | 2018-11-19 | 8 | -22/+19
  Add methods to BasicBlock which make it easier to efficiently check whether a block has N (or more) predecessors.
  This can be more efficient than using pred_size(), which is a linear time operation.
  We might consider adding similar methods for successors. I haven't done so in this patch because succ_size() is already O(1).
  With this patch applied, I measured a 0.065% compile-time reduction in user time for running `opt -O3` on the sqlite3 amalgamation (30 trials). The change in mergeStoreIntoSuccessor alone saves 45 million linked list iterations in a stage2 Release build of llc.
  See llvm.org/PR39702 for a harder but more general way of achieving similar results.
  Differential Revision: https://reviews.llvm.org/D54686
  llvm-svn: 347256
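The saving comes from stopping the walk as soon as the answer is known rather than counting the whole list. Below is a generic sketch of that idea over any iterator range; it is not the BasicBlock implementation, and `hasExactly` and `hasAtLeast` are hypothetical names.

```cpp
#include <cassert>
#include <forward_list>

// True if the range [begin, end) has at least n elements.
// Advances at most n times, regardless of the range's length.
template <typename It>
bool hasAtLeast(It begin, It end, unsigned n) {
  for (; n > 0; --n, ++begin)
    if (begin == end)
      return false;  // ran out before seeing n elements
  return true;
}

// True if the range [begin, end) has exactly n elements.
// Advances at most n times, then checks that nothing is left.
template <typename It>
bool hasExactly(It begin, It end, unsigned n) {
  for (; n > 0; --n, ++begin)
    if (begin == end)
      return false;  // fewer than n elements
  return begin == end;
}
```

For a check such as "exactly two predecessors" on a long list, this touches a handful of nodes where a full size computation would visit every one.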
* [DAGCombine] SimplifyNodeWithTwoResults - ensure same legalization for LO/HI operands (PR21207) | Simon Pilgrim | 2018-11-19 | 1 | -8/+6
  Consistently use (!LegalOperations || isOperationLegalOrCustom) for all node pairs.
  Differential Revision: https://reviews.llvm.org/D53478
  llvm-svn: 347255
* Fix Wdocumentation warning. NFCI. | Simon Pilgrim | 2018-11-19 | 1 | -1/+1
  llvm-svn: 347253
* Fix unused function warning. | Simon Pilgrim | 2018-11-19 | 1 | -6/+0
  llvm-svn: 347252
* [TargetLowering] expandFP_TO_UINT - improve fp16 support | Simon Pilgrim | 2018-11-19 | 1 | -10/+18
  As discussed on D53794, for float types with ranges smaller than the destination integer type, we should be able to just use a regular FP_TO_SINT opcode.
  I thought we'd need to provide MSA test cases for very small integer types as well (fp16 -> i8 etc.), but it turns out that promotion will kick in so they're unnecessary.
  Differential Revision: https://reviews.llvm.org/D54703
  llvm-svn: 347251
* Add missing stream operator for Polynomial class to fix debug builds. | Simon Pilgrim | 2018-11-19 | 1 | -0/+7
  llvm-svn: 347249
* [X86][CostModel] Don't lookup intrinsic cost tables if the intrinsic isn't one we care about | Craig Topper | 2018-11-19 | 1 | -60/+64
  We're seeing some issues internally where we sent some intrinsics into the cost model that the getTypeLegalizationCost call fails on, but X86 specific tables don't care about. Our base class implementation takes care of them. We'd just like X86 backend to ignore them.
  This patch makes sure the switch returned something X86 cares about and skips the table lookups and type legalization call if not.
  Probably more efficient too since we don't go scanning the tables for every intrinsic we could possibly see.
  Differential Revision: https://reviews.llvm.org/D54711
  llvm-svn: 347248
* [X86][SSE] Remove unnecessary bit-and in pshufb vector ctlz (PR39703) | Simon Pilgrim | 2018-11-19 | 1 | -2/+1
  SSE PSHUFB vector ctlz lowering works at the i4 nibble level. As detailed in PR39703, we were masking the lower nibble off but we only actually use it in the case where the upper nibble is known to be zero, making it safe to remove the mask and save an instruction.
  Differential Revision: https://reviews.llvm.org/D54707
  llvm-svn: 347242
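The reasoning can be seen in a scalar model of the nibble-table approach. This is a sketch of the arithmetic only, not the SSE lowering; `kNibbleCtlz` and `ctlz8` are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>

// Leading-zero count of a 4-bit value, indexed by that value.
// Entry 0 is 4 because all four bits are zero.
static const uint8_t kNibbleCtlz[16] = {4, 3, 2, 2, 1, 1, 1, 1,
                                        0, 0, 0, 0, 0, 0, 0, 0};

// Count leading zeros of a byte using two nibble lookups.
unsigned ctlz8(uint8_t x) {
  const uint8_t hi = static_cast<uint8_t>(x >> 4);
  if (hi != 0)
    return kNibbleCtlz[hi];  // the low nibble is irrelevant here
  // hi == 0 means x < 16, so x already equals its own low nibble and
  // an explicit `x & 0xF` would be redundant.
  return 4u + kNibbleCtlz[x];
}
```

The low-nibble lookup only contributes when the upper nibble is zero, which is exactly the case where the unmasked byte is already a valid table index.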
* [InterleavedLoadCombine] Fix warnings | Martin Elshuber | 2018-11-19 | 1 | -6/+1
  * remove unused function
  * fix compare
  llvm-svn: 347241
* [X86] Attempt to improve v32i8/v64i8 multiply lowering by applying the v16i8 non-avx2 algorithm to each 128-bit lane. | Craig Topper | 2018-11-19 | 1 | -15/+13
  Previously we split the vectors in half to allow the two halves to be any-extended, then concatenated the results back together.
  This patch instead extends the v16i8 sse algorithm to extend half of each 128-bit lane using punpcklbw/punpckhbw. It multiplies all the low half lanes and high half lanes together in separate operations, then merges the half lane results back together using packuswb.
  Unfortunately, some of the cases in vector-reduce-mul.ll regress because we aren't narrowing the vector width of the multiplies as we reduce. The splitting was somewhat making up for that before by causing halves to be discarded after the split.
  Differential Revision: https://reviews.llvm.org/D54668
  llvm-svn: 347240
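A scalar model of the per-lane scheme (widen each half to 16-bit elements, multiply, narrow back to bytes) is sketched below for one 128-bit lane. It is a simplification, not the X86 lowering: the widening is modelled as zero extension, and the names `Bytes`, `Words`, `widenHalf`, and `mulBytes` are hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

using Bytes = std::array<uint8_t, 16>;   // one 128-bit lane of i8
using Words = std::array<uint16_t, 8>;   // the same lane width as i16

// Widen eight consecutive bytes starting at `first` to 16-bit elements,
// standing in for the unpack-low / unpack-high step.
static Words widenHalf(const Bytes &v, unsigned first) {
  Words w{};
  for (unsigned i = 0; i < 8; ++i)
    w[i] = v[first + i];
  return w;
}

// Byte-wise multiply of one lane: multiply each half at 16-bit width,
// then keep only the low 8 bits of every product when narrowing.
Bytes mulBytes(const Bytes &a, const Bytes &b) {
  Bytes out{};
  for (unsigned half = 0; half < 2; ++half) {
    const Words wa = widenHalf(a, half * 8);
    const Words wb = widenHalf(b, half * 8);
    for (unsigned i = 0; i < 8; ++i) {
      const uint16_t prod = static_cast<uint16_t>(wa[i] * wb[i]);
      // Masking to 8 bits first means a saturating pack cannot clamp.
      out[half * 8 + i] = static_cast<uint8_t>(prod & 0xFF);
    }
  }
  return out;
}
```

Only the low 8 bits of each 16-bit product matter for an i8 multiply, which is why the contents of the upper byte after widening do not affect the result.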
* [DebugInfo] DISubprogram flags get their own flags word. NFC. | Paul Robinson | 2018-11-19 | 7 | -87/+84
  This will hold flags specific to subprograms. In the future we could potentially free up scarce bits in DIFlags by moving subprogram-specific flags from there to the new flags word.
  This patch does not change IR/bitcode formats, that will be done in a follow-up.
  Differential Revision: https://reviews.llvm.org/D54597
  llvm-svn: 347239
* [AMDGPU] Fix -Wunused-variable | Fangrui Song | 2018-11-19 | 1 | -1/+0
  llvm-svn: 347234
* [AMDGPU] Convert insert_vector_elt into set of selects | Stanislav Mekhanoshin | 2018-11-19 | 2 | -0/+41
  This avoids scratch use or indirect VGPR addressing for small vectors.
  Differential Revision: https://reviews.llvm.org/D54606
  llvm-svn: 347231
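The transformation named in the title can be modelled in scalar code: an insert at a runtime index becomes one compare-and-select per lane, so no lane is ever addressed through the index. This is a sketch of the idea only, not the AMDGPU lowering, and `insertViaSelects` is a hypothetical name.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Insert `val` at runtime index `idx` without indexed addressing:
// every lane independently selects between the new and the old value.
template <typename T, std::size_t N>
std::array<T, N> insertViaSelects(std::array<T, N> vec, T val,
                                  std::size_t idx) {
  for (std::size_t i = 0; i < N; ++i)
    vec[i] = (idx == i) ? val : vec[i];  // one compare + select per lane
  return vec;
}
```

The cost grows linearly with the number of lanes, which fits the commit's restriction to small vectors.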
* [InterleavedLoadCombine] Fix warning unused variable | Martin Elshuber | 2018-11-19 | 1 | -2/+0
  Differential Revision: https://reviews.llvm.org/D52653
  llvm-svn: 347229
* [WebAssembly] replaced .param/.result by .functype | Wouter van Oortmerssen | 2018-11-19 | 8 | -140/+121
  Summary: This makes it easier/cleaner to generate a single signature from this directive. Also:
  - Adds the symbol name, such that we don't depend on the location of this directive anymore.
  - Actually constructs the signature in the assembler, and make the assembler own it.
  - Refactor the use of MVT vs ValType in the streamer and assembler to require less conversions overall.
  - Changed 700 or so tests to use it.
  Reviewers: sbc100, dschuff
  Subscribers: jgravelle-google, eraman, aheejin, sunfish, jfb, llvm-commits
  Differential Revision: https://reviews.llvm.org/D54652
  llvm-svn: 347228
* [SelectionDAG] simplify vector select with undef operand(s) | Sanjay Patel | 2018-11-19 | 2 | -5/+12
  llvm-svn: 347227
* [InterleavedLoadCombine] Remove unused include. NFC. | Benjamin Kramer | 2018-11-19 | 1 | -1/+0
  llvm-svn: 347226
* Revert "[LICM] Make LICM able to hoist phis" | Benjamin Kramer | 2018-11-19 | 1 | -302/+15
  This reverts commit r347190.
  llvm-svn: 347225
* [AMDGPU] Derive GCNSubtarget from MF to get overridden target features | David Stuttard | 2018-11-19 | 1 | -2/+2
  Summary: AMDGPUAsmPrinter has a getSTI function that derives a GCNSubtarget from the TM. However, this means that overridden target features are not detected and can result in incorrect behaviour.
  Switch to using STM which is a GCNSubtarget derived from the MF (used elsewhere in the same function).
  Change-Id: Ib6328ad667b7fcdc87e9c06344e59859207db9b0
  Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, tpr, t-tye, llvm-commits
  Differential Revision: https://reviews.llvm.org/D54301
  llvm-svn: 347221