path: root/llvm/lib
Commit message | Author | Age | Files | Lines
...
* [X86] Fold (movmsk (setne (and X, (1 << C)), 0)) -> (movmsk (X << C)) | Craig Topper | 2018-09-15 | 1 | -0/+26
  Summary: MOVMSK only cares about the sign bit, so we don't need the setcc to fill the whole element with 0s/1s. We can just shift the bit we're looking for into the sign bit. This saves a constant pool load. Inspired by PR38840.
  Reviewers: RKSimon, spatel
  Reviewed By: RKSimon
  Subscribers: lebedev.ri, llvm-commits
  Differential Revision: https://reviews.llvm.org/D52121
  llvm-svn: 342326
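The equivalence behind this fold can be checked with a small model. The following is only a sketch, not the backend code: it assumes 8-bit lanes, and the helper names `movmsk`, `via_setcc`, and `via_shift` are made up for illustration. In this model, moving bit `c` into the sign position takes a left shift by `LANE_BITS - 1 - c`.

```python
LANE_BITS = 8                      # assumption: model 8-bit vector lanes
ALL_ONES = (1 << LANE_BITS) - 1


def movmsk(lanes):
    """Collect the sign (top) bit of every lane into an integer mask."""
    mask = 0
    for i, lane in enumerate(lanes):
        mask |= ((lane >> (LANE_BITS - 1)) & 1) << i
    return mask


def via_setcc(lanes, c):
    """movmsk(setne(and(X, 1 << c), 0)): each lane becomes all-ones or zero."""
    return movmsk([ALL_ONES if lane & (1 << c) else 0 for lane in lanes])


def via_shift(lanes, c):
    """Shift bit c of each lane into the sign position, then take movmsk."""
    shift = LANE_BITS - 1 - c
    return movmsk([(lane << shift) & ALL_ONES for lane in lanes])
```

Both paths yield the same mask because `movmsk` never looks below the top bit of a lane, so the comparison and its constant operand are not needed.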
* [InstCombine][x86] try harder to convert blendv intrinsic to generic IR (PR38814) | Sanjay Patel | 2018-09-15 | 1 | -7/+15
  Missing optimizations with blendv are shown in: https://bugs.llvm.org/show_bug.cgi?id=38814
  If this works, it's an easier and more powerful solution than adding pattern matching for a few special cases in the backend. The potential danger with this transform in IR is that the condition value can get separated from the select, and the backend might not be able to make a blendv out of it again. I don't think that's too likely, but I've kept this patch minimal with a 'TODO', so we can test that theory in the wild before expanding the transform.
  Differential Revision: https://reviews.llvm.org/D52059
  llvm-svn: 342324
* [InstCombine] Inefficient pattern for high-bits checking 3 (PR38708) | Roman Lebedev | 2018-09-15 | 1 | -6/+11
  Summary: It is sometimes important to check that some newly-computed value is non-negative and only n bits wide (where n is a variable). There are many ways to check that: https://godbolt.org/z/o4RB8D
  The last variant seems best? (I'm sure there are some other variations I haven't thought of.)
  The last (as far as I know?) pattern, non-canonical due to the extra use.
  https://godbolt.org/z/aCMsPk
  https://rise4fun.com/Alive/I6f
  https://bugs.llvm.org/show_bug.cgi?id=38708
  Reviewers: spatel, craig.topper, RKSimon
  Reviewed By: spatel
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D52062
  llvm-svn: 342321
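To make the "many ways to check" concrete, here is a hedged sketch of three equivalent formulations on a modelled 32-bit unsigned value. These are generic variants chosen for illustration and are not claimed to be the exact patterns matched by the patch; the function names are hypothetical.

```python
BITS = 32                     # assumption: 32-bit values
MASK = (1 << BITS) - 1


def check_compare(x, n):
    """x fits in n bits if it is below 2**n."""
    return x < (1 << n)


def check_shift(x, n):
    """x fits in n bits if nothing remains after shifting n bits out."""
    return (x >> n) == 0


def check_mask(x, n):
    """x fits in n bits if no bit at position n or above is set."""
    high_bits = (MASK << n) & MASK
    return (x & high_bits) == 0
```

For `0 <= n < BITS` all three agree, and a value whose top bit is set (negative when read as signed) fails each of them.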
* [CodeGenPrepare] Preserve debug locs in OptimizeExtractBits | Vedant Kumar | 2018-09-15 | 1 | -1/+6
  CodeGenPrepare has a transform that sinks {lshr, trunc} pairs to make it easier for the backend to emit fancy extract-bits instructions (e.g. UBFX). Teach it to preserve debug locations and salvage debug values.
  llvm-svn: 342319
* [WebAssembly] SIMD shifts | Thomas Lively | 2018-09-15 | 1 | -0/+26
  Summary: Implement shifts of vectors by i32. Since LLVM defines shifts as binary operations between two vectors, this involves pattern matching on splatted shift operands. For v2i64 shifts any i32 shift operands have to be zero extended in the input and any i64 shift operands have to be wrapped in the output. Depends on D52007.
  Reviewers: aheejin, dschuff
  Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits
  Differential Revision: https://reviews.llvm.org/D51906
  llvm-svn: 342302
* [WebAssembly] SIMD neg | Thomas Lively | 2018-09-14 | 1 | -0/+30
  Summary: Depends on D52007.
  Reviewers: aheejin, dschuff
  Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits
  Differential Revision: https://reviews.llvm.org/D52009
  llvm-svn: 342296
* [BreakFalseDeps] Fix bad formatting. NFC | Craig Topper | 2018-09-14 | 1 | -1/+1
  llvm-svn: 342293
* [InstCombine] refactor mul narrowing folds; NFCI | Sanjay Patel | 2018-09-14 | 4 | -112/+60
  Similar to rL342278: The test diffs are all cosmetic due to the change in value naming, but I'm including that to show that the new code does perform these folds rather than something else in instcombine.
  D52075 should be able to use this code too rather than duplicating all of the logic.
  llvm-svn: 342292
* [InstCombine] add/use overflowing math helper functions; NFC | Sanjay Patel | 2018-09-14 | 2 | -3/+19
  The mul case can already be refactored to use this similar to rL342278. The sub case is proposed in D52075.
  llvm-svn: 342289
* [PowerPC] Fix the calling convention for i1 arguments on PPC32 | Lion Yang | 2018-09-14 | 1 | -5/+15
  Summary: Integer types smaller than i32 must be extended to i32 by default. The feature "crbits" introduced at r202451 handles i1 as a special case, but it did not extend properly. The caller was, therefore, passing i1 stack arguments by writing 0/1 to the first byte of the 4-byte stack object and the callee was reading the first byte for the value.
  "crbits" is enabled if the optimization level is greater than 1, which is very common in "release builds". Such discrepancies with the ABI specification also introduce potential incompatibility with programs or libraries built with other compilers, e.g. GCC.
  Fixes PR38661
  Reviewers: hfinkel, cuviper
  Subscribers: sylvestre.ledru, glaubitz, nagisa, nemanjai, kbarton, llvm-commits
  Differential Revision: https://reviews.llvm.org/D51108
  llvm-svn: 342288
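The incompatibility described above can be illustrated with a byte-level model of a 4-byte stack slot. This is a sketch under stated assumptions: PPC32 is treated as big-endian, and the function names are hypothetical, not taken from the patch.

```python
import struct


def store_i1_first_byte(value):
    """Old behaviour as described: write 0/1 to the first byte of the slot."""
    slot = bytearray(4)
    slot[0] = value
    return bytes(slot)


def store_i1_extended(value):
    """Value extended to i32 and stored as a big-endian word."""
    return struct.pack(">I", value)


def load_word(slot):
    """A callee that reads the whole 4-byte slot as a big-endian i32."""
    return struct.unpack(">I", slot)[0]


def load_first_byte(slot):
    """A callee that reads only the first byte of the slot."""
    return slot[0]
```

A caller and callee that agree on the first-byte scheme work together, but mixing the two schemes yields `0x01000000` or `0` where `1` was intended.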
* [codeview] Remove dead code | Reid Kleckner | 2018-09-14 | 2 | -17/+0
  llvm-svn: 342285
* [PDB] Refactor a little of the Symbol creation code. | Zachary Turner | 2018-09-14 | 3 | -28/+16
  Eventually we need to be able to support nested types, which don't have an associated CVType record. To handle this, remove the CVType from all of the record classes, and instead store the deserialized record. Then move the deserialization up to the thing that creates the type. This actually makes error handling better anyway as we can return an invalid symbol instead of asserting false.
  llvm-svn: 342284
* [SampleFDO] Add FunctionOffsetTable in compact binary format profile. | Wei Mi | 2018-09-14 | 4 | -13/+158
  The patch saves a function offset table which maps a function name index to the offset of its function profile from the start of the binary profile. With the function offset table, the profile reader doesn't have to read function profiles that will not be used when compiling a module. For profile sizes around 10~20M, it saves ~10% compile time.
  Differential Revision: https://reviews.llvm.org/D51863
  llvm-svn: 342283
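The idea of an offset table that lets a reader skip unneeded profiles can be sketched as follows. The byte layout here (a count, then `(name_index, offset)` pairs, then length-prefixed payloads) is invented purely for illustration and is not the actual compact binary profile format; `write_profile` and `read_selected` are hypothetical names.

```python
import struct


def write_profile(profiles):
    """Serialize {name_index: payload bytes} with a leading offset table."""
    header_size = 4 + 8 * len(profiles)
    table = b""
    body = b""
    for name_index, payload in profiles.items():
        offset = header_size + len(body)   # offset from the start of the blob
        table += struct.pack("<II", name_index, offset)
        body += struct.pack("<I", len(payload)) + payload
    return struct.pack("<I", len(profiles)) + table + body


def read_selected(blob, wanted):
    """Decode only the profiles whose name index is in `wanted`."""
    (count,) = struct.unpack_from("<I", blob, 0)
    result = {}
    for i in range(count):
        name_index, offset = struct.unpack_from("<II", blob, 4 + 8 * i)
        if name_index not in wanted:
            continue                       # payload is never touched
        (size,) = struct.unpack_from("<I", blob, offset)
        result[name_index] = blob[offset + 4:offset + 4 + size]
    return result
```

Only the small table is scanned in full; payloads for functions outside `wanted` are skipped entirely, which is where the time saving would come from.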
* [InstCombine] refactor add narrowing folds; NFCI | Sanjay Patel | 2018-09-14 | 2 | -71/+44
  The test diffs are all cosmetic due to the change in value naming, but I'm including that to show that the new code does perform these folds rather than something else in instcombine.
  llvm-svn: 342278
* HotColdSplit: fix invalid SSA due to outlining | Sebastian Pop | 2018-09-14 | 1 | -15/+16
  The test used to fail with an invalid phi node: the two predecessors were outlined and the SSA representation was left invalid. The patch adds the exit block to the cold region.
  llvm-svn: 342277
* HotColdSplit: fix isSingleEntrySingleExit | Sebastian Pop | 2018-09-14 | 1 | -10/+6
  Remove duplicate entries from isSingleEntrySingleExit: the Entry block is already added by the loop over the dominance frontier.
  Remove the heuristic from isOutlineCandidate that a region is too small when it only contains a basic block. With this change we now grow regions starting from a block and we continue adding to the ValidColdRegion. Check the heuristic just before code generation.
  llvm-svn: 342276
* HotColdSplit: add back propagation to extend cold regions | Sebastian Pop | 2018-09-14 | 1 | -18/+64
  Also fix a problem in forward propagation: const TerminatorInst *TI = It->getTerminator(); was set outside the while loop that iterates over It.
  llvm-svn: 342275
* Remove unused DIASession field | Reid Kleckner | 2018-09-14 | 2 | -3/+2
  llvm-svn: 342272
* AMDGPU: Clear the bits before they are being set in program resource registers | Konstantin Zhuravlyov | 2018-09-14 | 1 | -0/+1
  Change by Tony Tye
  llvm-svn: 342270
* Revert r342183 "[DAGCombine] Fix crash when store merging created an extract_subvector with invalid index." | Reid Kleckner | 2018-09-14 | 1 | -8/+1
  Causes 'isVector() && "Invalid vector type!"' assertion when building Skia in Chrome.
  llvm-svn: 342265
* Fix debug info for SelectionDAG legalization of DAG nodes with two results. | Adrian Prantl | 2018-09-14 | 1 | -1/+1
  This patch fixes the debug info handling for SelectionDAG legalization of DAG nodes with two results. When a replaced SDNode has more than one result, transferDbgValues was always copying the SDDbgValue from the first result and attaching it to all members. In reality SelectionDAG::ReplaceAllUsesWith() is given an array of SDNodes (though the type signature doesn't make this obvious; cf. the call site code in ReplaceNode()).
  rdar://problem/44162227
  Differential Revision: https://reviews.llvm.org/D52112
  llvm-svn: 342264
* [ThinLTOCodeGenerator] Avoid Rehash StringMap in ThreadPool | Steven Wu | 2018-09-14 | 1 | -5/+9
  Summary: During threaded thinLTO, it is possible that the entry for the current module doesn't exist in the StringMaps (like ExportLists, ResolvedODR, etc.). Using operator[] might trigger a rehash of the StringMap, which might happen on multiple threads at the same time.
  rdar://problem/43846199
  Reviewers: tejohnson, mehdi_amini, kromanova, pcc
  Reviewed By: tejohnson
  Subscribers: dang, inglorion, eraman, dexonsmith, llvm-commits
  Differential Revision: https://reviews.llvm.org/D52049
  llvm-svn: 342263
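One general way to avoid inserting into a shared map from worker threads is to create every entry up front and restrict the workers to read-only lookups. The sketch below shows that pattern with a Python dict as a loose analogy; it is not the ThinLTO code, Python's threading model differs from C++, and `process_modules` and `export_lists` are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor


def process_modules(module_ids, export_lists):
    """Count exports per module without mutating the map from worker threads."""
    # Single-threaded phase: make sure every module already has an entry,
    # so the map never needs to grow once the workers start.
    for module_id in module_ids:
        export_lists.setdefault(module_id, [])

    def worker(module_id):
        # Plain lookup on an existing key: reads only, never inserts.
        return module_id, len(export_lists[module_id])

    with ThreadPoolExecutor(max_workers=4) as pool:
        return dict(pool.map(worker, module_ids))
```

The design choice is to move all insertion before the parallel phase, so that any inserting accessor on the hot path is replaced by a lookup that cannot change the container.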
* Revert r342210 "[ARM] bottom-top mul support in ARMParallelDSP" | Reid Kleckner | 2018-09-14 | 1 | -152/+27
  It causes assertion failures while building Skia for Android in Chromium: https://ci.chromium.org/buildbot/chromium.clang/ToTAndroid/4550
  Reduction forthcoming.
  llvm-svn: 342260
* [X86][SSE] Lower shuffles to permute(unpack(x,y)) (PR31151) | Simon Pilgrim | 2018-09-14 | 1 | -5/+75
  Attempt to lower a shuffle as an unpack of elements from two inputs followed by a single-input (wider) permutation. As long as the permutation is wider this is a win - there may be some circumstances where same size permutations would also be useful but I've left that for future work.
  Differential Revision: https://reviews.llvm.org/D52043
  llvm-svn: 342257
* fix noasserts build | Adrian Prantl | 2018-09-14 | 1 | -0/+2
  llvm-svn: 342247
* SelectionDAG: Add compact SDDbgValue representation to -dag-dump-verbose output | Adrian Prantl | 2018-09-14 | 2 | -0/+19
  llvm-svn: 342245
* fix typos | Adrian Prantl | 2018-09-14 | 2 | -2/+2
  llvm-svn: 342241
* [X86][BMI1] Fix BLSI/BLSMSK/BLSR BMI1 scheduling on btver2 | Simon Pilgrim | 2018-09-14 | 1 | -1/+1
  These have the same behaviour as tzcnt on btver2 - confirmed with AMD 16h SOG, Agner and instlatx64.
  llvm-svn: 342235
* [X86][BMI1] Add scheduler class for BLSI/BLSMSK/BLSR BMI1 instructions | Simon Pilgrim | 2018-09-14 | 11 | -48/+35
  llvm-svn: 342234
* [AMDGPU] Ensure trig range reduction only used for subtargets that require it | David Stuttard | 2018-09-14 | 4 | -9/+28
  Summary: GFX9 and above support sin/cos instructions with a greater range and thus don't require a fract instruction prior to invocation. Added a subtarget feature to reflect this and added code to take advantage of the expanded range on GFX9+. Also updated the tests to check correct behaviour.
  Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, tpr, t-tye, llvm-commits
  Differential Revision: https://reviews.llvm.org/D51933
  Change-Id: I1c1f1d3726a5ae32116646ca5cfa1ab4ef69e5b0
  llvm-svn: 342222
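The role of the fract step can be shown numerically. This sketch assumes a hardware-style sine that takes its input in turns (multiples of 2π) and is only accurate on a limited range; `hw_sin` is a stand-in built on `math.sin`, and all names here are hypothetical.

```python
import math


def fract(t):
    """Fractional part, always in [0, 1)."""
    return t - math.floor(t)


def hw_sin(turns):
    """Stand-in for a sine unit whose input is measured in turns."""
    return math.sin(2.0 * math.pi * turns)


def sin_with_reduction(x):
    """Range-reduce with fract before invoking the sine unit."""
    return hw_sin(fract(x / (2.0 * math.pi)))


def sin_without_reduction(x):
    """Skip fract, relying on the sine unit accepting a wider input range."""
    return hw_sin(x / (2.0 * math.pi))
```

Because sine is periodic, dropping whole turns does not change the result; the reduction only matters when the sine unit cannot accept the unreduced input.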
* [DWARF] reposting r342048, which was reverted in r342056 due to buildbot | Wolfgang Pieb | 2018-09-14 | 7 | -221/+201
  errors. Adjusted 2 test cases for ARM and darwin and fixed a bug with the original change in dsymutil.
  llvm-svn: 342218
* [ARM] bottom-top mul support in ARMParallelDSP | Sam Parker | 2018-09-14 | 1 | -27/+152
  On failing to find sequences that can be converted into dual macs, try to find sequential 16-bit loads that are used by muls which we can then use smultb, smulbt, smultt with a wide load.
  Differential Revision: https://reviews.llvm.org/D51983
  llvm-svn: 342210
* [LoopInterchange] Preserve ScalarEvolution, by forgetting about interchanged loops. | Florian Hahn | 2018-09-14 | 1 | -0/+5
  As preparation for LoopInterchange becoming a loop pass, it needs to preserve ScalarEvolution. Even though interchanging should not change the trip count of the loop, it modifies loop entry, latch and exit blocks.
  I added -verify-scev to some loop interchange tests, but the verification does not catch problems caused by missing invalidation of SE in loop interchange, as the trip counts themselves do not change. So there might be potential to make the SE verification cover more in the future.
  Reviewers: mkazantsev, efriedma, karthikthecool
  Reviewed By: efriedma
  Differential Revision: https://reviews.llvm.org/D52026
  llvm-svn: 342209
* [SystemZ] Adjust cost functions for subtargets that use LI + LOC instead of IPM | Jonas Paulsson | 2018-09-14 | 1 | -4/+8
  After recent improvements which make better use of LOC instead of IPM, the TTI cost functions also need to be updated to reflect this. This involves sext, zext and xor of i1.
  The tests were updated so that for z13 the new costs are expected, while the old costs are still checked for on zEC12.
  Review: Ulrich Weigand
  https://reviews.llvm.org/D51339
  llvm-svn: 342207
* [Support] Treat null bytes as separator in windows command line strings | Martin Storsjo | 2018-09-14 | 1 | -2/+6
  When reading directives from a .drectve section, the directives are tokenized as a normal windows command line. However in these cases, link.exe allows the directives to be separated by null bytes, not only by spaces.
  A test case for this change will be added in the lld repo.
  Differential Revision: https://reviews.llvm.org/D52014
  llvm-svn: 342204
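A toy tokenizer shows what treating null bytes as separators means in practice. This is a deliberately simplified sketch: it handles only double quotes and ignores the backslash rules of real Windows command-line parsing, and `tokenize_directives` is a hypothetical name, not the LLVM Support function.

```python
SEPARATORS = (" ", "\t", "\x00")   # null byte accepted alongside whitespace


def tokenize_directives(text):
    """Split a directive string on separators, keeping quoted runs together."""
    tokens = []
    current = []
    in_quotes = False
    for ch in text:
        if ch == '"':
            in_quotes = not in_quotes          # quotes group, but are dropped
        elif ch in SEPARATORS and not in_quotes:
            if current:
                tokens.append("".join(current))
                current = []
        else:
            current.append(ch)
    if current:
        tokens.append("".join(current))
    return tokens
```

With the null byte in `SEPARATORS`, two directives joined by `\x00` come out as two tokens instead of one token with an embedded null.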
* [NFC] Remove meaningless code from GVN | Max Kazantsev | 2018-09-14 | 1 | -6/+0
  llvm-svn: 342202
* Fix for the buildbot failure http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fast/builds/23635 | Hideki Saito | 2018-09-14 | 3 | -4/+11
  from the commit (r342197) of https://reviews.llvm.org/D50820.
  llvm-svn: 342201
* [VPlan] Implement initial vector code generation support for simple outer loops. | Hideki Saito | 2018-09-14 | 6 | -15/+287
  Summary: [VPlan] Implement vector code generation support for simple outer loops.
  Context: Patch Series #1 for outer loop vectorization support in LV using VPlan. (RFC: http://lists.llvm.org/pipermail/llvm-dev/2017-December/119523.html).
  This patch introduces vector code generation support for simple outer loops that are currently supported in the VPlanNativePath. Changes here essentially do the following:
  - force vector code generation using explicit vectorize_width
  - add conservative early returns in cost model and other places for VPlanNativePath
  - add code for setting up outer loop inductions
  - support for widening non-induction PHIs that can result from inner loops and uniform conditional branches
  - support for generating uniform inner branches
  We plan to add a handful of C outer loop executable tests once the initial code generation support is committed. This patch is expected to be NFC for the inner loop vectorizer path. Since we are moving in the direction of supporting outer loop vectorization in LV, it may also be time to rename classes such as InnerLoopVectorizer.
  Reviewers: fhahn, rengolin, hsaito, dcaballe, mkuper, hfinkel, Ayal
  Reviewed By: fhahn, hsaito
  Subscribers: dmgreen, bollu, tschuett, rkruppe, rogfer01, llvm-commits
  Differential Revision: https://reviews.llvm.org/D50820
  llvm-svn: 342197
* [AMDGPU] Removed unused method | Tim Renouf | 2018-09-13 | 1 | -22/+0
  Summary: I accidentally left this behind in D50306, and it causes a build warning when I build with gcc7.
  Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, t-tye, llvm-commits
  Differential Revision: https://reviews.llvm.org/D52022
  Change-Id: I30f7a47047e9d9d841f652da66d2fea19e74842c
  llvm-svn: 342189
* [SanitizerCoverage] Create comdat for global arrays. | Matt Morehouse | 2018-09-13 | 1 | -14/+25
  Summary: Place global arrays in comdat sections with their associated functions. This makes sure they are stripped along with the functions they reference, even on the BFD linker.
  Reviewers: eugenis
  Reviewed By: eugenis
  Subscribers: eraman, hiraditya, llvm-commits
  Differential Revision: https://reviews.llvm.org/D51902
  llvm-svn: 342186
* [DAGCombine] Fix crash when store merging created an extract_subvector with invalid index. | Amara Emerson | 2018-09-13 | 1 | -1/+8
  Differential Revision: https://reviews.llvm.org/D51831
  llvm-svn: 342183
* [MachineInstr] In addRegisterKilled and addRegisterDead, don't remove operands from inline assembly instructions if they have an associated flag operand. | Craig Topper | 2018-09-13 | 1 | -2/+4
  INLINEASM instructions use extra operands to carry flags. If a register operand is removed without removing the flag operand, then the flags will no longer make sense. This patch fixes this by preventing the removal when a flag operand is present.
  The included test case was generated by MS inline assembly. Longer term maybe we should fix the inline assembly parsing to not generate redundant operands.
  Differential Revision: https://reviews.llvm.org/D51829
  llvm-svn: 342176
* [X86] Fix register resizings for inline assembly register operands. | Nirav Dave | 2018-09-13 | 2 | -7/+39
  When replacing a named register input with the appropriately sized sub/super-register, in the case of a 64-bit value being assigned to a register in 32-bit mode, match GCC's assignment.
  Reviewers: eli.friedman, craig.topper
  Subscribers: nickdesaulniers, llvm-commits, hiraditya
  Differential Revision: https://reviews.llvm.org/D51502
  llvm-svn: 342175
* [X86] Cleanup pair returns. NFCI. | Nirav Dave | 2018-09-13 | 1 | -32/+14
  llvm-svn: 342174
* [InstCombine] Inefficient pattern for high-bits checking 2 (PR38708) | Roman Lebedev | 2018-09-13 | 1 | -19/+36
  Summary: It is sometimes important to check that some newly-computed value is non-negative and only n bits wide (where n is a variable). There are many ways to check that: https://godbolt.org/z/o4RB8D
  The last variant seems best? (I'm sure there are some other variations I haven't thought of.)
  More complicated, canonical pattern: https://rise4fun.com/Alive/uhA
  We do need to have two `switch()`'es like this, to not mismatch the swappable predicates.
  https://bugs.llvm.org/show_bug.cgi?id=38708
  Reviewers: spatel, craig.topper, RKSimon
  Reviewed By: spatel
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D52001
  llvm-svn: 342173
* [PartiallyInlineLibCalls] Add DebugCounter support | George Burgess IV | 2018-09-13 | 1 | -0/+6
  This adds DebugCounter support to the PartiallyInlineLibCalls pass, which should make debugging/automated bisection easier in the future.
  Patch by Zhizhou Yang!
  Differential Revision: https://reviews.llvm.org/D50093
  llvm-svn: 342172
* [DCE] Add DebugCounter support | George Burgess IV | 2018-09-13 | 1 | -0/+8
  Patch by Zhizhou Yang!
  Differential Revision: https://reviews.llvm.org/D50092
  llvm-svn: 342170
* [InstCombine] Fold (xor (min/max X, Y), -1) -> (max/min ~X, ~Y) when X and Y are freely invertible. | Craig Topper | 2018-09-13 | 2 | -0/+13
  This allows the xor to be removed completely.
  This might help with recommitting r341674, but seems good regardless.
  Coincidentally fixes PR38915.
  Differential Revision: https://reviews.llvm.org/D51964
  llvm-svn: 342163
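The algebra behind this fold is that bitwise NOT reverses ordering, so it swaps min and max. A quick check on signed Python integers, offered as an illustration of the identity rather than of the InstCombine implementation (the function names are hypothetical):

```python
def not_of_min(x, y):
    """Left-hand side: xor(min(X, Y), -1), which is ~min(X, Y)."""
    return min(x, y) ^ -1


def max_of_nots(x, y):
    """Right-hand side: max(~X, ~Y)."""
    return max(~x, ~y)


def not_of_max(x, y):
    """Mirror case: ~max(X, Y)."""
    return max(x, y) ^ -1


def min_of_nots(x, y):
    """Mirror case: min(~X, ~Y)."""
    return min(~x, ~y)
```

Since `~x` equals `-x - 1`, a larger `x` always gives a smaller `~x`, which is why the min on one side becomes a max on the other. The fold pays off when `~X` and `~Y` cost nothing extra to form, as the title's "freely invertible" condition requires.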
* Common infrastructure for reading a profile remapping file and building | Richard Smith | 2018-09-13 | 2 | -0/+82
  a mangling remapper from it.
  Differential Revision: https://reviews.llvm.org/D51246
  llvm-svn: 342161
* [RISCV][MC] Reject bare symbols for the simm6 and simm6nonzero operand types | Ana Pazos | 2018-09-13 | 1 | -14/+4
  Summary: Fixed assertions due to invalid fixup when encoding compressed instructions (c.addi, c.addiw, c.li, c.andi) with bare symbols with/without modifiers. This matches GAS behavior as well.
  This bug was uncovered by an LLVM MC Disassembler Protocol Buffer Fuzzer for the RISC-V assembly language.
  Reviewers: asb
  Reviewed By: asb
  Subscribers: rbar, johnrusso, simoncook, sabuasal, niosHD, kito-cheng, shiva0217, zzheng, edward-jones, mgrang, rogfer01, MartinMosbeck, brucehoult, the_o, rkruppe, PkmX, jocewei, asb
  Differential Revision: https://reviews.llvm.org/D52005
  llvm-svn: 342160