summaryrefslogtreecommitdiffstats
path: root/llvm/lib
Commit message (Collapse)AuthorAgeFilesLines
* [ValueTracking] Pass only a single lambda to ↵Craig Topper2017-12-021-37/+29
| | | | | | computeKnownBitsFromShiftOperator by using KnownBits struct instead of separate APInts. NFCI llvm-svn: 319624
* CodeGen: Fix pointer info in ↵Yaxun Liu2017-12-024-31/+33
| | | | | | | | | | | | | | | | | | | | | SplitVecOp_EXTRACT_VECTOR_ELT/SplitVecRes_INSERT_VECTOR_ELT Two issues found when doing codegen for splitting vector with non-zero alloca addr space: DAGTypeLegalizer::SplitVecRes_INSERT_VECTOR_ELT/SplitVecOp_EXTRACT_VECTOR_ELT uses dummy pointer info for creating SDStore. Since one pointer operand contains multiply and add, InferPointerInfo is unable to infer the correct pointer info, which ends up with a dummy pointer info for the target to lower store and results in isel failure. The fix is to introduce MachinePointerInfo::getUnknownStack to represent MachinePointerInfo which is known in alloca address space but without other information. TargetLowering::getVectorElementPointer uses value type of pointer in addr space 0 for multiplication of index and then add it to the pointer. However the pointer may be in an addr space which has different size than addr space 0. The fix is to use the pointer value type for index multiplication. Differential Revision: https://reviews.llvm.org/D39758 llvm-svn: 319622
* [X86][SSE] Cleanup float/int conversion scheduler itinerary classesSimon Pilgrim2017-12-021-41/+79
| | | | | | Makes it easier to grok where each is supposed to be used, mainly useful for adding to the AVX512 instructions but hopefully can be used more in SSE/AVX as well. llvm-svn: 319614
* [X86] Teach the assembler to support %db8-%db15 as aliases for %dr8-%dr15.Craig Topper2017-12-021-13/+25
| | | | llvm-svn: 319612
* [X86] Support %dr8-%dr15 in the assembler.Craig Topper2017-12-021-1/+1
| | | | | | Apparently I failed to make this work when I fixed it in the disassembler way back in r224862. llvm-svn: 319611
* [ARC] Add instruction subset for the ARC backend.Tatyana Krasnukha2017-12-024-159/+990
| | | | | | | | | | | | Reviewers: petecoup, kparzysz Reviewed By: petecoup Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D37983 llvm-svn: 319609
* [DAG][AArch64] Disable post-legalization storeNirav Dave2017-12-021-0/+3
| | | | | | | Disable post-legalization store for AArch64 backend which is causing errors out-of-tree. llvm-svn: 319607
* [WebAssembly] Revert r319488 "Add visibility flag to Wasm symbol flags"Heejin Ahn2017-12-023-13/+4
| | | | | | | | | This patch reportedly broke one of LLVM bots (ubuntu-gcc7.1-werror). See http://lab.llvm.org:8011/builders/ubuntu-gcc7.1-werror/builds/3369 for details. llvm-svn: 319602
* Revert "[X86] Improvement in CodeGen instruction selection for LEAs."Matt Morehouse2017-12-013-574/+11
| | | | | | This reverts r319543, due to ASan bot breakage. llvm-svn: 319591
* [MachineOutliner] NFC: Throw out self-intersections on candidates earlyJessica Paquette2017-12-011-11/+42
| | | | | | | | | | | | | | | Currently, the outliner considers candidates that intersect with themselves in the candidate pruning step. That is, candidates of the form "AA" in ranges like "AAAAAA". In that range, it looks like there are 5 instances of "AA" that could possibly be outlined, and that's considered in the benefit calculation. However, only at most 3 instances of "AA" could ever be outlined in "AAAAAA". Thus, it's possible to pass through "AA" to the candidate selection step even though it's *never* the case that "AA" could be outlined. This makes it so that when we find candidates, we consider only non-overlapping occurrences of that candidate. llvm-svn: 319588
* [DAG][ARM] Revert "Reenable post-legalize store merge"Nirav Dave2017-12-012-11/+8
| | | | | | due to failures in AArch and ARM code gen. llvm-svn: 319587
* [MC] Handle unknown literal register numbers in .cfi_* directivesJake Ehrlich2017-12-013-10/+43
| | | | | | | | | | | | | | | | | r230670 introduced a step to map EH register numbers to standard DWARF register numbers. This failed to consider the case when a user .cfi_* directive uses an integer literal rather than a register name, to specify a DWARF register number that has no corresponding LLVM register number (e.g. a special register that the compiler and assembler have no name for). Fixes PR34028. Patch by Roland McGrath Differential Revision: https://reviews.llvm.org/D36493 llvm-svn: 319586
* [IndVars] Fix a bug introduced in r317012Philip Reames2017-12-011-3/+13
| | | | | | | | Turns out we can have comparisons which are indirect users of the induction variable that we can make invariant. In this case, there is no loop invariant value contributing and we'd fail an assert. The test case was found by a java fuzzer and reduced. It's a real cornercase. You have to have a static loop which we've already proven only executes once, but haven't broken the backedge on, and an inner phi whose result can be constant folded by SCEV using exit count reasoning but not proven by isKnownPredicate. To my knowledge, only the fuzzer has hit this case. llvm-svn: 319583
* [opt-remarks] If hotness threshold is set, ignore remarks without hotnessAdam Nemet2017-12-012-9/+7
| | | | | | | | | | | | | | These are blocks that haven't not been executed during training. For large projects this could make a significant difference. For the project, I was looking at, I got an order of magnitude decrease in the size of the total YAML files with this and r319235. Differential Revision: https://reviews.llvm.org/D40678 Re-commit after fixing the failing testcase in rL319576, rL319577 and rL319578. llvm-svn: 319581
* [DAGCombine] Simplify ISD::AND handling in ReduceLoadWidthEli Friedman2017-12-011-20/+5
| | | | | | | | Followup to D39595. Removes a bunch of redundant checks. Differential Revision: https://reviews.llvm.org/D40667 llvm-svn: 319573
* [X86][AVX512] Tag subvector extract/insert instructions scheduler classesSimon Pilgrim2017-12-011-32/+65
| | | | llvm-svn: 319568
* [IR] Avoid dangling else warning. NFC.Benjamin Kramer2017-12-011-1/+2
| | | | llvm-svn: 319567
* IR printing improvement for loop passes - handle -print-module-scopeFedor Sergeev2017-12-011-0/+12
| | | | | | | | | | | | | | | | | Summary: Adding support for -print-module-scope similar to how it is being done for function passes. This option causes loop-pass printer to emit a whole-module IR instead of just a loop itself. Reviewers: sanjoy, silvas, weimingz Reviewed By: sanjoy Subscribers: apilipenko, skatkov, llvm-commits Differential Revision: https://reviews.llvm.org/D40247 llvm-svn: 319566
* [DebugInfo] Bail out if making no progress dumping line tables.Paul Robinson2017-12-011-0/+4
| | | | llvm-svn: 319564
* Revert "[opt-remarks] If hotness threshold is set, ignore remarks without ↵Adam Nemet2017-12-012-7/+9
| | | | | | | | | | | hotness" This reverts commit r319556. Something is not working with this when used with sample-based profiling. Investigating... llvm-svn: 319562
* IR printing improvement for function passes - introducing -print-module-scopeFedor Sergeev2017-12-012-1/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: When debugging function passes it happens to be rather useful to dump the whole module before the transformation and then use this dump to analyze this single transformation by running it separately on that particular module state. Introducing -print-module-scope debugging option that forces all the function-level IR dumps to become whole-module dumps. This option builds on top of normal dumping controls like -print-before/after -filter-print-funcs The plan is to eventually extend this option to cover other local passes (at least loop passes) but that should go as a separate change. Reviewers: sanjoy, weimingz, silvas, fedor.sergeev Reviewed By: weimingz Subscribers: apilipenko, skatkov, llvm-commits, mehdi_amini Differential Revision: https://reviews.llvm.org/D40245 llvm-svn: 319561
* Fix line endings. NFCI.Simon Pilgrim2017-12-011-10/+10
| | | | llvm-svn: 319559
* [X86][AVX512] Tag VPERM2I/VPERM2T instructions scheduler classSimon Pilgrim2017-12-011-48/+64
| | | | llvm-svn: 319558
* [opt-remarks] If hotness threshold is set, ignore remarks without hotnessAdam Nemet2017-12-012-9/+7
| | | | | | | | | | | These are blocks that haven't not been executed during training. For large projects this could make a significant difference. For the project, I was looking at, I got an order of magnitude decrease in the size of the total YAML files with this and r319235. Differential Revision: https://reviews.llvm.org/D40678 llvm-svn: 319556
* [X86][AVX512] Tag VFPCLASS instructions scheduler classSimon Pilgrim2017-12-011-26/+43
| | | | llvm-svn: 319554
* [X86][AVX512] Tag VPSHUFBITQMB instructions scheduler classSimon Pilgrim2017-12-011-9/+12
| | | | llvm-svn: 319553
* [X86][AVX512] Tag VPCOMRESS/VPEXPAND instructions scheduler classesSimon Pilgrim2017-12-011-39/+55
| | | | llvm-svn: 319551
* Revert r319531 "[SLPVectorizer] Failure to beneficially vectorize 'copyable' ↵Hans Wennborg2017-12-011-343/+143
| | | | | | | | | | | | | | | | | | | | | | | | | | | | elements in integer binary ops." It causes builds to fail with "Instruction does not dominate all uses" (PR35497). > Patch tries to improve vectorization of the following code: > > void add1(int * __restrict dst, const int * __restrict src) { > *dst++ = *src++; > *dst++ = *src++ + 1; > *dst++ = *src++ + 2; > *dst++ = *src++ + 3; > } > Allows to vectorize even if the very first operation is not a binary add, but just a load. > > Fixed issues related to previous commit. > > Reviewers: spatel, mzolotukhin, mkuper, hfinkel, RKSimon, filcab, ABataev > > Reviewed By: ABataev, RKSimon > > Subscribers: llvm-commits, RKSimon > > Differential Revision: https://reviews.llvm.org/D28907 llvm-svn: 319550
* [ARM][DAG] Reenable post-legalize store mergeNirav Dave2017-12-012-8/+11
| | | | | | | | | | | | Summary: Reenable post-legalize stores with constant merging computation and cofrresponding test case. Reviewers: eastig, efriedma Subscribers: aemerson, javed.absar, kristof.beyls, hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D40701 llvm-svn: 319547
* [X86] Improvement in CodeGen instruction selection for LEAs.Jatin Bhateja2017-12-013-11/+574
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: 1/ Operand folding during complex pattern matching for LEAs has been extended, such that it promotes Scale to accommodate similar operand appearing in the DAG e.g. T1 = A + B T2 = T1 + 10 T3 = T2 + A For above DAG rooted at T3, X86AddressMode will now look like Base = B , Index = A , Scale = 2 , Disp = 10 2/ During OptimizeLEAPass down the pipeline factorization is now performed over LEAs so that if there is an opportunity then complex LEAs (having 3 operands) could be factored out e.g. leal 1(%rax,%rcx,1), %rdx leal 1(%rax,%rcx,2), %rcx will be factored as following leal 1(%rax,%rcx,1), %rdx leal (%rdx,%rcx) , %edx 3/ Aggressive operand folding for AM based selection for LEAs is sensitive to loops, thus avoiding creation of any complex LEAs within a loop. 4/ Simplify LEA converts (lea (BASE,1,INDEX,0) --> add (BASE, INDEX) which offers better through put. PR32755 will be taken care of by this pathc. Previous patch revisions : r313343 , r314886 Reviewers: lsaba, RKSimon, craig.topper, qcolombet, jmolloy, jbhateja Reviewed By: lsaba, RKSimon, jbhateja Subscribers: jmolloy, spatel, igorb, llvm-commits Differential Revision: https://reviews.llvm.org/D35014 llvm-svn: 319543
* [X86][AVX512] Tag vshift/vpermv/pshufd/pshufb instructions scheduler classesSimon Pilgrim2017-12-012-120/+158
| | | | llvm-svn: 319540
* Revert r319537: Bail out of a SimplifyCFG switch table opt at undef values.Mikael Holmen2017-12-011-1/+1
| | | | | | Broke build bots so reverting. llvm-svn: 319539
* [InstSimplify] More fcmp cases when comparing against negative constants.Florian Hahn2017-12-011-0/+22
| | | | | | | | | | | | | | | | | | | | | | | | Summary: For known positive non-zero value X: fcmp uge X, -C => true fcmp ugt X, -C => true fcmp une X, -C => true fcmp oeq X, -C => false fcmp ole X, -C => false fcmp olt X, -C => false Patch by Paul Walker. Reviewers: majnemer, t.p.northover, spatel, RKSimon Reviewed By: spatel Subscribers: fhahn, llvm-commits Differential Revision: https://reviews.llvm.org/D40012 llvm-svn: 319538
* Bail out of a SimplifyCFG switch table opt at undef values.Mikael Holmen2017-12-011-1/+1
| | | | | | | | | | | | | | | | | | | Summary: A true or false result is expected from a comparison, but it seems the possibility of undef was overlooked, which could lead to a failed assert. This is fixed by this patch by bailing out if we encounter undef. The bug is old and the assert has been there since the end of 2014, so it seems this is unusual enough to forego optimization. Patch by: JesperAntonsson Reviewers: spatel, eeckstein, hans Reviewed By: hans Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D40639 llvm-svn: 319537
* Follow-up to r319434 to turn the pass on by defaultNemanja Ivanovic2017-12-011-1/+1
| | | | | | | Now that the patch has gone through the buildbot cycle, turn it on by default. llvm-svn: 319535
* [AMDGPU] SiFixSGPRCopies should not modify non-divergent PHIAlexander Timofeev2017-12-011-15/+64
| | | | | | Differential revision: https://reviews.llvm.org/D40556 llvm-svn: 319534
* [cmake] Enable zlib support on windowsPavel Labath2017-12-011-3/+3
| | | | | | | | | | | | | | | | | | | | | | | Summary: zlib support was hard-wired to off for (non-cygwin) windows targets. This disables some features, such as reading debug info from compressed dwarf sections. This has been this way since zlib support was added in 2013 (r180083), but there is no obvious reason for that. Zlib is perfectly capable of being compiled for windows (it even has a cmake file that works out of the box). This enables one to turn on zlib support on windows, if one has zlib avaliable. Reviewers: rnk, beanz Subscribers: mgorny, aprantl, llvm-commits Differential Revision: https://reviews.llvm.org/D40655 llvm-svn: 319533
* [SLPVectorizer] Failure to beneficially vectorize 'copyable' elements in ↵Dinar Temirbulatov2017-12-011-143/+343
| | | | | | | | | | | | | | | | | | | | | | | | | | | integer binary ops. Patch tries to improve vectorization of the following code: void add1(int * __restrict dst, const int * __restrict src) { *dst++ = *src++; *dst++ = *src++ + 1; *dst++ = *src++ + 2; *dst++ = *src++ + 3; } Allows to vectorize even if the very first operation is not a binary add, but just a load. Fixed issues related to previous commit. Reviewers: spatel, mzolotukhin, mkuper, hfinkel, RKSimon, filcab, ABataev Reviewed By: ABataev, RKSimon Subscribers: llvm-commits, RKSimon Differential Revision: https://reviews.llvm.org/D28907 llvm-svn: 319531
* GlobalISel: Enable the legalization of G_MERGE_VALUES and G_UNMERGE_VALUESVolkan Keles2017-12-014-8/+113
| | | | | | | | | | | | | | Summary: LegalizerInfo assumes all G_MERGE_VALUES and G_UNMERGE_VALUES instructions are legal, so it is not possible to legalize vector operations on illegal vector types. This patch fixes the problem by removing the related check and adding default actions for G_MERGE_VALUES and G_UNMERGE_VALUES. Reviewers: qcolombet, ab, dsanders, aditya_nandakumar, t.p.northover, kristof.beyls Reviewed By: dsanders Subscribers: rovka, javed.absar, igorb, llvm-commits Differential Revision: https://reviews.llvm.org/D39823 llvm-svn: 319524
* Recommit rL319407: [SROA] enable splitting for non-whole-alloca loads and storesHiroshi Inoue2017-12-011-18/+50
| | | | | | Recommiting once reverted patch rL319407 after adding a check for bit vector size to avoid failures in some build bots. llvm-svn: 319522
* [X86] Custom legalize v2i32 gathers via widening rather than promoting.Craig Topper2017-12-011-33/+57
| | | | | | | | The default legalization for v2i32 is promotion to v2i64. This results in a gather that reads 64-bit elements rather than 32. If one of the elements is near a page boundary this can cause an illegal access that can fault. We also miscalculate the scale for the gather which is an even worse problem, but we probably could have found a separate way to fix that. llvm-svn: 319521
* [X86][SelectionDAG] Make sure we explicitly sign extend the index when type ↵Craig Topper2017-12-011-0/+6
| | | | | | | | promoting the index of scatter and gather. Type promotion makes no guarantee about the contents of the promoted bits. Since the gather/scatter instruction will use the bits to calculate addresses, we need to ensure they aren't garbage. llvm-svn: 319520
* [X86] Add a DAG combine to simplify masks for AVX2 gather instructions.Craig Topper2017-12-011-0/+17
| | | | | | AVX2 gathers only use the upper bit of the mask allowing us to simplify sign_extend_inreg to a shift left. llvm-svn: 319514
* Add flag to ArchiveWriter to test GNU64 format more efficientlyJake Ehrlich2017-12-011-1/+10
| | | | | | | | | | | | | | | | | | | | | | | Even with the sparse file optimizations the SYM64 test can still be painfully slow. This unnecessarily slows down devs. It's critical that we test that the switch to the SYM64 format occurs at 4GB but there isn't any better of a way to fake the size of the file than sparse files. This change introduces a flag that allows the cutoff to be arbitrarily set to whatever power of two is desired. The flag is hidden as it really isn't meant to be used outside this one test. This is unfortunate but appears necessary, at least until the average hard drive is much faster. The changes to the test require some explanation. Prior to this change we knew that the SYM64 format was being used because the file was simply too large to have validly handled this case if the SYM64 format were not used. To ensure that the SYM64 format is still being used I am grepping the file for "SYM64". Without changing the filename however this would be pointless because "SYM64" would occur in the file either way. So the filename of the test is also changed in order to avoid this issue. Differential Revision: https://reviews.llvm.org/D40632 llvm-svn: 319507
* Mark all library options as hidden.Zachary Turner2017-12-0124-85/+85
| | | | | | | | | | | | | | | | | These command line options are not intended for public use, and often don't even make sense in the context of a particular tool anyway. About 90% of them are already hidden, but when people add new options they forget to hide them, so if you were to make a brand new tool today, link against one of LLVM's libraries, and run tool -help you would get a bunch of junk that doesn't make sense for the tool you're writing. This patch hides these options. The real solution is to not have libraries defining command line options, but that's a much larger effort and not something I'm prepared to take on. Differential Revision: https://reviews.llvm.org/D40674 llvm-svn: 319505
* AMDGPU: Use carry-less adds in FI eliminationMatt Arsenault2017-11-302-9/+6
| | | | llvm-svn: 319501
* ThinLTOBitcodeWriter: Try harder to discard unused references to the merged ↵Peter Collingbourne2017-11-301-3/+11
| | | | | | | | | | | | | | | | | | | module. If the thin module has no references to an internal global in the merged module, we need to make sure to preserve that property if the global is a member of a comdat group, as otherwise promotion can end up adding global symbols to the comdat, which is not allowed. This situation can arise if the external global in the thin module has dead constant users, which would cause use_empty() to return false and would cause us to try to promote it. To prevent this from happening, discard the dead constant users before asking whether a global is empty. Differential Revision: https://reviews.llvm.org/D40593 llvm-svn: 319494
* Simplify the DenseSet used for hashing CodeView records.Zachary Turner2017-11-301-96/+44
| | | | | | | | | | | | | | This was storing the hash alongside the key so that the hash doesn't need to be re-computed every time, but in doing so it was allocating a structure to keep the key size small in the DenseMap. This is a noble goal, but it also leads to a pointer indirection on every probe, and this cost of this pointer indirection ends up being higher than the cost of having a slightly larger entry in the hash table. Removing this not only simplifies the code, but yields a small but noticeable performance improvement in the type merging algorithm. llvm-svn: 319493
* AMDGPU: Use gfx9 carry-less add/sub instructionsMatt Arsenault2017-11-306-34/+118
| | | | llvm-svn: 319491
* XOR the frame pointer with the stack cookie when protecting the stackReid Kleckner2017-11-306-5/+55
| | | | | | | | | | | | Summary: This strengthens the guard and matches MSVC. Reviewers: hans, etienneb Subscribers: hiraditya, JDevlieghere, vlad.tsyrklevich, llvm-commits Differential Revision: https://reviews.llvm.org/D40622 llvm-svn: 319490
OpenPOWER on IntegriCloud