path: root/llvm/lib
Commit message (Author, Age, Files, Lines)
...
* [Loop Predication] Teach LP about reverse loops (Anna Thomas, 2017-12-04, 1 file, -58/+135)
| Summary:
| Currently, we only support predication for forward loops with a step of 1. This patch enables loop predication for reverse (countdown) loops that satisfy the following conditions:
|   1. The step of the IV is -1.
|   2. The loop has a single latch of the form B(X) = X <pred> latchLimit, with pred being s> or u>.
|   3. The IV of the guard is the decremented IV of the latch condition (the guard is G(X) = X-1 u< guardLimit).
|
| This patch was downstream for a while and is the last in the series of patches from our downstream LP implementation.
|
| Reviewers: apilipenko, mkazantsev, sanjoy
| Subscribers: llvm-commits
| Differential Revision: https://reviews.llvm.org/D40353
| llvm-svn: 319659
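The conditions listed in the Loop Predication entry above can be illustrated with a small model. This is only a sketch under stated assumptions, not the pass's implementation: the function names are hypothetical, the latch is fixed to `i u> 0`, and the hoisted check is derived by hand for this one loop shape (the largest guarded index is `start - 1`, and the IV only decreases).

```cpp
#include <cstdint>
#include <vector>

// Hypothetical model of a countdown loop: IV step is -1, the latch is
// `i u> 0`, and every iteration is guarded by `i - 1 u< len`.
// Returns false if a guard fails (standing in for a deoptimization).
bool guardedCountdown(const std::vector<uint32_t> &a, uint32_t start,
                      uint64_t &sum) {
  uint64_t acc = 0;
  for (uint32_t i = start; i > 0; --i) {
    if (!(i - 1 < a.size())) // per-iteration guard G(X) = X-1 u< guardLimit
      return false;
    acc += a[i - 1];
  }
  sum = acc;
  return true;
}

// Same loop with the guard replaced by one loop-invariant check.
// Assumption: checking the first guarded index is sufficient because the
// index only decreases towards 0.
bool predicatedCountdown(const std::vector<uint32_t> &a, uint32_t start,
                         uint64_t &sum) {
  if (start != 0 && !(start - 1 < a.size()))
    return false;
  uint64_t acc = 0;
  for (uint32_t i = start; i > 0; --i)
    acc += a[i - 1];
  sum = acc;
  return true;
}
```

For a four-element array, both versions succeed with `start = 4` and both fail with `start = 5`; the predicated version does no range checks inside the loop.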
* [NVPTX] Assign valid global names (Jonas Hahnfeld, 2017-12-04, 2 files, -2/+19)
| PTX requires that identifiers consist only of [a-zA-Z0-9_$]. The existing pass already ensured this for globals and this patch adds the cleanup for functions with local linkage.
|
| However, there was a different problem in the case of collisions of the adjusted name: The ValueSymbolTable then automatically appended ".N" with increasing Ns to get a unique name while helping the ABI demangling. Special case this behavior to omit the dots and append N directly. This will always give us legal names according to the PTX requirements.
|
| Differential Revision: https://reviews.llvm.org/D40573
| llvm-svn: 319657
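The naming rule in the NVPTX entry above can be sketched as follows. Both helpers are hypothetical and are not the pass's code: the replacement character `_` is an assumption made for illustration, and the `std::set` merely stands in for the symbol table. The sketch only shows the two ideas from the message: restrict names to [a-zA-Z0-9_$], and resolve collisions by appending N directly instead of ".N".

```cpp
#include <set>
#include <string>

// Hypothetical helper: replace every character outside [a-zA-Z0-9_$]
// with '_' (the replacement character is an assumption).
std::string sanitizePTXName(const std::string &name) {
  std::string out;
  for (char c : name) {
    bool ok = (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') ||
              (c >= '0' && c <= '9') || c == '_' || c == '$';
    out += ok ? c : '_';
  }
  return out;
}

// Hypothetical helper: on a collision, append an increasing counter
// directly (no '.'), so the result stays within the allowed alphabet.
std::string uniquePTXName(const std::string &name,
                          std::set<std::string> &used) {
  const std::string base = sanitizePTXName(name);
  std::string candidate = base;
  for (unsigned n = 1; !used.insert(candidate).second; ++n)
    candidate = base + std::to_string(n);
  return candidate;
}
```

With an empty `used` set, registering `foo.bar`, `foo@bar` and `foo_bar` in that order yields `foo_bar`, `foo_bar1` and `foo_bar2`.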
* Revert r319649 - [Asm, ARM] Add fallback diag for multiple invalid operands (Oliver Stannard, 2017-12-04, 1 file, -17/+0)
| This is causing a failure in the llvm-clang-x86_64-expensive-checks-win buildbot, and I can't reproduce it locally, so reverting until I can work out what is wrong.
| llvm-svn: 319654
* Revert "[ValueTracking] Pass only a single lambda to computeKnownBitsFromShiftOperator by using KnownBits struct instead of separate APInts. NFCI" (Sam McCall, 2017-12-04, 1 file, -29/+37)
| This reverts commit r319624, which seems to cause a miscompile (breaks the multistage PPC buildbots).
| llvm-svn: 319652
* AMDGPU: fix missing s_waitcnt (Tim Corringham, 2017-12-04, 1 file, -4/+4)
| Summary:
| The pass that inserts s_waitcnt instructions where needed propagated info used to track dependencies for each block by iterating over the predecessor blocks. The iteration was terminated when a predecessor that had not yet been processed was encountered. Any info in blocks later in the list was therefore not processed, leading to the possibility of a required s_waitcnt not being inserted.
|
| The fix is simply to change the "break" to "continue" for the relevant loops, so that all visited blocks are processed. This is likely what was intended when the code was written.
|
| There is no test case provided for this fix because:
|   1) the only example that reproduces this is large and resistant to being reduced
|   2) the change is trivial
|
| Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye
| Differential Revision: https://reviews.llvm.org/D40544
| llvm-svn: 319651
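The effect of the "break" to "continue" change in the s_waitcnt entry above can be shown with a toy merge over predecessor blocks. The `BlockInfo` struct and `mergePredecessors` function are hypothetical simplifications; the real pass tracks richer per-counter state. The flag `stopAtUnprocessed` selects the old behaviour.

```cpp
#include <vector>

// Hypothetical, simplified per-block state.
struct BlockInfo {
  bool processed;   // has this predecessor been visited yet?
  int pendingCount; // outstanding operations a successor must wait for
};

// Merge (take the maximum of) the pending counts of all predecessors.
// stopAtUnprocessed == true models the old `break`;
// stopAtUnprocessed == false models the fixed `continue`.
int mergePredecessors(const std::vector<BlockInfo> &preds,
                      bool stopAtUnprocessed) {
  int merged = 0;
  for (const BlockInfo &p : preds) {
    if (!p.processed) {
      if (stopAtUnprocessed)
        break;  // old behaviour: later predecessors are never looked at
      continue; // fixed behaviour: skip only this predecessor
    }
    if (p.pendingCount > merged)
      merged = p.pendingCount;
  }
  return merged;
}
```

With predecessors `{processed, 1}`, `{unprocessed}`, `{processed, 3}`, the old loop returns 1 and misses the later block, while the fixed loop returns 3.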
* [Asm, ARM] Add fallback diag for multiple invalid operands (Oliver Stannard, 2017-12-04, 1 file, -0/+17)
| This adds an "invalid operands for instruction" diagnostic for instructions where there is an instruction encoding with the correct mnemonic and which is available for this target, but where multiple operands do not match those which were provided. This makes it clear that there is some combination of operands that is valid for the current target, which the default diagnostic of "invalid instruction" does not.
|
| Since this is a very general error, we only emit it if we don't have a more specific error.
|
| Differential revision: https://reviews.llvm.org/D36747
| llvm-svn: 319649
* [TwoAddressInstructionPass] Bugfix in handling of sunk instructions. (Jonas Paulsson, 2017-12-04, 1 file, -1/+10)
| An instruction returned by TII->convertToThreeAddress() may contain a %noreg (undef) operand, which is not expected by tryInstructionTransform(). So if this MI is sunk to a lower point in MBB, it must be skipped when later encountered. A new set SunkInstrs is used for this purpose.
|
| Note: there is no test supplied here, as this was triggered on SystemZ while working on a review of instruction flags. A test case for this bugfix will be included in the upcoming SystemZ commit.
|
| Review: Quentin Colombet
| https://reviews.llvm.org/D40711
| llvm-svn: 319646
* [DAGCombine] Remove isAndLoadExtLoad arguments (Sam Parker, 2017-12-04, 1 file, -14/+6)
| Both LoadedVT and NarrowLoad are passed as references and neither of them is used by any of its callers.
|
| Differential Revision: https://reviews.llvm.org/D40713
| llvm-svn: 319645
* [AArch64] Allow using emulated tls on platforms other than ELF (Martin Storsjo, 2017-12-04, 2 files, -3/+9)
| This matches how it is done on X86.
|
| This allows using emulated TLS on Windows; in MinGW environments, native TLS isn't supported at the moment.
|
| Set the right Data*bitsDirective for Windows to match the existing tests for other platforms.
|
| Turn parts of the existing tests into regexes so that they also match .section .rdata on Windows, which avoids duplicating the rest of the tests for Windows.
|
| Differential Revision: https://reviews.llvm.org/D40770
| llvm-svn: 319644
* [ARM] Allow using emulated tls on platforms other than ELF (Martin Storsjo, 2017-12-04, 1 file, -4/+4)
| This matches how it is done on X86.
|
| This allows using emulated TLS on Windows; in MinGW environments, native TLS isn't supported at the moment.
|
| Differential Revision: https://reviews.llvm.org/D40769
| llvm-svn: 319643
* [X86] Allow VPMAXUQ/VPMAXSQ/VPMINUQ/VPMINSQ to be used with 128/256 bit vectors when AVX512 is enabled. (Craig Topper, 2017-12-04, 2 files, -8/+50)
| These instructions can be used by widening to 512 bits and extracting back to 128/256. We already do something similar for several other instructions.
| llvm-svn: 319641
* [X86] Don't turn UINT_TO_FP into SINT_TO_FP during lowering. (Craig Topper, 2017-12-04, 1 file, -6/+0)
| We already do this as a DAG combine. The version during lowering can only trigger if known bits changes something that improves known bits analysis. But this means we should be improving known bits analysis to work on the unlowered form instead.
| llvm-svn: 319640
* [SelectionDAG] Teach computeKnownBits some improvements to ISD::SRL with a non-splat constant shift amount. (Craig Topper, 2017-12-04, 1 file, -0/+19)
| If we have a non-splat constant shift amount, the minimum shift amount can be used to infer the number of zero upper bits of the result.
|
| There's probably a lot more that we can do here, but this fixes a case where I wanted to infer the sign bit as zero when all the shift amounts are non-zero.
| llvm-svn: 319639
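The reasoning in the ISD::SRL entry above can be sketched without any LLVM types. The function below is a hypothetical illustration for 32-bit lanes, not the code added to computeKnownBits: since a single known-bits answer must hold for every lane, only the smallest per-lane shift amount can be relied on.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical sketch for 32-bit lanes: given the per-lane constant
// amounts of a logical right shift, return how many upper bits are known
// to be zero in every lane of the result, assuming nothing is known about
// the shifted value. That number is the minimum shift amount.
unsigned knownZeroUpperBitsAfterSRL(const std::vector<unsigned> &shiftAmts) {
  assert(!shiftAmts.empty() && "need at least one lane");
  unsigned minShift = 32;
  for (unsigned amt : shiftAmts) {
    assert(amt < 32 && "shift amount must be in range");
    minShift = std::min(minShift, amt);
  }
  return minShift;
}
```

For shift amounts `{3, 5, 1, 7}` the result is 1, so the sign bit of every lane is known to be zero; as soon as one lane shifts by 0, nothing is known.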
* [X86][AVX512] Tag PH2PS/PS2PH conversion instructions scheduler classes (Simon Pilgrim, 2017-12-03, 1 file, -25/+41)
| llvm-svn: 319637
* [X86][AVX512] Tag packed F2I/I2F/F2F conversion instructions scheduler class (Simon Pilgrim, 2017-12-03, 1 file, -133/+163)
| llvm-svn: 319636
* [X86][SSE] Remove unused IIC_SSE_CVT_PI2PS_RR/IIC_SSE_CVT_PI2PS_RM itineraries (Simon Pilgrim, 2017-12-03, 1 file, -2/+0)
| llvm-svn: 319634
* CodeGen: Fix SelectionDAGISel::LowerArguments for sret addr space (Yaxun Liu, 2017-12-03, 1 file, -7/+13)
| SelectionDAGISel::LowerArguments assumes sret addr space is 0, which is not true for amdgcn---amdgiz target. This patch fixes that.
|
| Differential Revision: https://reviews.llvm.org/D40255
| llvm-svn: 319630
* [SelectionDAG] Use the inlined APInt shift methods since we've already bounds checked the shift. (Craig Topper, 2017-12-03, 1 file, -8/+11)
| The version that takes APInt is out of line. The 'unsigned' version optimizes for the common case of single word APInts.
| llvm-svn: 319628
* Reland "[WebAssembly] Add visibility flag to Wasm symbol flags" (Sam Clegg, 2017-12-03, 3 files, -4/+13)
| Original change was rL319488. This was reverted in rL319602 due to a gcc 7.1 warning.
|
| Differential Revision: https://reviews.llvm.org/D40772
| llvm-svn: 319626
* [ValueTracking] Pass only a single lambda to computeKnownBitsFromShiftOperator by using KnownBits struct instead of separate APInts. NFCI (Craig Topper, 2017-12-02, 1 file, -37/+29)
| llvm-svn: 319624
* CodeGen: Fix pointer info in SplitVecOp_EXTRACT_VECTOR_ELT/SplitVecRes_INSERT_VECTOR_ELT (Yaxun Liu, 2017-12-02, 4 files, -31/+33)
| Two issues were found when doing codegen for splitting a vector with a non-zero alloca addr space:
|
| DAGTypeLegalizer::SplitVecRes_INSERT_VECTOR_ELT/SplitVecOp_EXTRACT_VECTOR_ELT use dummy pointer info for creating SDStore. Since one pointer operand contains multiply and add, InferPointerInfo is unable to infer the correct pointer info, which ends up with a dummy pointer info for the target to lower the store and results in isel failure. The fix is to introduce MachinePointerInfo::getUnknownStack to represent a MachinePointerInfo which is known to be in the alloca address space but carries no other information.
|
| TargetLowering::getVectorElementPointer uses the value type of a pointer in addr space 0 for the multiplication of the index and then adds it to the pointer. However, the pointer may be in an addr space which has a different size than addr space 0. The fix is to use the pointer value type for the index multiplication.
|
| Differential Revision: https://reviews.llvm.org/D39758
| llvm-svn: 319622
* [X86][SSE] Cleanup float/int conversion scheduler itinerary classes (Simon Pilgrim, 2017-12-02, 1 file, -41/+79)
| Makes it easier to grok where each is supposed to be used, mainly useful for adding to the AVX512 instructions but hopefully can be used more in SSE/AVX as well.
| llvm-svn: 319614
* [X86] Teach the assembler to support %db8-%db15 as aliases for %dr8-%dr15. (Craig Topper, 2017-12-02, 1 file, -13/+25)
| llvm-svn: 319612
* [X86] Support %dr8-%dr15 in the assembler. (Craig Topper, 2017-12-02, 1 file, -1/+1)
| Apparently I failed to make this work when I fixed it in the disassembler way back in r224862.
| llvm-svn: 319611
* [ARC] Add instruction subset for the ARC backend. (Tatyana Krasnukha, 2017-12-02, 4 files, -159/+990)
| Reviewers: petecoup, kparzysz
| Reviewed By: petecoup
| Subscribers: llvm-commits
| Differential Revision: https://reviews.llvm.org/D37983
| llvm-svn: 319609
* [DAG][AArch64] Disable post-legalization store (Nirav Dave, 2017-12-02, 1 file, -0/+3)
| Disable post-legalization store for the AArch64 backend, which is causing errors out-of-tree.
| llvm-svn: 319607
* [WebAssembly] Revert r319488 "Add visibility flag to Wasm symbol flags" (Heejin Ahn, 2017-12-02, 3 files, -13/+4)
| This patch reportedly broke one of the LLVM bots (ubuntu-gcc7.1-werror). See http://lab.llvm.org:8011/builders/ubuntu-gcc7.1-werror/builds/3369 for details.
| llvm-svn: 319602
* Revert "[X86] Improvement in CodeGen instruction selection for LEAs." (Matt Morehouse, 2017-12-01, 3 files, -574/+11)
| This reverts r319543, due to ASan bot breakage.
| llvm-svn: 319591
* [MachineOutliner] NFC: Throw out self-intersections on candidates early (Jessica Paquette, 2017-12-01, 1 file, -11/+42)
| Currently, the outliner considers candidates that intersect with themselves in the candidate pruning step. That is, candidates of the form "AA" in ranges like "AAAAAA". In that range, it looks like there are 5 instances of "AA" that could possibly be outlined, and that's considered in the benefit calculation.
|
| However, only at most 3 instances of "AA" could ever be outlined in "AAAAAA". Thus, it's possible to pass through "AA" to the candidate selection step even though it's *never* the case that "AA" could be outlined. This makes it so that when we find candidates, we consider only non-overlapping occurrences of that candidate.
| llvm-svn: 319588
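The counts quoted in the MachineOutliner entry above (5 overlapping versus 3 non-overlapping instances of "AA" in "AAAAAA") can be reproduced with a short string-based sketch. The two functions are hypothetical illustrations that work on plain strings with a greedy left-to-right scan; the outliner itself operates on instruction sequences and has its own machinery.

```cpp
#include <cstddef>
#include <string>

// Count every occurrence of `pat` in `text`, including occurrences that
// overlap each other.
std::size_t countAllOccurrences(const std::string &text,
                                const std::string &pat) {
  if (pat.empty())
    return 0;
  std::size_t count = 0;
  for (std::size_t pos = text.find(pat); pos != std::string::npos;
       pos = text.find(pat, pos + 1))
    ++count;
  return count;
}

// Count occurrences greedily from left to right, resuming the search
// after the end of each match so that counted matches never overlap.
std::size_t countNonOverlapping(const std::string &text,
                                const std::string &pat) {
  if (pat.empty())
    return 0;
  std::size_t count = 0;
  for (std::size_t pos = text.find(pat); pos != std::string::npos;
       pos = text.find(pat, pos + pat.size()))
    ++count;
  return count;
}
```

Calling both on `"AAAAAA"` with pattern `"AA"` gives 5 and 3 respectively, matching the numbers in the commit message.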
* [DAG][ARM] Revert "Reenable post-legalize store merge" (Nirav Dave, 2017-12-01, 2 files, -11/+8)
| due to failures in AArch and ARM code gen.
| llvm-svn: 319587
* [MC] Handle unknown literal register numbers in .cfi_* directives (Jake Ehrlich, 2017-12-01, 3 files, -10/+43)
| r230670 introduced a step to map EH register numbers to standard DWARF register numbers. This failed to consider the case when a user .cfi_* directive uses an integer literal rather than a register name, to specify a DWARF register number that has no corresponding LLVM register number (e.g. a special register that the compiler and assembler have no name for).
|
| Fixes PR34028.
|
| Patch by Roland McGrath
|
| Differential Revision: https://reviews.llvm.org/D36493
| llvm-svn: 319586
* [IndVars] Fix a bug introduced in r317012 (Philip Reames, 2017-12-01, 1 file, -3/+13)
| Turns out we can have comparisons which are indirect users of the induction variable that we can make invariant. In this case, there is no loop invariant value contributing and we'd fail an assert.
|
| The test case was found by a java fuzzer and reduced. It's a real corner case. You have to have a static loop which we've already proven only executes once, but haven't broken the backedge on, and an inner phi whose result can be constant folded by SCEV using exit count reasoning but not proven by isKnownPredicate. To my knowledge, only the fuzzer has hit this case.
| llvm-svn: 319583
* [opt-remarks] If hotness threshold is set, ignore remarks without hotness (Adam Nemet, 2017-12-01, 2 files, -9/+7)
| These are blocks that have not been executed during training. For large projects this could make a significant difference. For the project I was looking at, I got an order of magnitude decrease in the size of the total YAML files with this and r319235.
|
| Differential Revision: https://reviews.llvm.org/D40678
|
| Re-commit after fixing the failing testcase in rL319576, rL319577 and rL319578.
| llvm-svn: 319581
* [DAGCombine] Simplify ISD::AND handling in ReduceLoadWidth (Eli Friedman, 2017-12-01, 1 file, -20/+5)
| Followup to D39595. Removes a bunch of redundant checks.
|
| Differential Revision: https://reviews.llvm.org/D40667
| llvm-svn: 319573
* [X86][AVX512] Tag subvector extract/insert instructions scheduler classes (Simon Pilgrim, 2017-12-01, 1 file, -32/+65)
| llvm-svn: 319568
* [IR] Avoid dangling else warning. NFC. (Benjamin Kramer, 2017-12-01, 1 file, -1/+2)
| llvm-svn: 319567
* IR printing improvement for loop passes - handle -print-module-scope (Fedor Sergeev, 2017-12-01, 1 file, -0/+12)
| Summary:
| Adding support for -print-module-scope similar to how it is being done for function passes. This option causes the loop-pass printer to emit whole-module IR instead of just the loop itself.
|
| Reviewers: sanjoy, silvas, weimingz
| Reviewed By: sanjoy
| Subscribers: apilipenko, skatkov, llvm-commits
| Differential Revision: https://reviews.llvm.org/D40247
| llvm-svn: 319566
* [DebugInfo] Bail out if making no progress dumping line tables. (Paul Robinson, 2017-12-01, 1 file, -0/+4)
| llvm-svn: 319564
* Revert "[opt-remarks] If hotness threshold is set, ignore remarks without hotness" (Adam Nemet, 2017-12-01, 2 files, -7/+9)
| This reverts commit r319556.
|
| Something is not working with this when used with sample-based profiling. Investigating...
| llvm-svn: 319562
* IR printing improvement for function passes - introducing -print-module-scope (Fedor Sergeev, 2017-12-01, 2 files, -1/+12)
| Summary:
| When debugging function passes, it is often useful to dump the whole module before the transformation and then use this dump to analyze this single transformation by running it separately on that particular module state.
|
| Introducing the -print-module-scope debugging option, which forces all the function-level IR dumps to become whole-module dumps. This option builds on top of the normal dumping controls like
|   -print-before/after
|   -filter-print-funcs
|
| The plan is to eventually extend this option to cover other local passes (at least loop passes), but that should go in as a separate change.
|
| Reviewers: sanjoy, weimingz, silvas, fedor.sergeev
| Reviewed By: weimingz
| Subscribers: apilipenko, skatkov, llvm-commits, mehdi_amini
| Differential Revision: https://reviews.llvm.org/D40245
| llvm-svn: 319561
* Fix line endings. NFCI. (Simon Pilgrim, 2017-12-01, 1 file, -10/+10)
| llvm-svn: 319559
* [X86][AVX512] Tag VPERM2I/VPERM2T instructions scheduler class (Simon Pilgrim, 2017-12-01, 1 file, -48/+64)
| llvm-svn: 319558
* [opt-remarks] If hotness threshold is set, ignore remarks without hotness (Adam Nemet, 2017-12-01, 2 files, -9/+7)
| These are blocks that have not been executed during training. For large projects this could make a significant difference. For the project I was looking at, I got an order of magnitude decrease in the size of the total YAML files with this and r319235.
|
| Differential Revision: https://reviews.llvm.org/D40678
| llvm-svn: 319556
* [X86][AVX512] Tag VFPCLASS instructions scheduler class (Simon Pilgrim, 2017-12-01, 1 file, -26/+43)
| llvm-svn: 319554
* [X86][AVX512] Tag VPSHUFBITQMB instructions scheduler class (Simon Pilgrim, 2017-12-01, 1 file, -9/+12)
| llvm-svn: 319553
* [X86][AVX512] Tag VPCOMPRESS/VPEXPAND instructions scheduler classes (Simon Pilgrim, 2017-12-01, 1 file, -39/+55)
| llvm-svn: 319551
* Revert r319531 "[SLPVectorizer] Failure to beneficially vectorize 'copyable' elements in integer binary ops." (Hans Wennborg, 2017-12-01, 1 file, -343/+143)
| It causes builds to fail with "Instruction does not dominate all uses" (PR35497).
|
| > Patch tries to improve vectorization of the following code:
| >
| > void add1(int * __restrict dst, const int * __restrict src) {
| >   *dst++ = *src++;
| >   *dst++ = *src++ + 1;
| >   *dst++ = *src++ + 2;
| >   *dst++ = *src++ + 3;
| > }
| > Allows to vectorize even if the very first operation is not a binary add, but just a load.
| >
| > Fixed issues related to previous commit.
| >
| > Reviewers: spatel, mzolotukhin, mkuper, hfinkel, RKSimon, filcab, ABataev
| >
| > Reviewed By: ABataev, RKSimon
| >
| > Subscribers: llvm-commits, RKSimon
| >
| > Differential Revision: https://reviews.llvm.org/D28907
|
| llvm-svn: 319550
* [ARM][DAG] Reenable post-legalize store merge (Nirav Dave, 2017-12-01, 2 files, -8/+11)
| Summary: Reenable post-legalize stores with constant merging computation and corresponding test case.
|
| Reviewers: eastig, efriedma
| Subscribers: aemerson, javed.absar, kristof.beyls, hiraditya, llvm-commits
| Differential Revision: https://reviews.llvm.org/D40701
| llvm-svn: 319547
* [X86] Improvement in CodeGen instruction selection for LEAs. (Jatin Bhateja, 2017-12-01, 3 files, -11/+574)
| Summary:
| 1/ Operand folding during complex pattern matching for LEAs has been extended, such that it promotes Scale to accommodate similar operands appearing in the DAG, e.g.
|      T1 = A + B
|      T2 = T1 + 10
|      T3 = T2 + A
|    For the above DAG rooted at T3, X86AddressMode will now look like
|      Base = B, Index = A, Scale = 2, Disp = 10
|
| 2/ During OptimizeLEAPass down the pipeline, factorization is now performed over LEAs so that, if there is an opportunity, complex LEAs (having 3 operands) can be factored out, e.g.
|      leal 1(%rax,%rcx,1), %rdx
|      leal 1(%rax,%rcx,2), %rcx
|    will be factored as follows
|      leal 1(%rax,%rcx,1), %rdx
|      leal (%rdx,%rcx), %edx
|
| 3/ Aggressive operand folding for AM-based selection for LEAs is sensitive to loops, thus avoiding creation of any complex LEAs within a loop.
|
| 4/ Simplify LEA converts (lea (BASE,1,INDEX,0)) --> (add (BASE, INDEX)), which offers better throughput.
|
| PR32755 will be taken care of by this patch.
|
| Previous patch revisions: r313343, r314886
|
| Reviewers: lsaba, RKSimon, craig.topper, qcolombet, jmolloy, jbhateja
| Reviewed By: lsaba, RKSimon, jbhateja
| Subscribers: jmolloy, spatel, igorb, llvm-commits
| Differential Revision: https://reviews.llvm.org/D35014
| llvm-svn: 319543
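Item 1/ of the LEA entry above claims that the chain T1 = A + B, T2 = T1 + 10, T3 = T2 + A can be expressed as a single address with Base = B, Index = A, Scale = 2, Disp = 10. The sketch below checks that arithmetic on plain integers. The `AddrMode` struct is a hypothetical stand-in that holds values rather than registers; it is not LLVM's X86AddressMode.

```cpp
#include <cstdint>

// Hypothetical, value-based stand-in for an x86 addressing mode:
// address = base + index * scale + disp.
struct AddrMode {
  uint64_t base;
  uint64_t index;
  uint64_t scale;
  uint64_t disp;
};

uint64_t evaluate(const AddrMode &am) {
  return am.base + am.index * am.scale + am.disp;
}

// The DAG from the commit message, evaluated step by step.
uint64_t unfolded(uint64_t a, uint64_t b) {
  uint64_t t1 = a + b;
  uint64_t t2 = t1 + 10;
  uint64_t t3 = t2 + a;
  return t3;
}

// The folded form: A appears twice, so it becomes the index with Scale = 2.
AddrMode folded(uint64_t a, uint64_t b) {
  return AddrMode{b, a, 2, 10};
}
```

For A = 3 and B = 7, both `unfolded(3, 7)` and `evaluate(folded(3, 7))` give 23.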
* [X86][AVX512] Tag vshift/vpermv/pshufd/pshufb instructions scheduler classes (Simon Pilgrim, 2017-12-01, 2 files, -120/+158)
| llvm-svn: 319540