path: root/llvm/lib/Transforms
Commit message | Author | Age | Files | Lines
...
* [CVP] prevent propagating poison when substituting edge values into a phi ↵Sanjay Patel2019-10-281-1/+8
| | | | | | | | | | | | | | | | | (PR43802) This phi simplification transform was added with: D45448 However as shown in PR43802: https://bugs.llvm.org/show_bug.cgi?id=43802 ...we must be careful not to propagate poison when we do the substitution. There might be some more complicated analysis possible to retain the overflow flag, but it should always be safe and easy to drop flags (we have similar behavior in instcombine and other passes). Differential Revision: https://reviews.llvm.org/D69442
* Use isConvergent helper instead of directly checking attributeMatt Arsenault2019-10-272-2/+2
|
* [Alignment][NFC] Convert AllocaInst to MaybeAlignGuillaume Chatelet2019-10-253-18/+18
    Summary:
    This patch is part of a series to introduce an Alignment type.
    See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
    See this patch for the introduction of the type: https://reviews.llvm.org/D64790

    Reviewers: courbet
    Reviewed By: courbet
    Subscribers: hiraditya, llvm-commits
    Tags: #llvm
    Differential Revision: https://reviews.llvm.org/D69301
* [LLD][ThinLTO] Handle GUID collision in import global processingTeresa Johnson2019-10-251-5/+11
    Summary:
    If there is a GUID collision between two globals, checking the summary list from the import index to make assumptions can be dangerous. Do not assume that a GlobalValue that has a GlobalVarSummary actually is a GlobalVariable, as the summary may be connected to another GlobalValue with the same GUID.

    Patch by Joel Klinghed (the_jk@opera.com)

    Reviewers: evgeny777, tejohnson
    Reviewed By: tejohnson
    Subscribers: tejohnson, dblaikie, MaskRay, mehdi_amini, inglorion, hiraditya, steven_wu, dexonsmith, llvm-commits
    Tags: #llvm
    Differential Revision: https://reviews.llvm.org/D67322
* [Alignment][NFC] getMemoryOpCost uses MaybeAlignGuillaume Chatelet2019-10-252-11/+7
    Summary:
    This patch is part of a series to introduce an Alignment type.
    See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
    See this patch for the introduction of the type: https://reviews.llvm.org/D64790

    Reviewers: courbet
    Subscribers: nemanjai, hiraditya, kbarton, MaskRay, jsji, llvm-commits
    Tags: #llvm
    Differential Revision: https://reviews.llvm.org/D69307
* [SLP] adjust code comment; NFCSanjay Patel2019-10-251-1/+1
| | | | (check commit access)
* [SCEV] Expose and use maximum constant exit counts for individual loop exitsPhilip Reames2019-10-241-3/+3
    We were already going to all of the trouble of computing maximum constant exit counts for each loop exit, so we might as well expose them through the API.

    The change in IndVars is mostly to demonstrate that the wired-up code works, but it also very slightly strengthens the transform. The strengthened case is rather narrow though: it requires one exactly analyzable exit, one imprecisely analyzable exit (with the upper bound less than the precise one), and one unanalyzable exit. I couldn't construct a reasonably stable test case.

    This does increase the memory usage of the BackedgeTakenCount by a factor of 2 in the worst case.

    I also noticed the loop in IndVars is O(#Exits ^ 2). This doesn't change with this patch. A future patch will cache this result inside of SCEV to avoid requerying.
* Test commit access via gitPhilip Reames2019-10-241-1/+0
|
* [InstCombine] Fold one-use variable into assertBenjamin Kramer2019-10-241-2/+1
| | | | Avoids warnings in Release builds. NFC.
* [InstCombine] Known-bits optimization for ARM MVE VADC.Simon Tatham2019-10-241-0/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The MVE VADC instruction reads and writes the carry bit at bit 29 of the FPSCR register. The corresponding ACLE intrinsic is specified to work with an integer in which the carry bit is stored at bit 0. So if a user writes a code sequence in C that passes the carry from one VADC to the next, like this, s0 = vadcq_u32(a0, b0, &carry); s1 = vadcq_u32(a1, b1, &carry); then clang will generate IR for each of those operations that shifts the carry bit up into bit 29 before the VADC, and after it, shifts it back down and masks off all but the low bit. But in this situation what you really wanted was two consecutive VADC instructions, so that the second one directly reads the value left in FPSCR by the first, without wasting several instructions on pointlessly clearing the other flag bits in between. This commit explains to InstCombine that the other bits of the flags operand don't matter, and adds a test that demonstrates that all the code between the two VADC instructions can be optimized away as a result. Reviewers: dmgreen, miyuki, ostannard Subscribers: kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67162
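The shift-and-mask round trip described in this commit message can be modelled in a few lines of plain C++. This is only a sketch under stated assumptions: the function names are hypothetical, the "consumer" is a toy stand-in for the second VADC, and nothing here is the actual InstCombine code. It shows why the intermediate masking is redundant when the consumer looks at bit 29 only.

```cpp
#include <cassert>
#include <cstdint>

// Bit position of the carry flag in FPSCR, as stated in the commit message.
const unsigned CarryBit = 29;

// Hypothetical model of the "shift up" step: ACLE carry (bit 0) to flags word.
uint32_t carryToFlags(uint32_t Carry) {
  assert(Carry <= 1 && "ACLE carry is a single bit");
  return Carry << CarryBit;
}

// Hypothetical model of the "shift down and mask" step: flags word to carry.
uint32_t flagsToCarry(uint32_t Flags) { return (Flags >> CarryBit) & 1u; }

// The round trip emitted between two consecutive VADC operations.
uint32_t roundTrip(uint32_t Flags) { return carryToFlags(flagsToCarry(Flags)); }

// Toy consumer standing in for the second VADC: it inspects bit 29 only.
bool consumesCarry(uint32_t Flags) { return ((Flags >> CarryBit) & 1u) != 0; }
```

Because `consumesCarry(roundTrip(F))` equals `consumesCarry(F)` for every flags word `F`, a known-bits style analysis that knows only bit 29 is demanded can drop the round trip.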
* [InstCombine] Signed saturation patternsDavid Green2019-10-222-0/+68
| | | | | | | | | | | | This adds an instcombine matcher for code that attempts to perform signed saturating arithmetic by casting to a higher type. Unsigned cases are already matched, this adds extra matches for the more complex signed cases, which involves matching the min(max(add a b)) nodes with proper extends to ensure legality. Differential Revision: https://reviews.llvm.org/D68651 llvm-svn: 375505
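As an illustration of the kind of source pattern this matcher targets, here is a minimal sketch of a signed saturating add written by widening, adding, and clamping with min(max(...)). The function name and the choice of 16-bit operands are assumptions made for the example; the commit itself works on IR, not on C++.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <limits>

// Signed saturating add on 16-bit values, computed in a wider type.
// Shape: trunc(min(max(sext(A) + sext(B), INT16_MIN), INT16_MAX)).
int16_t saturatingAdd16(int16_t A, int16_t B) {
  const int32_t Min = std::numeric_limits<int16_t>::min();
  const int32_t Max = std::numeric_limits<int16_t>::max();
  // The wide add cannot overflow: both inputs fit in 16 bits.
  int32_t Wide = static_cast<int32_t>(A) + static_cast<int32_t>(B);
  int32_t Clamped = std::min(std::max(Wide, Min), Max);
  assert(Clamped >= Min && Clamped <= Max);
  return static_cast<int16_t>(Clamped);
}
```

The extends matter for legality: the clamp is only equivalent to a saturating add if the wide addition itself cannot wrap.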
* [ThinLTO] Add code comment. NFCEugene Leviant2019-10-221-0/+3
| | | | llvm-svn: 375500
* [Alignment][NFC] Convert StoreInst to MaybeAlignGuillaume Chatelet2019-10-224-9/+11
    Summary:
    This patch is part of a series to introduce an Alignment type.
    See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
    See this patch for the introduction of the type: https://reviews.llvm.org/D64790

    Reviewers: courbet
    Subscribers: hiraditya, jfb, llvm-commits
    Tags: #llvm
    Differential Revision: https://reviews.llvm.org/D69303
    llvm-svn: 375499
* [Alignment][NFC] Convert LoadInst to MaybeAlignGuillaume Chatelet2019-10-226-15/+16
    Summary:
    This patch is part of a series to introduce an Alignment type.
    See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
    See this patch for the introduction of the type: https://reviews.llvm.org/D64790

    Reviewers: courbet
    Subscribers: hiraditya, jfb, llvm-commits
    Tags: #llvm
    Differential Revision: https://reviews.llvm.org/D69302
    llvm-svn: 375498
* [ThinLTO] Don't internalize during promotionEugene Leviant2019-10-221-0/+1
| | | | | | Differential revision: https://reviews.llvm.org/D69107 llvm-svn: 375493
* [CVP] No-wrap deduction for `shl`Roman Lebedev2019-10-211-0/+9
    Summary:
    This is the last `OverflowingBinaryOperator` for which we don't deduce flags.
    D69217 taught `ConstantRange::makeGuaranteedNoWrapRegion()` about it.

    The effect is better than that of the `mul` patch (D69203):

    | statistic                              | old     | new     | delta | % change |
    | correlated-value-propagation.NumAddNUW | 7145    | 7144    | -1    | -0.0140% |
    | correlated-value-propagation.NumAddNW  | 12126   | 12125   | -1    | -0.0082% |
    | correlated-value-propagation.NumAnd    | 443     | 446     | 3     | 0.6772%  |
    | correlated-value-propagation.NumNSW    | 5986    | 7158    | 1172  | 19.5790% |
    | correlated-value-propagation.NumNUW    | 10512   | 13304   | 2792  | 26.5601% |
    | correlated-value-propagation.NumNW     | 16498   | 20462   | 3964  | 24.0272% |
    | correlated-value-propagation.NumShlNSW | 0       | 1172    | 1172  |          |
    | correlated-value-propagation.NumShlNUW | 0       | 2793    | 2793  |          |
    | correlated-value-propagation.NumShlNW  | 0       | 3965    | 3965  |          |
    | instcount.NumAShrInst                  | 13824   | 13790   | -34   | -0.2459% |
    | instcount.NumAddInst                   | 277584  | 277586  | 2     | 0.0007%  |
    | instcount.NumAndInst                   | 66061   | 66056   | -5    | -0.0076% |
    | instcount.NumBrInst                    | 709153  | 709147  | -6    | -0.0008% |
    | instcount.NumICmpInst                  | 483709  | 483708  | -1    | -0.0002% |
    | instcount.NumSExtInst                  | 79497   | 79496   | -1    | -0.0013% |
    | instcount.NumShlInst                   | 40691   | 40654   | -37   | -0.0909% |
    | instcount.NumSubInst                   | 61997   | 61996   | -1    | -0.0016% |
    | instcount.NumZExtInst                  | 68208   | 68211   | 3     | 0.0044%  |
    | instcount.TotalBlocks                  | 843916  | 843910  | -6    | -0.0007% |
    | instcount.TotalInsts                   | 7387528 | 7387448 | -80   | -0.0011% |

    Reviewers: nikic, reames, sanjoy, timshen
    Reviewed By: nikic
    Subscribers: hiraditya, llvm-commits
    Tags: #llvm
    Differential Revision: https://reviews.llvm.org/D69277
    llvm-svn: 375455
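To make the idea of range-based no-wrap deduction for `shl` concrete, the following is a hand-rolled sketch for the unsigned (`nuw`) case on 8-bit values. It does not use `ConstantRange` or any LLVM API; the function names and the simplification to ranges of the form [0, max] are assumptions of the example. The signed (`nsw`) case is not covered.

```cpp
#include <cassert>
#include <cstdint>

// Range-based check: for every X in [0, XMax] and S in [0, SMax],
// does the 8-bit shift X << S keep all set bits (no unsigned wrap)?
bool shlIsNUW(uint8_t XMax, uint8_t SMax) {
  if (SMax >= 8)
    return false; // a shift by the bit width or more is never safe
  // X << S is monotonic in both operands, so checking the maxima suffices.
  return XMax <= (0xFFu >> SMax);
}

// Brute-force reference that enumerates every pair in the two ranges.
bool shlIsNUWBruteForce(uint8_t XMax, uint8_t SMax) {
  for (unsigned S = 0; S <= SMax; ++S)
    for (unsigned X = 0; X <= XMax; ++X)
      if (S >= 8 || ((X << S) >> 8) != 0)
        return false;
  return true;
}
```

For instance, with X at most 15 and S at most 4 the largest result is 240, which fits in 8 bits, so the flag could be set; raising the bound on X to 16 makes 256 reachable and the deduction fails.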
* Prune Pass.h include from DataLayout.h. NFCIBjorn Pettersson2019-10-211-0/+1
| | | | | | | | | | | | | | | | | | | Summary: Reduce include dependencies by no longer including Pass.h from DataLayout.h. That include seemed irrelevant to DataLayout, as well as being irrelevant to several users of DataLayout. Reviewers: rnk Reviewed By: rnk Subscribers: mehdi_amini, hiraditya, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D69261 llvm-svn: 375436
* GVNHoist - silence static analyzer dyn_cast<> null dereference warning in ↵Simon Pilgrim2019-10-211-1/+1
| | | | | | | | hasEHOrLoadsOnPath call. NFCI. The static analyzer is warning about a potential null dereference, but we should be able to use cast<> directly and if not assert will fire for us. llvm-svn: 375429
* GuardWidening - silence static analyzer null dereference warning with ↵Simon Pilgrim2019-10-211-1/+1
| | | | | | assertion. NFCI. llvm-svn: 375428
* CrossDSOCFI - silence static analyzer dyn_cast<> null dereference warning. NFCI.Simon Pilgrim2019-10-211-1/+1
| | | | | | The static analyzer is warning about a potential null dereference, but we should be able to use cast<> directly and if not assert will fire for us. llvm-svn: 375427
* IndVarSimplify - silence static analyzer dyn_cast<> null dereference ↵Simon Pilgrim2019-10-211-2/+2
| | | | | | | | warning. NFCI. The static analyzer is warning about a potential null dereference, but we should be able to use cast<> directly and if not assert will fire for us. llvm-svn: 375426
* [Alignment][NFC] Finish transition for `Loads`Guillaume Chatelet2019-10-216-20/+22
    Summary:
    This patch is part of a series to introduce an Alignment type.
    See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
    See this patch for the introduction of the type: https://reviews.llvm.org/D64790

    Reviewers: courbet
    Subscribers: hiraditya, asbirlea, llvm-commits
    Tags: #llvm
    Differential Revision: https://reviews.llvm.org/D69253
    llvm-svn: 375419
* [Alignment][NFC] Instructions::getLoadStoreAlignmentGuillaume Chatelet2019-10-212-41/+47
    Summary:
    This patch is part of a series to introduce an Alignment type.
    See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
    See this patch for the introduction of the type: https://reviews.llvm.org/D64790

    Subscribers: hiraditya, llvm-commits
    Tags: #llvm
    Differential Revision: https://reviews.llvm.org/D69256
    llvm-svn: 375416
* [MemCpyOpt] Fixing Incorrect Code Motion while Handling Aggregate Type ValuesSam Elliott2019-10-211-2/+6
    Summary:
    When MemCpyOpt is handling aggregate type values, if an instruction (let's call it P) between the targeting load (L) and store (S) clobbers the source pointer of L, it will try to hoist S before P. This process will also hoist S's data dependency instructions.

    However, the current implementation has a bug: if one of S's dependency instructions is //also// a user of P, MemCpyOpt will not prevent it from being hoisted above P, which causes a use-before-define error. For example, in the newly added test file (i.e. `aggregate-type-crash.ll`), it will try to hoist both `store %my_struct %1, %my_struct* %3` and its dependent, `%3 = bitcast i8* %2 to %my_struct*`, above `%2 = call i8* @my_malloc(%my_struct* %0)`, creating the following BB:

    ```
    entry:
      %1 = bitcast i8* %4 to %my_struct*
      %2 = bitcast %my_struct* %1 to i8*
      %3 = bitcast %my_struct* %0 to i8*
      call void @llvm.memcpy.p0i8.p0i8.i64(i8* align 4 %2, i8* align 4 %3, i64 8, i1 false)
      %4 = call i8* @my_malloc(%my_struct* %0)
      ret void
    ```

    Here there is a use-before-define error between `%1` and `%4`.

    Update: The compiler for the Pony Programming Language [also encountered the same bug](https://github.com/ponylang/ponyc/issues/3140)

    Patch by Min-Yih Hsu (myhsu)

    Reviewers: eugenis, pcc, dblaikie, dneilson, t.p.northover, lattner
    Reviewed By: eugenis
    Subscribers: lenary, hiraditya, llvm-commits
    Tags: #llvm
    Differential Revision: https://reviews.llvm.org/D66060
    llvm-svn: 375403
* [NFC][InstCombine] Fixup commentsRoman Lebedev2019-10-211-2/+2
    As noted in post-commit review of rL375378.

    llvm-svn: 375397
* [CVP] Deduce no-wrap on `mul`Roman Lebedev2019-10-211-0/+1
    Summary:
    `ConstantRange::makeGuaranteedNoWrapRegion()` has known how to deal with `mul` since rL335646, and there is exhaustive test coverage.
    This is already used by CVP's `processOverflowIntrinsic()`, and by SCEV's `StrengthenNoWrapFlags()`.

    That being said, currently, this doesn't help much in the end:

    | statistic                              | old     | new     | delta | percentage |
    | correlated-value-propagation.NumMulNSW | 4       | 275     | 271   | 6775.00%   |
    | correlated-value-propagation.NumMulNUW | 4       | 1323    | 1319  | 32975.00%  |
    | correlated-value-propagation.NumMulNW  | 8       | 1598    | 1590  | 19875.00%  |
    | correlated-value-propagation.NumNSW    | 5715    | 5986    | 271   | 4.74%      |
    | correlated-value-propagation.NumNUW    | 9193    | 10512   | 1319  | 14.35%     |
    | correlated-value-propagation.NumNW     | 14908   | 16498   | 1590  | 10.67%     |
    | instcount.NumAddInst                   | 275871  | 275869  | -2    | 0.00%      |
    | instcount.NumBrInst                    | 708234  | 708232  | -2    | 0.00%      |
    | instcount.NumMulInst                   | 43812   | 43810   | -2    | 0.00%      |
    | instcount.NumPHIInst                   | 316786  | 316784  | -2    | 0.00%      |
    | instcount.NumTruncInst                 | 62165   | 62167   | 2     | 0.00%      |
    | instcount.NumUDivInst                  | 2528    | 2526    | -2    | -0.08%     |
    | instcount.TotalBlocks                  | 842995  | 842993  | -2    | 0.00%      |
    | instcount.TotalInsts                   | 7376486 | 7376478 | -8    | 0.00%      |

    (^ test-suite plain, tests still pass)

    Reviewers: nikic, reames, luqmana, sanjoy, timshen
    Reviewed By: reames
    Subscribers: hiraditya, javed.absar, llvm-commits
    Tags: #llvm
    Differential Revision: https://reviews.llvm.org/D69203
    llvm-svn: 375396
* [InstCombine] Allow values with multiple users in SimplifyDemandedVectorEltsPiotr Sobczak2019-10-213-27/+115
| | | | | | | | | | | | | | | | | | | | | | | | Summary: Allow for ignoring the check for a single use in SimplifyDemandedVectorElts to be able to simplify operands if DemandedElts is known to contain the union of elements used by all users. It is a responsibility of a caller of SimplifyDemandedVectorElts to supply correct DemandedElts. Simplify a series of extractelement instructions if only a subset of elements is used. Reviewers: reames, arsenm, majnemer, nhaehnle Reviewed By: nhaehnle Subscribers: wdng, jvesely, nhaehnle, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67345 llvm-svn: 375395
* [Attributor][FIX] Silence sign-compare warningJohannes Doerfert2019-10-211-1/+1
| | | | llvm-svn: 375384
* [Attributor] Teach AANoCapture to use information in-flight more aggressivelyJohannes Doerfert2019-10-211-8/+63
    AAReturnedValues, AAMemoryBehavior, and AANoUnwind can provide information that helps during the tracking or even justifies no-capture. We now use this information and enable no-capture in some test cases designed a long while ago for these cases.

    llvm-svn: 375382
* [IndVars] Add a todo to reflect a further opportunity identified in D69009Philip Reames2019-10-201-0/+7
    Nikita pointed out an opportunity, might as well document it in the code.

    llvm-svn: 375380
* [IndVars] Eliminate loop exits with equivalent exit countsPhilip Reames2019-10-201-4/+28
    We can end up with two loop exits whose exit counts are equivalent, but whose textual representation is different and non-obvious. For the sub-case where we have a series of exits which dominate one another (common), eliminate any exits which would iterate *after* a previous exit on the exiting iteration.

    As noted in the TODO being removed, I'd always thought this was a good idea, but I've now seen this in a real workload as well.

    Interestingly, in review, Nikita pointed out there's yet another opportunity to leverage SCEV's reasoning. If we kept track of the min of dominating exits so far, we could discharge exits with EC >= MDE. This is less powerful than the existing transform (since later exits aren't considered), but potentially more powerful for any case where SCEV can prove a >= b, but neither a == b nor a > b. I don't have an example to illustrate that opportunity, but won't be surprised if we find one and return to handle that case as well.

    Differential Revision: https://reviews.llvm.org/D69009
    llvm-svn: 375379
* [InstCombine] conditional sign-extend of high-bit-extract: 'or' pattern.Roman Lebedev2019-10-203-18/+23
| | | | | | | | | | | | | | In this pattern, all the "magic" bits that we'd `add` are all high sign bits, and in the value we'd be adding to they are all unset, not unexpectedly, so we can have an `or` there: https://rise4fun.com/Alive/ups It is possible that `haveNoCommonBitsSet()` should be taught about this pattern so that we never have an `add` variant, but the reasoning would need to be recursive (because of that `select`), so i'm not really sure that would be worth it just yet. llvm-svn: 375378
* Reverted r375254 as it has broken some build bots for a long time.Vladimir Vereschaka2019-10-201-53/+15
| | | | llvm-svn: 375375
* [InstCombine] Fold uadd.sat(a, b) == 0 and usub.sat(a, b) == 0Nikita Popov2019-10-201-0/+22
    This adds folds for comparing uadd.sat/usub.sat with zero:

      * uadd.sat(a, b) == 0 => a == 0 && b == 0 => (a | b) == 0
      * usub.sat(a, b) == 0 => a <= b

    And inverted forms for !=.

    Differential Revision: https://reviews.llvm.org/D69224
    llvm-svn: 375374
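The two equivalences in this commit can be checked exhaustively on 8-bit values. The sketch below uses hand-written saturating helpers (hypothetical names, not the LLVM intrinsics themselves) and compares each comparison against its folded form for all 65536 input pairs.

```cpp
#include <cassert>
#include <cstdint>

// Unsigned saturating add on 8 bits: clamps at 255 instead of wrapping.
uint8_t uaddSat(uint8_t A, uint8_t B) {
  unsigned Sum = static_cast<unsigned>(A) + static_cast<unsigned>(B);
  return Sum > 0xFFu ? uint8_t(0xFF) : static_cast<uint8_t>(Sum);
}

// Unsigned saturating subtract on 8 bits: clamps at 0 instead of wrapping.
uint8_t usubSat(uint8_t A, uint8_t B) {
  return A > B ? static_cast<uint8_t>(A - B) : uint8_t(0);
}

// Returns true if both folds hold for every pair of 8-bit inputs.
bool foldsHoldForAllI8() {
  for (unsigned I = 0; I < 256; ++I)
    for (unsigned J = 0; J < 256; ++J) {
      uint8_t A = static_cast<uint8_t>(I), B = static_cast<uint8_t>(J);
      // uadd.sat(a, b) == 0  <=>  (a | b) == 0
      if ((uaddSat(A, B) == 0) != ((A | B) == 0))
        return false;
      // usub.sat(a, b) == 0  <=>  a <= b
      if ((usubSat(A, B) == 0) != (A <= B))
        return false;
    }
  return true;
}
```

The inverted `!=` forms follow by negating both sides of each equivalence.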
* [InstCombine] Shift amount reassociation in shifty sign bit test (PR43595)Roman Lebedev2019-10-203-26/+60
    Summary:
    This problem consists of several parts:

    * Basic sign bit extraction - `trunc? (?shr %x, (bitwidth(x)-1))`. This is trivial, and easy to do, we have a fold for it.
    * Shift amount reassociation - if we have two identical shifts, and we can simplify-add their shift amounts together, then we likely can just perform them as a single shift. But this is finicky, has one-use restrictions, and shift opcodes must be identical.

    But there is a super-pattern where both of these work together to produce a sign bit test from two shifts + comparison. We do indeed already handle this in most cases. But since we get that fold transitively, it has one-use restrictions. And what's worse, in this case the right-shifts aren't required to be identical, and we can't handle that transitively:

    If the total shift amount is bitwidth-1, only a sign bit will remain in the output value. But if we look at this from the perspective of two shifts, we can't fold - we can't possibly know what bit pattern we'd produce via two shifts, it will be *some* kind of a mask produced from the original sign bit, but we just can't tell its shape:
    https://rise4fun.com/Alive/cM0
    https://rise4fun.com/Alive/9IN

    But it will *only* contain sign bit and zeros. So from the perspective of sign bit test, we're good:
    https://rise4fun.com/Alive/FRz
    https://rise4fun.com/Alive/qBU
    Superb!

    So the simplest solution is to extend `reassociateShiftAmtsOfTwoSameDirectionShifts()` to also have a pseudo-analysis mode that will ignore extra uses, and will only check whether a) those are two right shifts and b) they end up with bitwidth(x)-1 shift amount, and return either the original value that we are sign-checking, or null.

    This does not have any functionality change for the existing `reassociateShiftAmtsOfTwoSameDirectionShifts()`.

    All that being said, as discussed in the review, this yet again increases usage of instsimplify in instcombine as utility. Some day that may need to be reevaluated.

    https://bugs.llvm.org/show_bug.cgi?id=43595

    Reviewers: spatel, efriedma, vsk
    Reviewed By: spatel
    Subscribers: xbolva00, hiraditya, llvm-commits
    Tags: #llvm
    Differential Revision: https://reviews.llvm.org/D68930
    llvm-svn: 375371
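The central claim of this commit message, that two right shifts whose amounts sum to bitwidth-1 leave only copies of the sign bit and zeros, can be checked exhaustively on 8-bit values. The sketch below is an illustration under stated assumptions: `lshr` and `ashr` are hand-written models of the IR operations, and `signTestHolds` is a hypothetical helper, not the InstCombine code.

```cpp
#include <cassert>
#include <cstdint>

// Logical right shift on 8 bits: shifts in zeros.
uint8_t lshr(uint8_t X, unsigned S) {
  assert(S < 8 && "shift amount must be below the bit width");
  return static_cast<uint8_t>(X >> S);
}

// Arithmetic right shift on 8 bits: shifts in copies of the sign bit.
uint8_t ashr(uint8_t X, unsigned S) {
  assert(S < 8 && "shift amount must be below the bit width");
  uint8_t Shifted = static_cast<uint8_t>(X >> S);
  if (X & 0x80)
    Shifted |= static_cast<uint8_t>(0xFFu << (8 - S)); // fill the top S bits
  return Shifted;
}

using ShiftFn = uint8_t (*)(uint8_t, unsigned);

// For every split A + B == 7 and every 8-bit X, checks that
// (Second(First(X, A), B) != 0) is the same as "X has its sign bit set".
bool signTestHolds(ShiftFn First, ShiftFn Second) {
  for (unsigned A = 0; A <= 7; ++A)
    for (unsigned X = 0; X < 256; ++X) {
      uint8_t V = Second(First(static_cast<uint8_t>(X), A), 7 - A);
      if ((V != 0) != ((X & 0x80) != 0))
        return false;
    }
  return true;
}
```

The exact value of the shifted result differs between the opcode combinations (a single 1, or a run of ones), which is why the two shifts cannot be merged in general, yet the comparison against zero agrees with the sign test in all four combinations.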
* [SampleFDO] Add profile remapping support for profile on-demand loading usedWei Mi2019-10-181-15/+4
| | | | | | | | | | | | | | | | | | | | by ExtBinary format profile Profile on-demand loading was added for ExtBinary format profile in rL374233, but currently profile on-demand loading doesn't work well with profile remapping. The patch adds the support. Suppose a function in the current module has outline instance in the profile. The function name in the module is different from the name of the outline instance, but remapper knows the two names are equal. When loading profile on-demand, the outline instance has to be loaded with remapper's help. At the same time SampleProfileReaderItaniumRemapper is changed from a proxy of SampleProfileReader to a helper member in SampleProfileReader. Differential Revision: https://reviews.llvm.org/D68901 llvm-svn: 375295
* [CVP] setDeducedOverflowingFlags(): actually inc per-opcode statsRoman Lebedev2019-10-181-4/+4
| | | | | | | This is really embarrassing. Those are pointers, so that offsets the pointers, not the statistics pointed-by the pointer... llvm-svn: 375290
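The class of bug described here, incrementing a pointer instead of the value it points to, is easy to reproduce in isolation. The sketch below uses a plain `uint64_t` counter rather than LLVM's statistic type, and all names are hypothetical; it is meant only to show the difference between `++Ptr` and `++*Ptr`.

```cpp
#include <cassert>
#include <cstdint>

// Buggy: advances the local copy of the pointer; the counter is untouched.
void bumpBuggy(uint64_t *Stat) {
  ++Stat;
  (void)Stat; // silence "unused value" style warnings
}

// Fixed: dereferences first, so the pointed-to counter is incremented.
void bumpFixed(uint64_t *Stat) { ++*Stat; }

// Small drivers that report the counter value after one bump.
uint64_t afterBuggyBump() {
  uint64_t Counter = 0;
  bumpBuggy(&Counter);
  return Counter;
}

uint64_t afterFixedBump() {
  uint64_t Counter = 0;
  bumpFixed(&Counter);
  return Counter;
}
```

Both versions compile without error, which is what makes this kind of mistake easy to miss without checking the reported numbers.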
* [CVP] After proving that @llvm.with.overflow()/@llvm.sat() don't overflow, ↵Roman Lebedev2019-10-181-4/+14
    also try to prove other no-wrap

    Summary:
    CVP, unlike InstCombine, does not run till exhaustion. It only does a single pass.

    When dealing with those special binops, if we prove that they can safely be demoted into their usual binop form, we do set the no-wrap we deduced. But when dealing with usual binops, we try to deduce both no-wraps.

    So if we convert e.g. @llvm.uadd.with.overflow() to `add nuw`, we won't attempt to check whether it can be `add nuw nsw`.

    This patch proposes to call `processBinOp()` on the newly-created binop, which is identical to what we do for div/rem already.

    Reviewers: nikic, spatel, reames
    Reviewed By: nikic
    Subscribers: hiraditya, llvm-commits
    Tags: #llvm
    Differential Revision: https://reviews.llvm.org/D69183
    llvm-svn: 375273
* [PGO][PGSO] SizeOpts changes.Hiroshi Yamauchi2019-10-181-15/+53
    Summary:
    (Split off of D67120)

    SizeOpts/MachineSizeOpts changes for profile guided size optimization.

    Reviewers: davidxl
    Subscribers: mgorny, hiraditya, llvm-commits
    Tags: #llvm
    Differential Revision: https://reviews.llvm.org/D69070
    llvm-svn: 375254
* [NFC][CVP] Count all the no-wraps we provedRoman Lebedev2019-10-181-20/+74
| | | | | | | | | | | | | | | | | | Summary: It looks like this is the only missing statistic in the CVP pass. Since we prove NSW and NUW separately i'd think we should count them separately too. Reviewers: nikic, spatel, reames Reviewed By: spatel Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68740 llvm-svn: 375230
* [InstCombine] Fix miscompile bug in canEvaluateShuffledBjorn Pettersson2019-10-181-7/+11
    Summary:
    Add restrictions in canEvaluateShuffled to prevent, for example, transforming

      %0 = insertelement <2 x i16> undef, i16 %a, i32 0
      %1 = srem <2 x i16> %0, <i16 2, i16 1>
      %2 = shufflevector <2 x i16> %1, <2 x i16> undef, <2 x i32> <i32 undef, i32 0>

    into

      %1 = insertelement <2 x i16> undef, i16 %a, i32 1
      %2 = srem <2 x i16> %1, <i16 undef, i16 2>

    as having an undef denominator makes the srem undefined (for all vector elements).

    Fixes: https://bugs.llvm.org/show_bug.cgi?id=43689

    Reviewers: spatel, lebedev.ri
    Reviewed By: spatel, lebedev.ri
    Subscribers: lebedev.ri, hiraditya, llvm-commits
    Tags: #llvm
    Differential Revision: https://reviews.llvm.org/D69038
    llvm-svn: 375208
* [IndVars] Factor out some common code into a utility functionPhilip Reames2019-10-171-16/+13
| | | | | | As requested in review of D69009 llvm-svn: 375191
* [NFC][InstCombine] Some more preparatory cleanup for ↵Roman Lebedev2019-10-171-4/+4
| | | | | | dropRedundantMaskingOfLeftShiftInput() llvm-svn: 375153
* [IndVars] Split loop predication out of optimizeLoopExits [NFC]Philip Reames2019-10-171-11/+42
| | | | | | In the process of writing D69009, I realized we have two distinct sets of invariants within this single function, and basically no shared logic. The optimize loop exit transforms (including the new one in D69009) only care about *analyzeable* exits. Loop predication, on the other hand, has to reason about *all* exits. At the moment, we have the property (due to the requirement for an exact btc) that all exits are analyzeable, but that will likely change in the future as we add widenable condition support. llvm-svn: 375138
* [IndVars] Factor out a helper function for readability [NFC]Philip Reames2019-10-171-7/+20
| | | | llvm-svn: 375133
* JumpThreadingPass::UnfoldSelectInstr - silence static analyzer dyn_cast<> ↵Simon Pilgrim2019-10-171-1/+1
| | | | | | | | null dereference warning. NFCI. The static analyzer is warning about a potential null dereference, but we should be able to use cast<> directly and if not assert will fire for us. llvm-svn: 375103
* [LoopIdiom] BCmp: check, not assert that loop exits exit out of the loop ↵Roman Lebedev2019-10-171-7/+8
| | | | | | | | | | | | | | | (PR43687) We can't normally stumble into that assertion because a tautological *conditional* `br` in loop body is required, one that always branches to loop latch. But that should have been always folded to an unconditional branch before we get it. But that is not guaranteed if the pass is run standalone. So let's just promote the assertion into a proper check. Fixes https://bugs.llvm.org/show_bug.cgi?id=43687 llvm-svn: 375100
* Reland: Dead Virtual Function EliminationOliver Stannard2019-10-172-36/+161
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Remove dead virtual functions from vtables with replaceNonMetadataUsesWith, so that CGProfile metadata gets cleaned up correctly. Original commit message: Currently, it is hard for the compiler to remove unused C++ virtual functions, because they are all referenced from vtables, which are referenced by constructors. This means that if the constructor is called from any live code, then we keep every virtual function in the final link, even if there are no call sites which can use it. This patch allows unused virtual functions to be removed during LTO (and regular compilation in limited circumstances) by using type metadata to match virtual function call sites to the vtable slots they might load from. This information can then be used in the global dead code elimination pass instead of the references from vtables to virtual functions, to more accurately determine which functions are reachable. To make this transformation safe, I have changed clang's code-generation to always load virtual function pointers using the llvm.type.checked.load intrinsic, instead of regular load instructions. I originally tried writing this using clang's existing code-generation, which uses the llvm.type.test and llvm.assume intrinsics after doing a normal load. However, it is possible for optimisations to obscure the relationship between the GEP, load and llvm.type.test, causing GlobalDCE to fail to find virtual function call sites. The existing linkage and visibility types don't accurately describe the scope in which a virtual call could be made which uses a given vtable. This is wider than the visibility of the type itself, because a virtual function call could be made using a more-visible base class. I've added a new !vcall_visibility metadata type to represent this, described in TypeMetadata.rst. 
The internalization pass and libLTO have been updated to change this metadata when linking is performed.

This doesn't currently work with ThinLTO, because it needs to see every call to llvm.type.checked.load in the linkage unit. It might be possible to extend this optimisation to be able to use the ThinLTO summary, as was done for devirtualization, but until then that combination is rejected in the clang driver.

To test this, I've written a fuzzer which generates random C++ programs with complex class inheritance graphs, and virtual functions called through object and function pointers of different types. The programs are spread across multiple translation units and DSOs to test the different visibility restrictions.

I've also tried doing bootstrap builds of LLVM to test this. This isn't ideal, because only classes in anonymous namespaces can be optimised with -fvisibility=default, and some parts of LLVM (plugins and bugpoint) do not work correctly with -fvisibility=hidden. However, there are only 12 test failures when building with -fvisibility=hidden (and an unmodified compiler), and this change does not cause any new failures for either value of -fvisibility.

On the 7 C++ sub-benchmarks of SPEC2006, this gives a geomean code-size reduction of ~6%, over a baseline compiled with "-O2 -flto -fvisibility=hidden -fwhole-program-vtables". The best cases are reductions of ~14% in 450.soplex and 483.xalancbmk, and there are no code size increases.

I've also run this on a set of 8 mbed-os examples compiled for Armv7M, which show a geomean size reduction of ~3%, again with no size increases.

I had hoped that this would have no effect on performance, which would allow it to always be enabled (when using -fwhole-program-vtables). However, the changes in clang to use the llvm.type.checked.load intrinsic are causing ~1% performance regression in the C++ parts of SPEC2006.
It should be possible to recover some of this perf loss by teaching optimisations about the llvm.type.checked.load intrinsic, which would make it worth turning this on by default (though it's still dependent on -fwhole-program-vtables). Differential revision: https://reviews.llvm.org/D63932 llvm-svn: 375094
* [ThinLTO] Import virtual method with single implementation in hybrid modeEugene Leviant2019-10-171-34/+43
| | | | | | Differential revision: https://reviews.llvm.org/D68782 llvm-svn: 375083
* [NFC] Fix unused var in release buildsJordan Rupprecht2019-10-161-0/+1
| | | | llvm-svn: 375053