summaryrefslogtreecommitdiffstats
path: root/llvm/test/Transforms
Commit message (Collapse)AuthorAgeFilesLines
* [InstCombine] Refactor test checks (NFC)Evandro Menezes2019-02-011-16/+13
| | | | llvm-svn: 352895
* [InstCombine] Expand Windows test (NFC)Evandro Menezes2019-02-011-15/+44
| | | | | | Add checks for Win64 to existing cases. llvm-svn: 352892
* [InstCombine] Refactor test checks (NFC)Evandro Menezes2019-02-011-42/+24
| | | | llvm-svn: 352886
* [InstCombine] try to reduce x86 addcarry to generic uaddo intrinsicSanjay Patel2019-02-011-10/+12
| | | | | | | | | | | | | | | | | | | If we can reduce the x86-specific intrinsic to the generic op, it allows existing simplifications and value tracking folds. AFAICT, this always results in identical x86 codegen in the non-reduced case...which should be true because we semi-generically (too aggressively IMO) convert to llvm.uadd.with.overflow in CGP, so the DAG/isel must already combine/lower this intrinsic as expected. This isn't quite what was requested in: https://bugs.llvm.org/show_bug.cgi?id=40486 ...but we want to have these kinds of folds early for efficiency and to enable greater simplifications. For the case in the bug report where we have: _addcarry_u64(0, ahi, 0, &ahi) ...this gets completely simplified away in IR. Differential Revision: https://reviews.llvm.org/D57453 llvm-svn: 352870
* Provide reason messages for unviable inliningYevgeny Rouban2019-02-011-0/+10
| | | | | | | | | | | | | InlineCost's isInlineViable() is changed to return InlineResult instead of bool. This provides messages for failure reasons and allows to get more specific messages for cases where callsites are not viable for inlining. Reviewed By: xbolva00, anemet Differential Revision: https://reviews.llvm.org/D57089 llvm-svn: 352849
* Lower widenable_conditions in CGPPhilip Reames2019-01-311-0/+93
| | | | | | | | This ensures that if we make it to the backend w/o lowering widenable_conditions first, that we generate correct code. Doing it in CGP - instead of isel - let's us fold control flow before hitting block local instruction selection. Differential Revision: https://reviews.llvm.org/D57473 llvm-svn: 352779
* Recommit "[ThinLTO] Rename COMDATs for COFF when promoting/renaming COMDAT ↵Teresa Johnson2019-01-312-0/+42
| | | | | | | | leader" Recommit of r352763 with fix for use after free. llvm-svn: 352770
* revert r352766: [PatternMatch] add special-case uaddo matching for ↵Sanjay Patel2019-01-311-25/+20
| | | | | | | | increment-by-one Missed some regression test updates when testing this. llvm-svn: 352769
* Revert "[ThinLTO] Rename COMDATs for COFF when promoting/renaming COMDAT leader"Teresa Johnson2019-01-312-42/+0
| | | | | | | | | | | | This reverts commit r352763. Causing a couple bot failures, root cause pointed to by sanitizer bot: http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fast/builds/28909/steps/annotate/logs/stdio Use after free. I understand the issue but will revert and test with fix before recommitting. llvm-svn: 352768
* [PatternMatch] add special-case uaddo matching for increment-by-oneSanjay Patel2019-01-311-20/+25
| | | | | | | | | | | | | | | | | This is the most important uaddo problem mentioned in PR31754: https://bugs.llvm.org/show_bug.cgi?id=31754 We were failing to match the canonicalized pattern when it's an 'add 1' operation. Pattern matching, however, shouldn't assume that we have canonicalized IR, so we match 4 commuted variants of uaddo. There's also a test with a crazy type to show that the existing CGP transform based on this matcher is not limited by target legality checks, but that's a different problem. Differential Revision: https://reviews.llvm.org/D57516 llvm-svn: 352766
* [ThinLTO] Rename COMDATs for COFF when promoting/renaming COMDAT leaderTeresa Johnson2019-01-312-0/+42
| | | | | | | | | | | | | | | | | | Summary: COFF requires that COMDAT name match that of the leader. When we promote and rename an internal leader in ThinLTO due to an import, ensure we subsequently rename the associated COMDAT. Similar to D31963 which did this during ThinLTO module splitting. Fixes PR40414. Reviewers: pcc, inglorion Subscribers: mehdi_amini, dexonsmith, dmajor, llvm-commits Differential Revision: https://reviews.llvm.org/D57395 llvm-svn: 352763
* [CGP] add more tests for uaddo; NFCSanjay Patel2019-01-311-0/+71
| | | | llvm-svn: 352762
* Default lowering for experimental.widenable.conditionMax Kazantsev2019-01-311-0/+44
| | | | | | | | | | | Introduces a pass that provides default lowering strategy for the `experimental.widenable.condition` intrinsic, replacing all its uses with `i1 true`. Differential Revision: https://reviews.llvm.org/D56096 Reviewed By: reames llvm-svn: 352739
* Commit tests for changes in revision D41940Dmitry Venikov2019-01-312-0/+98
| | | | llvm-svn: 352734
* [InstCombine] Missed optimization in math expression: simplify calls exp ↵Dmitry Venikov2019-01-312-30/+22
| | | | | | | | | | | | | | | | functions Summary: This patch enables folding following expressions under -ffast-math flag: exp(X) * exp(Y) -> exp(X + Y), exp2(X) * exp2(Y) -> exp2(X + Y). Motivation: https://bugs.llvm.org/show_bug.cgi?id=35594 Reviewers: hfinkel, spatel, efriedma, lebedev.ri Reviewed By: spatel, lebedev.ri Subscribers: lebedev.ri, llvm-commits Differential Revision: https://reviews.llvm.org/D41342 llvm-svn: 352730
* [SCEV] Prohibit SCEV transformations for huge SCEVsMax Kazantsev2019-01-311-1/+1
| | | | | | | | | | | | | | | | | | | | | | Currently SCEV attempts to limit transformations so that they do not work with big SCEVs (that may take almost infinite compile time). But for this, it uses heuristics such as recursion depth and number of operands, which do not give us a guarantee that we don't actually have big SCEVs. This situation is still possible, though it is not likely to happen. However, the bug PR33494 showed a bunch of simple corner case tests where we still produce huge SCEVs, even not reaching big recursion depth etc. This patch introduces a concept of 'huge' SCEVs. A SCEV is huge if its expression size (intoduced in D35989) exceeds some threshold value. We prohibit optimizing transformations if any of SCEVs we are dealing with is huge. This gives us a reliable check that we don't spend too much time working with them. As the next step, we can possibly get rid of old limiting mechanisms, such as recursion depth thresholds. Differential Revision: https://reviews.llvm.org/D35990 Reviewed By: reames llvm-svn: 352728
* Revert "Reapply "[CGP] Check for existing inttotpr before creating new one""David L. Jones2019-01-311-140/+0
| | | | | | | | This change reverts r351626. The changes in r351626 cause quadratic work in several cases. (See r351626 thread on llvm-commits for details.) llvm-svn: 352722
* [InstCombine] Expand testing for Windows (NFC)Evandro Menezes2019-01-311-48/+67
| | | | | | Added the checks to the existing cases when the target is Win64. llvm-svn: 352714
* [InstCombine] Simplify check clauses in test (NFC)Evandro Menezes2019-01-311-71/+57
| | | | llvm-svn: 352707
* Add a 'dynamic' parameter to the objectsize intrinsicErik Pilkington2019-01-309-60/+143
| | | | | | | | | | | | | | This is meant to be used with clang's __builtin_dynamic_object_size. When 'true' is passed to this parameter, the intrinsic has the potential to be folded into instructions that will be evaluated at run time. When 'false', the objectsize intrinsic behaviour is unchanged. rdar://32212419 Differential revision: https://reviews.llvm.org/D56761 llvm-svn: 352664
* [Tests] Add tests for propagation of undef elements in vector GEPsPhilip Reames2019-01-301-0/+25
| | | | llvm-svn: 352662
* SimplifyDemandedVectorElts for all intrinsicsPhilip Reames2019-01-301-8/+8
| | | | | | | | | | The point is that this simplifies integration of new intrinsics into SimplifiedDemandedVectorElts, and ensures we don't miss any existing ones. This is intended to be NFC-ish, but as seen from the diffs, can produce slightly different output. This is due to order of transforms w/in instcombine resulting in two slightly different fixed points. That's something we should fix, but isn't a problem w/this patch per se. Differential Revision: https://reviews.llvm.org/D57398 llvm-svn: 352653
* [InstCombine][x86] add tests for addcarry intrinsic; NFCSanjay Patel2019-01-301-0/+36
| | | | llvm-svn: 352627
* Commit tests for changes in revision D41342Dmitry Venikov2019-01-302-0/+178
| | | | llvm-svn: 352613
* Check bool attribute value in getOptionalBoolLoopAttribute.Alina Sbirlea2019-01-291-0/+95
| | | | | | | | | | | | | | | Summary: Check the bool value of the attribute in getOptionalBoolLoopAttribute not just its existance. Eliminates the warning noise generated when vectorization is explicitly disabled. Reviewers: Meinersbur, hfinkel, dmgreen Subscribers: jlebar, sanjoy, llvm-commits Differential Revision: https://reviews.llvm.org/D57260 llvm-svn: 352555
* [InstCombine] canonicalize cmp/select form of uadd saturate with constantSanjay Patel2019-01-291-6/+6
| | | | | | | | | | | | | | | | | | | | | | | | I'm circling back around to a loose end from D51929. The backend (either CGP or DAG) doesn't recognize this pattern, so we end up with different asm for these IR variants. Regardless of any future changes to canonicalize to saturation/overflow intrinsics, we want to get raw IR variations into the minimal number of raw IR forms. If/when we can canonicalize to intrinsics, that will make that step easier. Pre: C2 == ~C1 %a = add i32 %x, C1 %c = icmp ugt i32 %x, C2 %r = select i1 %c, i32 -1, i32 %a => %a = add i32 %x, C1 %c2 = icmp ult i32 %x, C2 %r = select i1 %c2, i32 %a, i32 -1 https://rise4fun.com/Alive/pkH Differential Revision: https://reviews.llvm.org/D57352 llvm-svn: 352536
* [InstCombine] regenerate test checks; NFCSanjay Patel2019-01-291-63/+63
| | | | llvm-svn: 352517
* [InstCombine] add tests for ext-of-bool + add/sub; NFCSanjay Patel2019-01-291-0/+87
| | | | | | | | | | | | | | | We should choose one of these as canonical: %z = zext i1 %cmp to i32 %r = sub i32 %x, %z => %s = sext i1 %cmp to i32 %r = add i32 %x, %s The test comments assume that the zext form is better, but we can adjust that if we decide to go the other way. llvm-svn: 352515
* [IPCP] Don't crash due to arg count/type mismatch between caller/calleeBjorn Pettersson2019-01-292-0/+95
| | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This patch avoids an assert in IPConstantPropagation when there is a argument count/type mismatch between the caller and the callee. While this is actually UB on C-level (clang emits a warning), the IR verifier seems to accept it. I'm not sure what other frontends/languages might think about this, so simply bailing out to avoid hitting an assert (in CallSiteBase<>::getArgOperand or Value::doRAUW) seems like a simple solution. The problem is exposed by the fact that AbstractCallSites will look through a bitcast at the callee position of a call/invoke. Reviewers: jdoerfert, reames, efriedma Reviewed By: jdoerfert, efriedma Subscribers: eli.friedman, efriedma, llvm-commits Differential Revision: https://reviews.llvm.org/D57052 llvm-svn: 352469
* Correct contents for r352453Philip Reames2019-01-291-72/+48
| | | | | | I had a local change I hadn't realized when submitting that auto-update. As such, the auto-update was wrong. This should fix it, and with that, it's clearly time to stop submitting changes and go to bed. llvm-svn: 352454
* [Tests] Regen to remove future test diffsPhilip Reames2019-01-291-128/+152
| | | | | | This file appears to have been manually editted at some point after being auto-updated. A future change adjusts this file slightly, and all of the updates makes the diff super confusing. llvm-svn: 352453
* [Test] Add tests for gather/maked.load demanded elements, and convert the ↵Philip Reames2019-01-291-17/+67
| | | | | | whole file to auto generated checks. llvm-svn: 352452
* Demanded elements support for vector GEPsPhilip Reames2019-01-281-15/+9
| | | | | | | | GEPs can produce either scalar or vector results. If we're extracting only a subset of the vector lanes, simplifying the operands is helpful in eliminating redundant computation, and (eventually) allowing further optimizations Differential Revision: https://reviews.llvm.org/D57177 llvm-svn: 352440
* [CGP] auto-generate complete checks for add overflow tests; NFCSanjay Patel2019-01-281-24/+50
| | | | llvm-svn: 352437
* [InstCombine] add another saturating uadd test (no undefs); NFCSanjay Patel2019-01-281-2/+15
| | | | | | I forgot that our undef matching hasn't been completed in the previous commit. llvm-svn: 352424
* [InstCombine] add tests for saturating uadd with constant; NFCSanjay Patel2019-01-281-0/+55
| | | | llvm-svn: 352423
* [CodeExtractor] Add support for the `swifterror` attributeVedant Kumar2019-01-281-0/+43
| | | | | | | When passing a `swifterror` argument or alloca as an input to an extraction region, mark the input parameter `swifterror`. llvm-svn: 352408
* [TTI] Add generic SADDSAT/SSUBSAT costsSimon Pilgrim2019-01-272-200/+344
| | | | | | | | | | Add generic costs calculation for SADDSAT/SSUBSAT intrinsics, this uses generic costs for sadd_with_overflow/ssub_with_overflow, an extra sign comparison + a selects based on the sign/overflow. This completes PR40316 Differential Revision: https://reviews.llvm.org/D57239 llvm-svn: 352315
* [ValueTracking] Look through casts when determining non-nullnessJohannes Doerfert2019-01-267-77/+176
| | | | | | | | | | Bitcast and certain Ptr2Int/Int2Ptr instructions will not alter the value of their operand and can therefore be looked through when we determine non-nullness. Differential Revision: https://reviews.llvm.org/D54956 llvm-svn: 352293
* [PowerPC] Update Vector Costs for P9Nemanja Ivanovic2019-01-261-0/+39
| | | | | | | | | | | | | For the power9 CPU, vector operations consume a pair of execution units rather than one execution unit like a scalar operation. Update the target transform cost functions to reflect the higher cost of vector operations when targeting Power9. Patch by RolandF. Differential revision: https://reviews.llvm.org/D55461 llvm-svn: 352261
* [HotColdSplit] Introduce a cost model to control splitting behaviorVedant Kumar2019-01-2510-93/+125
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The main goal of the model is to avoid *increasing* function size, as that would eradicate any memory locality benefits from splitting. This happens when: - There are too many inputs or outputs to the cold region. Argument materialization and reloads of outputs have a cost. - The cold region has too many distinct exit blocks, causing a large switch to be formed in the caller. - The code size cost of the split code is less than the cost of a set-up call. A secondary goal is to prevent excessive overall binary size growth. With the cost model in place, I experimented to find a splitting threshold that works well in practice. To make warm & cold code easily separable for analysis purposes, I moved split functions to a "cold" section. I experimented with thresholds between [0, 4] and set the default to the threshold which minimized geomean __text size. Experiment data from building LNT+externals for X86 (N = 639 programs, all sizes in bytes): | Configuration | __text geom size | __cold geom size | TEXT geom size | | **-Os** | 1736.3 | 0, n=0 | 10961.6 | | -Os, thresh=0 | 1740.53 | 124.482, n=134 | 11014 | | -Os, thresh=1 | 1734.79 | 57.8781, n=90 | 10978.6 | | -Os, thresh=2 | ** 1733.85 ** | 65.6604, n=61 | 10977.6 | | -Os, thresh=3 | 1733.85 | 65.3071, n=61 | 10977.6 | | -Os, thresh=4 | 1735.08 | 67.5156, n=54 | 10965.7 | | **-Oz** | 1554.4 | 0, n=0 | 10153 | | -Oz, thresh=2 | ** 1552.2 ** | 65.633, n=61 | 10176 | | **-O3** | 2563.37 | 0, n=0 | 13105.4 | | -O3, thresh=2 | ** 2559.49 ** | 71.1072, n=61 | 13162.4 | Picking thresh=2 reduces the geomean __text section size by 0.14% at -Os, -Oz, and -O3 and causes ~0.2% growth in the TEXT segment. Note that TEXT size is page-aligned, whereas section sizes are byte-aligned. Experiment data from building LNT+externals for ARM64 (N = 558 programs, all sizes in bytes): | Configuration | __text geom size | __cold geom size | TEXT geom size | | **-Os** | 1763.96 | 0, n=0 | 42934.9 | | -Os, thresh=2 | ** 1760.9 ** | 76.6755, n=61 | 42934.9 | Picking thresh=2 reduces the geomean __text section size by 0.17% at -Os and causes no growth in the TEXT segment. Measurements were done with D57082 (r352080) applied. Differential Revision: https://reviews.llvm.org/D57125 llvm-svn: 352228
* [NFC] One more crashing test on LoopSimplifyCFGMax Kazantsev2019-01-251-0/+116
| | | | llvm-svn: 352194
* [NFC] Add failing test on LCSSA formingMax Kazantsev2019-01-251-0/+43
| | | | llvm-svn: 352190
* [NFC] Add test with multiple loopsMax Kazantsev2019-01-251-0/+55
| | | | llvm-svn: 352176
* [LoopSimplifyCFG] Fix inconsistency in blocks in loop markupMax Kazantsev2019-01-251-3/+5
| | | | | | | | | | | | 2nd part of D57095 with the same reason, just in another place. We never fold branches that are not immediately in the current loop, but this check is missing in `IsEdgeLive` As result, it may think that the edge in subloop is dead while it's live. It's a pessimization in the current stance. Differential Revision: https://reviews.llvm.org/D57147 Reviewed By: rupprecht llvm-svn: 352170
* [HotColdSplit] Split more aggressively before/after cold invokesVedant Kumar2019-01-251-0/+31
| | | | | | | | | | | While a cold invoke itself and its unwind destination can't be extracted, code which unconditionally executes before/after the invoke may still be profitable to extract. With cost model changes from D57125 applied, this gives a 3.5% increase in split text across LNT+externals on arm64 at -Os. llvm-svn: 352160
* [Analysis] Fix isSafeToLoadUnconditionally handling of volatile.Eli Friedman2019-01-241-0/+12
| | | | | | | | | A volatile operation cannot be used to prove an address points to normal memory. (LangRef was recently updated to state it explicitly.) Differential Revision: https://reviews.llvm.org/D57040 llvm-svn: 352109
* [MemorySSA +LICM CFHoist] Solve PR40317.Alina Sbirlea2019-01-241-0/+62
| | | | | | | | | | | | | | | Summary: MemorySSA needs updating each time an instruction is moved. LICM and control flow hoisting re-hoists instructions, thus needing another update when re-moving those instructions. Pending cleanup: the MSSA update is duplicated, should be moved inside moveInstructionBefore. Reviewers: jnspaulsson Subscribers: sanjoy, jlebar, Prazek, george.burgess.iv, llvm-commits Differential Revision: https://reviews.llvm.org/D57176 llvm-svn: 352092
* Test cases for demanded elements on vector GEPsPhilip Reames2019-01-241-0/+127
| | | | | | This is the first part of splitting apart https://reviews.llvm.org/D57140 into usuable pieces. Landing the tests in advance of posting a review specifically for the demanded elements part. llvm-svn: 352091
* [RS4GC] Expand/standardize tests introduced in rL352059Philip Reames2019-01-241-4/+121
| | | | | | Write a couple of variations on vector geps w/both scalars and vectors live over safepoints. Use update_test_checks to show all the IR. llvm-svn: 352062
OpenPOWER on IntegriCloud