summaryrefslogtreecommitdiffstats
path: root/llvm/test/Transforms
Commit message (Collapse)AuthorAgeFilesLines
...
* [InstCombine] Allow to limit the max number of iterationsJakub Kuderski2019-12-181-0/+41
| | | | | | | | | | | | | | | | | | | | | | | Summary: This patch teaches InstCombine to accept a new parameter: maximum number of iterations over functions. InstCombine tries to simplify instructions by iterating over the whole function until the function stops changing. As a consequence, the last iteration before reaching a fixpoint visits all instructions in the worklist and never performs any rewrites. Bounding the number of iterations can have 2 benefits: * In case the users of the pass can make a good guess about the number of required iterations, we can save the time normally spent on the last iteration that doesn't change anything. * When the wants to use InstCombine as a cleanup pass, it may be enough to run just a few iterations and stop even before reaching a fixpoint. This can be also useful for implementing a lightweight pass pipeline (think `-O1`). This patch does not change the behavior of opt or Clang -- limiting the number of iterations is entirely opt-in. Reviewers: fhahn, davide, spatel, foad, nlopes, grosser, lebedev.ri, nikic, xbolva00 Reviewed By: spatel Subscribers: craig.topper, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71145
* Reapply: [DebugInfo] Correctly handle salvaged casts and split fragments at ISelstozer2019-12-184-6/+7
| | | | | | | | This reverts commit 1f3dd83cc1f2b8f72b9d59e2b4221b12fb7f9a95, reapplying commit bb1b0bc4e57428ce364d3d6c075ff03cb8973462. The original commit failed on some builds seemingly due to the use of a bracketed constructor with an std::array, i.e. `std::array<> arr({...})`.
* [NFC][InstCombine] Autogenerate assume.ll testRoman Lebedev2019-12-181-40/+77
|
* [InstCombine] add tests for copysign; NFCSanjay Patel2019-12-181-0/+23
|
* Revert "[DebugInfo] Correctly handle salvaged casts and split fragments at ISel"stozer2019-12-184-7/+6
| | | | | | Reverted due to build failure on windows bots. This reverts commit bb1b0bc4e57428ce364d3d6c075ff03cb8973462.
* [DebugInfo] Correctly handle salvaged casts and split fragments at ISelstozer2019-12-184-6/+7
| | | | | | | | | | | | | | | Previously, LLVM had no functional way of performing casts inside of a DIExpression(), which made salvaging cast instructions other than Noop casts impossible. This patch enables the salvaging of casts by using the DW_OP_LLVM_convert operator for SExt and Trunc instructions. There is another issue which is exposed by this fix, in which fragment DIExpressions (which are preserved more readily by this patch) for values that must be split across registers in ISel trigger an assertion, as the 'split' fragments extend beyond the bounds of the fragment DIExpression causing an error. This patch also fixes this issue by checking the fragment status of DIExpressions which are to be split, and dropping fragments that are invalid.
* [PowerPC] Add missing legalization for vector BSWAPNemanja Ivanovic2019-12-171-0/+97
| | | | | | | | We somehow missed doing this when we were working on Power9 exploitation. This just adds the missing legalization and cost for producing the vector intrinsics. Differential revision: https://reviews.llvm.org/D70436
* [LoopFusion] Move instructions from FC0.Latch to FC1.Latch.Whitney Tsang2019-12-174-44/+73
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary:This PR move instructions from FC0.Latch bottom up to the beginning of FC1.Latch as long as they are proven safe. To illustrate why this is beneficial, let's consider the following example: Before Fusion: header1: br header2 header2: br header2, latch1 latch1: br header1, preheader3 preheader3: br header3 header3: br header4 header4: br header4, latch3 latch3: br header3, exit3 After Fusion (before this PR): header1: br header2 header2: br header2, latch1 latch1: br header3 header3: br header4 header4: br header4, latch3 latch3: br header1, exit3 Note that preheader3 is removed during fusion before this PR. Notice that we cannot fuse loop2 with loop4 as there exists block latch1 in between. This PR move instructions from latch1 to beginning of latch3, and remove block latch1. LoopFusion is now able to fuse loop nest recursively. After Fusion (after this PR): header1: br header2 header2: br header3 header3: br header4 header4: br header2, latch3 latch3: br header1, exit3 Reviewer: kbarton, jdoerfert, Meinersbur, dmgreen, fhahn, hfinkel, bmahjour, etiotto Reviewed By: kbarton, Meinersbur Subscribers: hiraditya, llvm-commits Tag: LLVM Differential Revision: https://reviews.llvm.org/D71165
* [Attributor] H2S fix.Stefan Stipanovic2019-12-171-4/+16
| | | | | | | | | | Summary: Fixing issues that were noticed in D71521 Reviewers: jdoerfert, lebedev.ri, uenoku Subscribers: Differential Revision: https://reviews.llvm.org/D71564
* [Attributor][NFC] Add test for sle comparison in h2s.Stefan Stipanovic2019-12-171-0/+14
|
* [InstCombine][AMDGPU] Trim more components of *buffer_loadPiotr Sobczak2019-12-171-150/+150
| | | | | | | | | | | | | | Summary: Add trimming of unused components of s_buffer_load. Extend trimming of *buffer_load to also include unused components at the beginning of vectors and update offset. Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70315
* [LoopFusion] Restrict loop fusion to rotated loops.Kit Barton2019-12-165-640/+584
| | | | | | | | | | | | | | | | Summary: This patch restricts loop fusion to only consider rotated loops as valid candidates. This simplifies the analysis and transformation and aligns with other loop optimizations. Reviewers: jdoerfert, Meinersbur, dmgreen, etiotto, Whitney, fhahn, hfinkel Reviewed By: Meinersbur Subscribers: ormris, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71025
* [InstCombine] Teach removeBitcastsFromLoadStoreOnMinMax not to change the ↵Craig Topper2019-12-161-2/+5
| | | | | | | | | | size of a store. We can change the type as long as we don't change the size. Fixes PR44306 Differential Revision: https://reviews.llvm.org/D71532
* [BasicBlockUtils] Fix dbg.value elimination problem in MergeBlockIntoPredecessorBjorn Pettersson2019-12-162-4/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: In commit d60f34c20a2f31335c8d5626e (llvm-svn 317128, PR35113) MergeBlockIntoPredecessor was changed into discarding some dbg.value intrinsics referring to PHI values, post-splice due to loop rotation. That elimination of dbg.value intrinsics did not consider which dbg.value to keep depending on the context (e.g. if the variable is changing its value several times inside the basic block). In the past that hasn't been such a big problem since CodeGenPrepare::placeDbgValues has moved the dbg.value to be next to the PHI node anyway. But after commit 00e238896cd8ad3a7d7 CodeGenPrepare isn't doing that any longer, so we need to be more careful when avoiding duplicate dbg.value intrinsics in MergeBlockIntoPredecessor. This patch replaces the code that tried to avoid duplicate dbg.values by using the RemoveRedundantDbgInstrs helper. Reviewers: aprantl, jmorse, vsk Reviewed By: aprantl, vsk Subscribers: jholewinski, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71480
* [LoopRotate] Add test case to show dbg value problemBjorn Pettersson2019-12-161-0/+79
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: In commit d60f34c20a2f31335c8d5626e (llvm-svn 317128, PR35113) MergeBlockIntoPredecessor was changed into discarding some dbg.value intrinsics referring to PHI values, post-splice due to loop rotation. That elimination of dbg.value intrinsics does not consider which dbg.value to keep based on the context. Such as always keeping the one that comes first textually, or the need to keep several of them in case the variable is changing it's value several times inside the basic block. In the past that hasn't been such a big problem since CodeGenPrepare::placeDbgValues has moved the dbg.value to be next to the PHI node anyway. But after commit 00e238896cd8ad3a7d7 CodeGenPrepare isn't doing that any longer, so we need to be more careful when avoiding duplicate dbg.value intrinsics in MergeBlockIntoPredecessor. This patch is just a pre commit of the test case. Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71479
* [BasicBlockUtils] Add utility to remove redundant dbg.value instrsBjorn Pettersson2019-12-161-0/+112
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Add a RemoveRedundantDbgInstrs to BasicBlockUtils with the goal to remove redundant dbg intrinsics from a basic block. This can be useful after various transforms, as it might be simpler to do a filtering of dbg intrinsics after the transform than during the transform. One primary use case would be to replace a too aggressive removal done by MergeBlockIntoPredecessor, seen at loop rotate (not done in this patch). The elimination algorithm currently focuses on dbg.value intrinsics and is doing two iterations over the BB. First we iterate backward starting at the last instruction in the BB. Whenever a consecutive sequence of dbg.value instructions are found we keep the last dbg.value for each variable found (variable fragments are identified using the {DILocalVariable, FragmentInfo, inlinedAt} triple as given by the DebugVariable helper class). Next we iterate forward starting at the first instruction in the BB. Whenever we find a dbg.value describing a DebugVariable (identified by {DILocalVariable, inlinedAt}) we save the {DIValue, DIExpression} that describes that variables value. But if the variable already was mapped to the same {DIValue, DIExpression} pair we instead drop the second dbg.value. To ease the process of making lit tests for this utility a new pass is introduced called RedundantDbgInstElimination. It can be executed by opt using -redundant-dbg-inst-elim. Reviewers: aprantl, jmorse, vsk Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71478
* [InstSimplify] fold splat of inserted constant to vector constantSanjay Patel2019-12-152-11/+4
| | | | | | | | | | | | | | | | | shuf (inselt ?, C, IndexC), undef, <IndexC, IndexC...> --> <C, C...> This is another missing shuffle fold pattern uncovered by the shuffle correctness fix from D70246. The problem was visible in the post-commit thread example, but we managed to overcome the limitation for that particular case with D71220. This is something like the inverse of the previous fix - there we didn't demand the inserted scalar, and here we are only demanding an inserted scalar. Differential Revision: https://reviews.llvm.org/D71488
* [Attributor][Tests] Copy & use the ArgumentPromotion testsJohannes Doerfert2019-12-1433-0/+3744
|
* [ArgPromo][Tests] Run update_test_checks on all ArgumentPromotion testsJohannes Doerfert2019-12-1432-872/+1477
| | | | | | | | | | | | | | | | | Summary: In preparation of D65531 as well as the reuse of these tests for the Attributor, we modernize them and use the update_test_checks to simplify updates. This was done with the update_test_checks after D68819 and D68850. Reviewers: hfinkel, vsk, dblaikie, davidxl, tejohnson, tstellar, echristo, chandlerc, efriedma, lebedev.ri Subscribers: bollu, arphaman, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68766
* [Attributor] Annotate call sites of declarations with a callbackJohannes Doerfert2019-12-132-5/+5
| | | | | Even if a declaration is called, if there is a callback we might need the information during CG-SCC traversal (D70767).
* [Attributor][NFC] Add more simple test situations for callbacksJohannes Doerfert2019-12-131-0/+31
|
* [Attributor][NFC] Reorder test functionsJohannes Doerfert2019-12-131-23/+23
| | | | | Since one of the functions has a personality the attribute set is printed. If the function is the first it should (hopefully) always be #0
* [Attributor] Reuse the IPConstantProp tests for the AttributorJohannes Doerfert2019-12-1324-0/+1442
| | | | | | | | | | | | | | | | | | | | The Attributor can, to some degree, do what IPConstantProp does. We can consequently use the corner cases already collected and tested for in the IPConstantProp tests to improve Attributor test coverage. This exposed various bugs fixed in previous Attributor patches. Not all functionality of IPConstantProp is available in AAValueSimplify and AAIsDead so some tests show that we cannot perform the expected constant propagation. Reviewers: fhahn, efriedma, mssimpso, davide Subscribers: bollu, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69748
* [InstSimplify] improve test coverage for insert+splat; NFCSanjay Patel2019-12-131-10/+21
|
* [PGO][PGSO] Enable size optimizations in code gen / target passes for cold code.Hiroshi Yamauchi2019-12-131-0/+41
| | | | | | | | | | | | Summary: Split off of D67120. Reviewers: davidxl Subscribers: hiraditya, asb, rbar, johnrusso, simoncook, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, lenary, s.egerton, pzheng, sameer.abuasal, apazos, luismarques, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71288
* Reland [DataLayout] Fix occurrences that size and range of pointers are ↵Nicola Zaghen2019-12-136-10/+153
| | | | | | | | | | | | | | assumed to be the same. GEP index size can be specified in the DataLayout, introduced in D42123. However, there were still places in which getIndexSizeInBits was used interchangeably with getPointerSizeInBits. This notably caused issues with Instcombine's visitPtrToInt; but the unit tests was incorrect, so this remained undiscovered. This fixes the buildbot failures. Differential Revision: https://reviews.llvm.org/D68328 Patch by Joseph Faulls!
* Reapply [LVI] Normalize pointer behaviorNikita Popov2019-12-131-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is a rebase of the change over D70376, which fixes an LVI cache invalidation issue that also affected this patch. ----- Related to D69686. As noted there, LVI currently behaves differently for integer and pointer values: For integers, the block value is always valid inside the basic block, while for pointers it is only valid at the end of the basic block. I believe the integer behavior is the correct one, and CVP relies on it via its getConstantRange() uses. The reason for the special pointer behavior is that LVI checks whether a pointer is dereferenced in a given basic block and marks it as non-null in that case. Of course, this information is valid only after the dereferencing instruction, or in conservative approximation, at the end of the block. This patch changes the treatment of dereferencability: Instead of including it inside the block value, we instead treat it as something similar to an assume (it essentially is a non-nullness assume) and incorporate this information in intersectAssumeOrGuardBlockValueConstantRange() if the context instruction is the terminator of the basic block. This happens either when determining an edge-value internally in LVI, or when a terminator was explicitly passed to getValueAt(). The latter case makes this change not fully NFC, because we can now fold terminator icmps based on the dereferencability information in the same block. This is the reason why I changed one JumpThreading test (it would optimize the condition away without the change). Of course, we do not want to recompute dereferencability on each intersectAssume call, so we need a new cache for this. The dereferencability analysis requires walking the entire basic block and computing underlying objects of all memory operands. This was previously done separately for each queried pointer value. In the new implementation (both because this makes the caching simpler, and because it is faster), I instead only walk the full BB once and cache all the dereferenced pointers. So the traversal is now performed only once per BB, instead of once per queried pointer value. I think the overall model now makes more sense than before, and there will be no more pitfalls due to differing integer/pointer behavior. Differential Revision: https://reviews.llvm.org/D69914
* [Attributor][FIX] Do treat byval arguments specialJohannes Doerfert2019-12-121-0/+52
| | | | | | | | | | | | | | When we reason about the pointer argument that is byval we actually reason about a local copy of the value passed at the call site. This was not the case before and we wrongly introduced attributes based on the surrounding function. AAMemoryBehaviorArgument, AAMemoryBehaviorCallSiteArgument and AANoCaptureCallSiteArgument are made aware of byval now. The code to skip "subsuming positions" for reasoning follows a common pattern and we should refactor it. A TODO was added. Discovered by @efriedma as part of D69748.
* [Matrix] Add first set of matrix intrinsics and initial lowering pass.Florian Hahn2019-12-1213-0/+1967
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is the first patch adding an initial set of matrix intrinsics and a corresponding lowering pass. This has been discussed on llvm-dev: http://lists.llvm.org/pipermail/llvm-dev/2019-October/136240.html The first patch introduces four new intrinsics (transpose, multiply, columnwise load and store) and a LowerMatrixIntrinsics pass, that lowers those intrinsics to vector operations. Matrixes are embedded in a 'flat' vector (e.g. a 4 x 4 float matrix embedded in a <16 x float> vector) and the intrinsics take the dimension information as parameters. Those parameters need to be ConstantInt. For the memory layout, we initially assume column-major, but in the RFC we also described how to extend the intrinsics to support row-major as well. For the initial lowering, we split the input of the intrinsics into a set of column vectors, transform those column vectors and concatenate the result columns to a flat result vector. This allows us to lower the intrinsics without any shape propagation, as mentioned in the RFC. In follow-up patches, we plan to submit the following improvements: * Shape propagation to eliminate the embedding/splitting for each intrinsic. * Fused & tiled lowering of multiply and other operations. * Optimization remarks highlighting matrix expressions and costs. * Generate loops for operations on large matrixes. * More general block processing for operation on large vectors, exploiting shape information. We would like to add dedicated transpose, columnwise load and store intrinsics, even though they are not strictly necessary. For example, we could instead emit a large shufflevector instruction instead of the transpose. But we expect that to (1) become unwieldy for larger matrixes (even for 16x16 matrixes, the resulting shufflevector masks would be huge), (2) risk instcombine making small changes, causing us to fail to detect the transpose, preventing better lowerings For the load/store, we are additionally planning on exploiting the intrinsics for better alias analysis. Reviewers: anemet, Gerolf, reames, hfinkel, andrew.w.kaylor, efriedma, rengolin Reviewed By: anemet Differential Revision: https://reviews.llvm.org/D70456
* Temporarily Revert "[DataLayout] Fix occurrences that size and range of ↵Nicola Zaghen2019-12-126-153/+10
| | | | | | | | | pointers are assumed to be the same." This reverts commit 5f6208778ff92567c57d7c1e2e740c284d7e69a5. This caused failures in Transforms/PhaseOrdering/scev-custom-dl.ll const: Assertion `getBitWidth() == CR.getBitWidth() && "ConstantRange types don't agree!"' failed.
* [DataLayout] Fix occurrences that size and range of pointers are assumed to ↵Nicola Zaghen2019-12-126-10/+153
| | | | | | | | | | | | be the same. GEP index size can be specified in the DataLayout, introduced in D42123. However, there were still places in which getIndexSizeInBits was used interchangeably with getPointerSizeInBits. This notably caused issues with Instcombine's visitPtrToInt; but the unit tests was incorrect, so this remained undiscovered. Differential Revision: https://reviews.llvm.org/D68328 Patch by Joseph Faulls!
* [AutoFDO] Statistic for context sensitive profile guided inliningWenlei He2019-12-113-5/+109
| | | | | | | | | | | | Summary: AutoFDO compilation has two places that do inlining - the sample profile loader that does inlining with context sensitive profile, and the regular inliner as CGSCC pass. Ideally we want most inlining to come from sample profile loader as that is driven by context sensitive profile and also retains context sensitivity after inlining. However the reality is most of the inlining actually happens during regular inliner. To track the number of inline instances from sample profile loader and help move more inlining to sample profile loader, I'm adding statistics and optimization remarks for sample profile loader's inlining. Reviewers: wmi, davidxl Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70584
* [InstCombine] Optimize overflow check base on uadd.with.overflow resultNikita Popov2019-12-111-14/+7
| | | | | | | | | | | | | | | | | | | | | Fix for https://bugs.llvm.org/show_bug.cgi?id=40846. This adds a combine for cases where a (a + b) < a style overflow check is performed, but with a + b being the result of uadd.with.overflow, so the overflow result is also already available and we can just use it. Subsequently GVN/CSE will deduplicate the extracts. We can run into this situation if you have both a uadd.with.overflow and a manual add + overflow check in the same function (on the same operands), in which case GVN will rewrite the add to the with.overflow result and leave you with this pattern. The implementation is a bit ugly because I'm handling the various canonicalization edge cases. This does not yet handle the negated version of this pattern. Differential Revision: https://reviews.llvm.org/D58644
* [ValueTracking] Pointer is known nonnull after load/storeDanila Kutenin2019-12-114-38/+38
| | | | | | | | | | | If the pointer was loaded/stored before the null check, the check is redundant and can be removed. For now the optimizers do not remove the nullptr check, see https://gcc.godbolt.org/z/H2r5GG. The patch allows to use more nonnull constraints. Also, it found one more optimization in some PowerPC test. This is my first llvm review, I am free to any comments. Differential Revision: https://reviews.llvm.org/D71177
* [MergeFuncs] Remove incorrect attribute copyingNikita Popov2019-12-111-0/+30
| | | | | | | | | | | | | | | Fix for https://bugs.llvm.org/show_bug.cgi?id=44236. This code was originally introduced in rG36512330041201e10f5429361bbd79b1afac1ea1. However, the attribute copying was done in the wrong place (in general call replacement, not thunk generation) and a proper fix was implemented in D12581. Previously this code was just unnecessary but harmless (because FunctionComparator ensured that the attributes of the two functions are exactly the same), but since byval was changed to accept a type this copying is actively wrong and may result in malformed IR. Differential Revision: https://reviews.llvm.org/D71173
* AMDGPU: Fix copy-pasted test name errorMatt Arsenault2019-12-111-2/+2
|
* [ARM][TypePromotion] Enable by defaultSam Parker2019-12-1110-10/+10
| | | | | | | | | Enable the TypePromotion pass my default (again). This patch was originally committed in 393dacacf7e7. This patch was reverted in a38396939c54. Differential Revision: https://reviews.llvm.org/D70998
* Revert "Reapply: [DebugInfo] Recover debug intrinsics when killing ↵Vlad Tsyrklevich2019-12-102-138/+0
| | | | | | | | duplicated/empty..." This reverts commit f2ba93971ccc236c0eef5323704d31f48107e04f, it was causing build timeouts on sanitizer-x86_64-linux-autoconf such as http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-autoconf/builds/44917
* [InstSimplify] add tests for insert constant + splat; NFCSanjay Patel2019-12-102-0/+35
|
* [InstCombine] replace shuffle's insertelement operand if inserted scalar is ↵Sanjay Patel2019-12-101-2/+6
| | | | | | | | | | | | | | not demanded This pattern is noted as a regression from: D70246 ...where we removed an over-aggressive shuffle simplification. SimplifyDemandedVectorElts fails to catch this case when the insert has multiple uses, so I'm proposing to pattern match the minimal sequence directly. This fold does not conflict with any of our current shuffle undef/poison semantics. Differential Revision: https://reviews.llvm.org/D71220
* Reapply: [DebugInfo] Recover debug intrinsics when killing duplicated/empty...stozer2019-12-102-0/+138
| | | | | | | | | | | basic blocks Originally applied in 72ce759928e6dfee6a9efa310b966c19722352ba. Fixed a build failure caused by incorrect use of cast instead of dyn_cast. This reverts commit 8b0780f795eb58fca0a2456e308adaaa1a0b5013.
* add test for previous commitSam Parker2019-12-101-0/+66
|
* [IPConstantProp][NFCI] Improve and modernize testsJohannes Doerfert2019-12-0913-64/+99
| | | | | | | | | | | | | | | Summary: This change is in preparation to reuse these test for the Attributor. It mainly is to remove UB, make it clear what is tested, and use "modern" run lines. Reviewers: fhahn, efriedma, mssimpso, davide Subscribers: bollu, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69747
* [ValueTracking] Allow context-sensitive nullness check for non-pointersJohannes Doerfert2019-12-094-13/+54
| | | | | | | | | | | | | | | | | | Summary: Same as D60846 and D69571 but with a fix for the problem encountered after them. Both times it was a missing context adjustment in the handling of PHI nodes. The reproducers created from the bugs that caused the old commits to be reverted are included. Reviewers: nikic, nlopes, mkazantsev, spatel, dlrobertson, uabelho, hakzsam, hans Subscribers: hiraditya, bollu, asbirlea, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71181
* [InstCombine] add tests for shuffle with insertelement operand; NFCSanjay Patel2019-12-091-0/+52
|
* [ARM] Add missing REQUIRES: asserts to test. NFCDavid Green2019-12-091-0/+1
|
* [ARM] Enable MVE masked loads and storesDavid Green2019-12-094-6/+6
| | | | | | | With the extra optimisations we have done, these should now be fine to enable by default. Which is what this patch does. Differential Revision: https://reviews.llvm.org/D70968
* [ARM] Teach the Arm cost model that a Shift can be folded into other ↵David Green2019-12-091-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | instructions This attempts to teach the cost model in Arm that code such as: %s = shl i32 %a, 3 %a = and i32 %s, %b Can under Arm or Thumb2 become: and r0, r1, r2, lsl #3 So the cost of the shift can essentially be free. To do this without trying to artificially adjust the cost of the "and" instruction, it needs to get the users of the shl and check if they are a type of instruction that the shift can be folded into. And so it needs to have access to the actual instruction in getArithmeticInstrCost, which if available is added as an extra parameter much like getCastInstrCost. We otherwise limit it to shifts with a single user, which should hopefully handle most of the cases. The list of instruction that the shift can be folded into include ADC, ADD, AND, BIC, CMP, EOR, MVN, ORR, ORN, RSB, SBC and SUB. This translates to Add, Sub, And, Or, Xor and ICmp. Differential Revision: https://reviews.llvm.org/D70966
* [ARM] Additional tests and minor formatting. NFCDavid Green2019-12-091-0/+86
| | | | | | This adds some extra cost model tests for shifts, and does some minor adjustments to some Neon code to make it clear as to what it applies to. Both NFC.
* Revert 393dacacf7e7 "[ARM] Enable TypePromotion by default"Hans Wennborg2019-12-0910-10/+10
| | | | | | | | | | | | This caused "Too many bits for uint64_t" asserts when building Chromium. See https://crbug.com/1031978#c2 for a reproducer. I'll follow up on the llvm-commits thread with a creduced version. > ARMCodeGenPrepare has already been generalized and renamed to > TypePromotion. We've had it enabled and tested downstream for a > while, so enable it by default. > > Differential Revision: https://reviews.llvm.org/D70998
OpenPOWER on IntegriCloud