summaryrefslogtreecommitdiffstats
path: root/llvm/test/Analysis
Commit message (Collapse)AuthorAgeFilesLines
...
* [AliasAnalysis/NewPassManager] Invalidate AAManager less often.Alina Sbirlea2019-04-301-4/+3
| | | | | | | | | | | | | | | | | | | | | Summary: This is a redo of D60914. The objective is to not invalidate AAManager, which is stateless, unless there is an explicit invalidate in one of the AAResults. To achieve this, this patch adds an API to PAC, to check precisely this: is this analysis not invalidated explicitly == is this analysis not abandoned == is this analysis stateless, so preserved without explicitly being marked as preserved by everyone Reviewers: chandlerc Subscribers: mehdi_amini, jlebar, george.burgess.iv, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61284 llvm-svn: 359622
* Revert rL359519 : [MemorySSA] Invalidate MemorySSA if AA or DT are invalidated.Simon Pilgrim2019-04-301-53/+0
| | | | | | | | | | | | | | | | | | Summary: MemorySSA keeps internal pointers of AA and DT. If these get invalidated, so should MemorySSA. Reviewers: george.burgess.iv, chandlerc Subscribers: jlebar, Prazek, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61043 ........ This was causing windows build bot failures llvm-svn: 359555
* [ARM] Implement TTI::getMemcpyCostSjoerd Meijer2019-04-301-4/+662
| | | | | | | | | This implements TargetTransformInfo method getMemcpyCost, which estimates the number of instructions to which a memcpy instruction expands to. Differential Revision: https://reviews.llvm.org/D59787 llvm-svn: 359547
* [MemorySSA] Invalidate MemorySSA if AA or DT are invalidated.Alina Sbirlea2019-04-291-0/+53
| | | | | | | | | | | | | | | | Summary: MemorySSA keeps internal pointers of AA and DT. If these get invalidated, so should MemorySSA. Reviewers: george.burgess.iv, chandlerc Subscribers: jlebar, Prazek, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61043 llvm-svn: 359519
* [PowerPC] Update P9 vector costs for insert/extract elementRoland Froese2019-04-261-24/+24
| | | | | | | | | | | The PPC vector cost model values for insert/extract element reflect older processors that lacked vector insert/extract and move-to/move-from VSR instructions. Update getVectorInstrCost to give appropriate values for when the newer instructions are present. Differential Revision: https://reviews.llvm.org/D60160 llvm-svn: 359313
* Revert [AliasAnalysis] AAResults preserves AAManager.Alina Sbirlea2019-04-241-3/+4
| | | | | | Triggers use-after-free. llvm-svn: 359055
* [Lint] Permit aliasing noalias readonly argumentsJosh Stone2019-04-231-0/+40
| | | | | | | | | | | | | | | | | | Summary: If two arguments are both readonly, then they have no memory dependency that would violate noalias, even if they do actually overlap. Reviewers: hfinkel, efriedma Reviewed By: efriedma Subscribers: efriedma, hiraditya, llvm-commits, tstellar Tags: #llvm Differential Revision: https://reviews.llvm.org/D60239 llvm-svn: 359047
* [AliasAnalysis] AAResults preserves AAManager.Alina Sbirlea2019-04-231-4/+3
| | | | | | | | | | | | | | | | Summary: AAResults should not invalidate AAManager. Update tests. Reviewers: chandlerc Subscribers: mehdi_amini, jlebar, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60914 llvm-svn: 359014
* [ARM] Rewrite isLegalT2AddressImmediateDavid Green2019-04-211-65/+594
| | | | | | | | | | | | | | | | | | | | | This does two main things, firstly adding some at least basic addressing modes for i64 types, and secondly treats floats and doubles sensibly when there is no fpu. The floating point change can help codesize in some cases, especially with D60294. Most backends seems to not consider the exact VT in isLegalAddressingMode, instead switching on type size. That is now what this does when the target does not have an fpu (as the float data will be loaded using LDR's). i64's currently use the address range of an LDRD (even though they may be legalised and loaded with an LDR). This is at least better than marking them all as illegal addressing modes. I have not attempted to do much with vectors yet. That will need changing once MVE is added. Differential Revision: https://reviews.llvm.org/D60677 llvm-svn: 358845
* [PowerPC] Add some PPC vec cost tests to prep for D60160 NFCRoland Froese2019-04-182-16/+172
| | | | llvm-svn: 358699
* [SDA] Bug fix: Use IPD outside the loop as divergence boundNicolai Haehnle2019-04-181-0/+37
| | | | | | | | | | | | | | | | | | Summary: The immediate post dominator of the loop header may be part of the divergent loop. Since this /was/ the divergence propagation bound the SDA would not detect joins of divergent paths outside the loop. Reviewers: nhaehnle Reviewed By: nhaehnle Subscribers: mmasten, arsenm, jvesely, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D59042 llvm-svn: 358681
* [AMDGPU] Flag new raw/struct atomic ops as source of divergenceTim Renouf2019-04-171-0/+200
| | | | | | | Differential Revision: https://reviews.llvm.org/D60731 Change-Id: I821d93dec8b9cdd247b8172d92fb5e15340a9e7d llvm-svn: 358579
* [CostModel][X86] Add bool anyof/allof reduction costsSimon Pilgrim2019-04-174-184/+96
| | | | | | | | On pre-AVX512 targets we can use MOVMSK to extract reduced boolean results. This is properly optimized, annoyingly AVX512 isn't and produces code that is almost as bad as the (unchanged) costs suggest...... Differential Revision: https://reviews.llvm.org/D60403 llvm-svn: 358574
* [MemorySSA] Add previous def to cache when found, even if trivial.Alina Sbirlea2019-04-121-0/+62
| | | | | | | | | | | | | | | | | | | | | | | | | | Summary: When inserting a new Def, MemorySSA may be have non-minimal number of Phis. While inserting, the walk to find the previous definition may cleanup minimal Phis. When the last definition is trivial to obtain, we do not cache it. It is possible while getting the previous definition for a Def to get two different answers: - one that was straight-forward to find when walking the first path (a trivial phi in this case), and - another that follows a cleanup of the trivial phi, it determines it may need additional Phi nodes, it inserts them and returns a new phi in the same position as the former trivial one. While the Phis added for the second path are all redundant, they are not complete (the walk is only done upwards), and they are not properly cleaned up afterwards. A way to fix this problem is to cache the straight-forward answer we got on the first walk. The caching is only kept for the duration of a getPreviousDef call, and for Phis we use TrackingVH, so removing the trivial phi will lead to replacing it with the next dominating phi in the cache. Resolves PR40749. Reviewers: george.burgess.iv Subscribers: jlebar, Prazek, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60634 llvm-svn: 358313
* [MemorySSA] Small fix for the clobber limit.Alina Sbirlea2019-04-121-0/+131
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: After introducing the limit for clobber walking, `walkToPhiOrClobber` would assert that the limit is at least 1 on entry. The test included triggered that assert. The callsite in `tryOptimizePhi` making the calls to `walkToPhiOrClobber` is structured like this: ``` while (true) { if (getBlockingAccess()) { // calls walkToPhiOrClobber } for (...) { walkToPhiOrClobber(); } } ``` The cleanest fix is to check if the limit was reached inside `walkToPhiOrClobber`, and give an allowence of 1. This approach not make any alias() calls (no calls to instructionClobbersQuery), so the performance condition is enforced. The limit is set back to 0 if not used, as this provides info on the fact that we stopped before reaching a true clobber. Reviewers: george.burgess.iv Subscribers: jlebar, Prazek, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60479 llvm-svn: 358303
* [CostModel][X86] Add more exhaustive masked ↵Simon Pilgrim2019-04-062-72/+2050
| | | | | | load/store/gather/scatter/expand/compress cost tests llvm-svn: 357838
* [LCG] Add aliased functions as LCG rootsGuozhi Wei2019-04-051-0/+38
| | | | | | | | | Current LCG doesn't check aliased functions. So if an internal function has a public alias it will not be added to CG SCC, but it is still reachable from outside through the alias. So this patch adds aliased functions to SCC. Differential Revision: https://reviews.llvm.org/D59898 llvm-svn: 357795
* [SCEV] Check the cache in get{S|U}MaxExpr before doing any workSanjoy Das2019-03-291-0/+156
| | | | | | | | | | | | | | | | | | Summary: This lets us avoid e.g. checking if A >=s B in getSMaxExpr(A, B) if we've already established that (A smax B) is the best we can do. Fixes PR41225. Reviewers: asbirlea Subscribers: mcrosier, jlebar, bixia, jdoerfert, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60010 llvm-svn: 357320
* [MemorySSA] Limit clobber walks.Alina Sbirlea2019-03-292-25/+50
| | | | | | | | | | | | | | Summary: This patch limits all getClobberingMemoryAccess() walks to MaxCheckLimit. Reviewers: george.burgess.iv Subscribers: sanjoy, jlebar, Prazek, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D59569 llvm-svn: 357319
* [MemorySSA] Don't optimize incomplete phis.Alina Sbirlea2019-03-291-0/+62
| | | | | | | | | | | | | | | | Summary: MemoryPhis cannot be optimized out until they are complete. Resolves PR41254. Reviewers: george.burgess.iv Subscribers: sanjoy, jlebar, Prazek, jdoerfert, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D59966 llvm-svn: 357315
* IR: Support parsing numeric block ids, and emit them in textual output.James Y Knight2019-03-2215-159/+159
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Just as as llvm IR supports explicitly specifying numeric value ids for instructions, and emits them by default in textual output, now do the same for blocks. This is a slightly incompatible change in the textual IR format. Previously, llvm would parse numeric labels as string names. E.g. define void @f() { br label %"55" 55: ret void } defined a label *named* "55", even without needing to be quoted, while the reference required quoting. Now, if you intend a block label which looks like a value number to be a name, you must quote it in the definition too (e.g. `"55":`). Previously, llvm would print nameless blocks only as a comment, and would omit it if there was no predecessor. This could cause confusion for readers of the IR, just as unnamed instructions did prior to the addition of "%5 = " syntax, back in 2008 (PR2480). Now, it will always print a label for an unnamed block, with the exception of the entry block. (IMO it may be better to print it for the entry-block as well. However, that requires updating many more tests.) Thus, the following is supported, and is the canonical printing: define i32 @f(i32, i32) { %3 = add i32 %0, %1 br label %4 4: ret i32 %3 } New test cases covering this behavior are added, and other tests updated as required. Differential Revision: https://reviews.llvm.org/D58548 llvm-svn: 356789
* [TTI] getMemcpyCostSjoerd Meijer2019-03-201-0/+13
| | | | | | | | | | This adds new function getMemcpyCost to TTI so that the cost of a memcpy can be modeled and queried. The default implementation returns Expensive, but targets can override this function to model the cost more accurately. Differential Revision: https://reviews.llvm.org/D59252 llvm-svn: 356555
* AMDGPU: Partially fix default device for HSAMatt Arsenault2019-03-171-2/+2
| | | | | | | | | | | | | | | | | | There are a few different issues, mostly stemming from using generation based checks for anything instead of subtarget features. Stop adding flat-address-space as a feature for HSA, as it should only be a device property. This was incorrectly allowing flat instructions to select for SI. Increase the default generation for HSA to avoid the encoding error when emitting objects. This has some other side effects from various checks which probably should be separate subtarget features (in the cost model and for dealing with the DS offset folding issue). Partial fix for bug 41070. It should probably be an error to try using amdhsa without flat support. llvm-svn: 356347
* [AMDGPU] Prepare for introduction of v3 and v5 MVTsTim Renouf2019-03-178-7/+105
| | | | | | | | | | | | | | | | | | | AMDGPU would like to have MVTs for v3i32, v3f32, v5i32, v5f32. This commit does not add them, but makes preparatory changes: * Fixed assumptions of power-of-2 vector type in kernel arg handling, and added v5 kernel arg tests and v3/v5 shader arg tests. * Added v5 tests for cost analysis. * Added vec3/vec5 arg test cases. Some of this patch is from Matt Arsenault, also of AMD. Differential Revision: https://reviews.llvm.org/D58928 Change-Id: I7279d6b4841464d2080eb255ef3c589e268eabcd llvm-svn: 356342
* [SCEV] Use depth limit for trunc analysisTeresa Johnson2019-03-121-1/+29
| | | | | | | | | | | | | | | | | | | Summary: This fixes an extremely long compile time caused by recursive analysis of truncs, which were not previously subject to any depth limits unlike some of the other ops. I decided to use the same control used for sext/zext, since the routines analyzing these are sometimes mutually recursive with the trunc analysis. Reviewers: mkazantsev, sanjoy Subscribers: sanjoy, jdoerfert, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D58994 llvm-svn: 355949
* [AMDGPU] Add support for 64 bit buffer atomic artihmetic instructionsRyan Taylor2019-03-062-60/+60
| | | | | | | | | | | | | | | | Summary: This adds support for 64 bit buffer atomic arithmetic instructions but does not include cmpswap as that depends on a fix to the way the register pairs are handled Change-Id: Ib207ea65fb69487ccad5066ea647ae8ddfe2ce61 Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, jfb, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D58918 llvm-svn: 355520
* [SCEV] Handle case where MaxBECount is less precise than ExactBECount for OR.Florian Hahn2019-03-021-0/+49
| | | | | | | | | | | | | | | | | | | | | | | | In some cases, MaxBECount can be less precise than ExactBECount for AND and OR (the AND case was PR26207). In the OR test case, both ExactBECounts are undef, but MaxBECount are different, so we hit the assertion below. This patch uses the same solution the AND case already uses. Assertion failed: ((isa<SCEVCouldNotCompute>(ExactNotTaken) || !isa<SCEVCouldNotCompute>(MaxNotTaken)) && "Exact is not allowed to be less precise than Max"), function ExitLimit This patch also consolidates test cases for both AND and OR in a single test case. Fixes https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=13245 Reviewers: sanjoy, efriedma, mkazantsev Reviewed By: sanjoy Differential Revision: https://reviews.llvm.org/D58853 llvm-svn: 355259
* [MemorySSA] Make insertDef insert corresponding phi nodes.Alina Sbirlea2019-02-272-0/+112
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: The original assumption for the insertDef method was that it would not materialize Defs out of no-where, hence it will not insert phis needed after inserting a Def. However, when cloning an instruction (use case used in LICM), we do materialize Defs "out of no-where". If the block receiving a Def has at least one other Def, then no processing is needed. If the block just received its first Def, we must check where Phi placement is needed. The only new usage of insertDef is in LICM, hence the trigger for the bug. But the original goal of the method also fails to apply for the move() method. If we move a Def from the entry point of a diamond to either the left or right blocks, then the merge block must add a phi. While this usecase does not currently occur, or may be viewed as an incorrect transformation, MSSA must behave corectly given the scenario. Resolves PR40749 and PR40754. Reviewers: george.burgess.iv Subscribers: sanjoy, jlebar, Prazek, jdoerfert, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D58652 llvm-svn: 355040
* [MemorySSA & SimpleLoopUnswitch] Update MemorySSA in ReplaceUsesOfWith.Alina Sbirlea2019-02-261-0/+153
| | | | | | | SimpleLoopUnswitch must update MemorySSA when removing instructions. Resolves PR39197. llvm-svn: 354919
* [TTI] Add generic cost model for smul/umul overflow intrinsicsSimon Pilgrim2019-02-251-36/+378
| | | | | | Based off smul/umul fixed costs and the implementation in TargetLowering::expandMULO. llvm-svn: 354784
* Fixed typos in tests: s/CEHCK/CHECK/Dmitri Gribenko2019-02-252-2/+2
| | | | | | | | | | | | Reviewers: ilya-biryukov Subscribers: sanjoy, sdardis, javed.absar, jrtc27, atanasyan, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D58608 llvm-svn: 354781
* [TTI] Add generic cost model for fixed point smul/umulSimon Pilgrim2019-02-251-36/+378
| | | | | | | | Based on an IR equivalent of target lowering's generic expansion - target specific costs will typically be lower (IR doesn't have a good mull/mulh equivalent) but we need a baseline. Differential Revision: https://reviews.llvm.org/D57925 llvm-svn: 354774
* [MemorySSA] Update test with minimized one. NFCIAlina Sbirlea2019-02-221-98/+23
| | | | llvm-svn: 354658
* [MemorySSA & LoopPassManager] Resolve PR40038.Alina Sbirlea2019-02-221-0/+83
| | | | | | | | | | | | | | The correct edge being deleted is not to the unswitched exit block, but to the original block before it was split. That's the key in the map, not the value. The insert is correct. The new edge is to the .split block. The splitting turns OriginalBB into: OriginalBB -> OriginalBB.split. Assuming the orignal CFG edge: ParentBB->OriginalBB, we must now delete ParentBB->OriginalBB, not ParentBB->OriginalBB.split. llvm-svn: 354656
* [MemorySSA & LoopPassManager] Update MemorySSA in formDedicatedExitBlocks.Alina Sbirlea2019-02-211-0/+108
| | | | | | | MemorySSA is now updated when forming dedicated exit blocks. Resolves PR40037. llvm-svn: 354623
* [BPI] Look through bitcasts in calcZeroHeuristicSam Parker2019-02-151-0/+103
| | | | | | | | | | | | Constant hoisting may have hidden a constant behind a bitcast so that it isn't folded into its users. However, this prevents BPI from calculating some of its heuristics that are based upon constant values. So, I've added a simple helper function to look through these casts. Differential Revision: https://reviews.llvm.org/D58166 llvm-svn: 354119
* [MemorySSA] Remove verifyClobberSanity.Alina Sbirlea2019-02-111-0/+54
| | | | | | | | | | | | | | | | | | Summary: This verification may fail after certain transformations due to BasicAA's fragility. Added a small explanation and a testcase that triggers the assert in checkClobberSanity (before its removal). Addresses PR40509. Reviewers: george.burgess.iv Subscribers: sanjoy, jlebar, llvm-commits, Prazek Tags: #llvm Differential Revision: https://reviews.llvm.org/D57973 llvm-svn: 353739
* [CostModel][X86] Add UMUL fixed point cost tests Simon Pilgrim2019-02-051-0/+63
| | | | llvm-svn: 353153
* [SCEV] Do not bother creating separate SCEVUnknown for unreachable nodesMax Kazantsev2019-02-041-1/+1
| | | | | | | | | | | Currently, SCEV creates SCEVUnknown for every node of unreachable code. If we have a huge amounts of such code, we will be littering SE with these nodes. We could just state that they all are undef and save some memory. Differential Revision: https://reviews.llvm.org/D57567 Reviewed By: sanjoy llvm-svn: 353017
* [DA][NewPM] Handle transitive dependencies in the new-pm version of DAPhilip Pfaffe2019-02-031-0/+16
| | | | | | | | | | | | | | | | Summary: The analysis result of DA caches pointers to AA, SCEV, and LI, but it never checks for their invalidation. Fix that. Reviewers: chandlerc, dmgreen, bogner Reviewed By: dmgreen Subscribers: hiraditya, bollu, javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D56381 llvm-svn: 352986
* [SCEV] Prohibit SCEV transformations for huge SCEVsMax Kazantsev2019-01-311-0/+41
| | | | | | | | | | | | | | | | | | | | | | Currently SCEV attempts to limit transformations so that they do not work with big SCEVs (that may take almost infinite compile time). But for this, it uses heuristics such as recursion depth and number of operands, which do not give us a guarantee that we don't actually have big SCEVs. This situation is still possible, though it is not likely to happen. However, the bug PR33494 showed a bunch of simple corner case tests where we still produce huge SCEVs, even not reaching big recursion depth etc. This patch introduces a concept of 'huge' SCEVs. A SCEV is huge if its expression size (intoduced in D35989) exceeds some threshold value. We prohibit optimizing transformations if any of SCEVs we are dealing with is huge. This gives us a reliable check that we don't spend too much time working with them. As the next step, we can possibly get rid of old limiting mechanisms, such as recursion depth thresholds. Differential Revision: https://reviews.llvm.org/D35990 Reviewed By: reames llvm-svn: 352728
* [SCEV] Take correct loop in AddRec simplification. PR40420Max Kazantsev2019-01-291-1/+0
| | | | | | | | | | | | The code of AddRec simplification is using wrong loop when it creates a new AddRecExpr. It should be using AddRecLoop which we have saved and against which all gate checks are made, and not calling AddRec->getLoop() over and over again because AddRec may change and become an AddRecurrency from outer loop during the transform iterations. Considering this change trivial, commiting for postcommit review. llvm-svn: 352451
* [NFC] Merge failing test from PR40420Max Kazantsev2019-01-291-0/+51
| | | | llvm-svn: 352450
* [CodeGen][X86] Expand UADDSAT to NOT+UMIN+ADDNikita Popov2019-01-281-4/+4
| | | | | | | | | Followup to D56636, this time handling the UADDSAT case by expanding uadd.sat(a, b) to umin(a, ~b) + b. Differential Revision: https://reviews.llvm.org/D56869 llvm-svn: 352409
* [AMDGPU] Add intrinsics for 16 bit interpolationTim Corringham2019-01-281-0/+25
| | | | | | | | | | | | | | | | | | | Summary: Added the intrinsics llvm.amdgcn.interp.p1.f16() and llvm.amdgcn.interp.p2.f16() and related LIT test. The p1 intrinsic generates code appropriate for both 16 and 32 bank LDS. Reviewers: #amdgpu, dstuttard, arsenm, tpr Reviewed By: #amdgpu, arsenm Subscribers: jvesely, mgorny, arsenm, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D46754 llvm-svn: 352357
* [TTI] Add generic SADDSAT/SSUBSAT costsSimon Pilgrim2019-01-271-196/+234
| | | | | | | | | | Add generic costs calculation for SADDSAT/SSUBSAT intrinsics, this uses generic costs for sadd_with_overflow/ssub_with_overflow, an extra sign comparison + a selects based on the sign/overflow. This completes PR40316 Differential Revision: https://reviews.llvm.org/D57239 llvm-svn: 352315
* [PowerPC] Update Vector Costs for P9Nemanja Ivanovic2019-01-261-0/+68
| | | | | | | | | | | | | For the power9 CPU, vector operations consume a pair of execution units rather than one execution unit like a scalar operation. Update the target transform cost functions to reflect the higher cost of vector operations when targeting Power9. Patch by RolandF. Differential revision: https://reviews.llvm.org/D55461 llvm-svn: 352261
* [CostModel][X86] Add SMUL fixed point cost tests Simon Pilgrim2019-01-241-0/+75
| | | | llvm-svn: 352046
* [TTI] Add generic SADDO/SSUBO costsSimon Pilgrim2019-01-241-36/+378
| | | | | | Added x86 scalar sadd_with_overflow/ssub_with_overflow costs. llvm-svn: 352045
* [TTI] Add generic UADDSAT/USUBSAT costsSimon Pilgrim2019-01-241-162/+181
| | | | | | | | Add generic costs calculation for UADDSAT/USUBSAT intrinsics, this fallbacks to using generic costs for uadd_with_overflow/usub_with_overflow + a select. Differential Revision: https://reviews.llvm.org/D56907 llvm-svn: 352044
OpenPOWER on IntegriCloud