path: root/llvm/test/Analysis
Commit message | Author | Age | Files | Lines
...
* [SystemZ] Add support for new cpu architecture - arch13 | Ulrich Weigand | 2019-07-12 | 3 | -18/+212
    This patch series adds support for the next-generation arch13 CPU architecture to the SystemZ backend. This includes:
    - Basic support for the new processor and its features.
    - Assembler/disassembler support for new instructions.
    - CodeGen for new instructions, including new LLVM intrinsics.
    - Scheduler description for the new processor.
    - Detection of arch13 as host processor.
    Note: No currently available Z system supports the arch13 architecture. Once new systems become available, the official system name will be added as supported -march name.
    llvm-svn: 365932
* [SCEV] teach SCEV symbolical execution about overflow intrinsics folding. | Chen Zheng | 2019-07-11 | 1 | -0/+128
    Differential Revision: https://reviews.llvm.org/D64422
    llvm-svn: 365726
* [LoopRotate + MemorySSA] Keep an <instruction-cloned instruction> map. | Alina Sbirlea | 2019-07-10 | 1 | -0/+26
    Summary: The map kept in loop rotate is used for instruction remapping, in order to simplify the clones of instructions. Thus, if an instruction can be simplified, its simplified value is placed in the map, even when the clone is added to the IR. MemorySSA in contrast needs to know about that clone, so it can add an access for it. To resolve this: keep a different map for MemorySSA.
    Reviewers: george.burgess.iv
    Subscribers: jlebar, Prazek, llvm-commits
    Tags: #llvm
    Differential Revision: https://reviews.llvm.org/D63680
    llvm-svn: 365672
* Add, and infer, a nofree function attribute | Brian Homerding | 2019-07-08 | 1 | -2/+2
    This patch adds a function attribute, nofree, to indicate that a function does not, directly or indirectly, call a memory-deallocation function (e.g., free, C++'s operator delete).
    Reviewers: jdoerfert
    Differential Revision: https://reviews.llvm.org/D49165
    llvm-svn: 365336
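The "directly or indirectly" wording above amounts to a reachability question over the call graph. The following is a minimal sketch of that idea, not the inference LLVM actually performs: the call-graph dictionary, the `DEALLOCATORS` seed set, and `infer_nofree` are all hypothetical names introduced for illustration, and any callee without a known body is conservatively assumed to free memory.

```python
# Hypothetical seed set of deallocation functions (illustration only).
DEALLOCATORS = {"free", "operator delete"}


def infer_nofree(call_graph):
    """Return the functions in call_graph that can never reach a deallocator.

    call_graph maps a function name to the set of functions it calls.
    Callees with no entry in call_graph are treated as possibly freeing.
    """
    may_free = set(DEALLOCATORS)

    def callee_may_free(callee):
        # Unknown (external) callees are handled conservatively.
        return callee in may_free or callee not in call_graph

    # Propagate "may free" up the call graph until nothing changes.
    changed = True
    while changed:
        changed = False
        for fn, callees in call_graph.items():
            if fn not in may_free and any(callee_may_free(c) for c in callees):
                may_free.add(fn)
                changed = True
    return {fn for fn in call_graph if fn not in may_free}


graph = {
    "release": {"free"},     # frees directly
    "cleanup": {"release"},  # frees indirectly
    "add": set(),            # calls nothing
    "sum3": {"add"},         # only calls nofree functions
}
print(sorted(infer_nofree(graph)))
```

In this toy graph only `add` and `sum3` qualify; `cleanup` is excluded because it reaches `free` through `release`.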
* Revert "[IRBuilder] Fold consistently for or/and whether constant is LHS or RHS" | Petr Hosek | 2019-07-07 | 1 | -5/+19
    This reverts commit r365260 which broke the following tests:
    Clang :: CodeGenCXX/cfi-mfcall.cpp
    Clang :: CodeGenObjC/ubsan-nullability.m
    LLVM :: Transforms/LoopVectorize/AArch64/pr36032.ll
    llvm-svn: 365284
* [IRBuilder] Fold consistently for or/and whether constant is LHS or RHS | Philip Reames | 2019-07-06 | 1 | -19/+5
    Without this, we have the unfortunate property that tests are dependent on the order of operands passed to the CreateOr and CreateAnd functions. In actual usage, we'd promptly optimize them away, but it made tests slightly more verbose than they should have been.
    llvm-svn: 365260
* Teach ValueTracking that aarch64.irg result aliases its input. | Evgeniy Stepanov | 2019-07-03 | 1 | -0/+18
    Reviewers: javed.absar, olista01
    Subscribers: kristof.beyls, hiraditya, llvm-commits
    Tags: #llvm
    Differential Revision: https://reviews.llvm.org/D64103
    llvm-svn: 365079
* Revert Recommit [PowerPC] Update P9 vector costs for insert/extract element | Jordan Rupprecht | 2019-07-01 | 1 | -24/+24
    This reverts r364557 (git commit 9f7f5858fe46b8e706e87a83e2fd0a2678be619e)
    This crashes as reported on the commit thread. Repro instructions TBD.
    llvm-svn: 364876
* Update -analyze -scalar-evolution output for multiple exit loops w/computable exit values | Philip Reames | 2019-06-27 | 1 | -0/+4
    The previous output was next to useless if *any* exit was not computable. If we have more than one exit, show the exit count for each so that it's easier to see what's going on with SCEV analysis when debugging.
    llvm-svn: 364579
* Recommit [PowerPC] Update P9 vector costs for insert/extract element | Roland Froese | 2019-06-27 | 1 | -24/+24
    Recommit patch D60160 after regression fix patch D63463.
    llvm-svn: 364557
* [LoopUnroll] Add support for loops with exiting headers and uncond latches. | Florian Hahn | 2019-06-26 | 1 | -1/+5
    This patch generalizes the UnrollLoop utility to support loops that exit from the header instead of the latch. Usually, LoopRotate would take care of most of those cases, but in some cases (e.g. -Oz), LoopRotate does not kick in.
    Codesize impact looks relatively neutral on ARM64 with -Oz + LTO.
    Program                                          master      patch       diff
    External/S.../CFP2006/447.dealII/447.dealII      629060.00   627676.00   -0.2%
    External/SPEC/CINT2000/176.gcc/176.gcc           1245916.00  1244932.00  -0.1%
    MultiSourc...Prolangs-C/simulator/simulator      86100.00    86156.00    0.1%
    MultiSourc...arks/Rodinia/backprop/backprop      66212.00    66252.00    0.1%
    MultiSourc...chmarks/Prolangs-C++/life/life      67276.00    67312.00    0.1%
    MultiSourc...s/Prolangs-C/compiler/compiler      69824.00    69788.00    -0.1%
    MultiSourc...Prolangs-C/assembler/assembler      86672.00    86696.00    0.0%
    Reviewers: efriedma, vsk, paquette
    Reviewed By: paquette
    Differential Revision: https://reviews.llvm.org/D61962
    llvm-svn: 364398
* [LICM & MSSA] Fixed test to run only with assertions enabled as it uses -debug-only | Yevgeny Rouban | 2019-06-21 | 1 | -0/+1
    llvm-svn: 364005
* [LICM & MSSA] Limit unsafe sinking and hoisting. | Alina Sbirlea | 2019-06-20 | 1 | -0/+48
    Summary: The getClobberingMemoryAccess API checks for clobbering accesses in a loop by walking the backedge. This may check if a memory access is being clobbered by the loop in a previous iteration, depending on how smart AA got over the course of the updates in MemorySSA (it does not occur when built from scratch). If no clobbering access is found inside the loop, it will optimize to an access outside the loop. This however does not mean that access is safe to sink.
    Given:
    ```
    for i
      load a[i]
      store a[i]
    ```
    The access corresponding to the load can be optimized to outside the loop, and the load can be hoisted. But it is incorrect to sink it. In order to sink the load, we'd need to check that no Def clobbers the Use in the same iteration. With this patch we currently restrict sinking to either Defs not existing in the loop, or Defs preceding the load in the same block. An easy extension is to ensure the load (Use) post-dominates all Defs.
    Caught by PR42294.
    This issue also shed light on the converse problem: hoisting stores in this same scenario would be illegal. With this patch we restrict hoisting of stores to the case when their corresponding Defs are dominating all Uses in the loop.
    Reviewers: george.burgess.iv
    Subscribers: jlebar, Prazek, llvm-commits
    Tags: #llvm
    Differential Revision: https://reviews.llvm.org/D63582
    llvm-svn: 363982
* [MemorySSA] Cleanup trivial phis. | Alina Sbirlea | 2019-06-19 | 1 | -0/+113
    Summary: This is unfortunately needed for correctness, if we are to extend the tolerance of the update API to the way simple loop unswitch is doing cloning.
    In simple loop unswitch (as opposed to loop unswitch), not all blocks are cloned. This can create unreachable cloned blocks (no predecessor), which are later cleaned up.
    In MemorySSA, the APIs for supporting these kinds of updates (clone + update exit blocks) make certain assumptions on the integrity of the CFG. When cloning, if something was not cloned, its values in MemorySSA default to LiveOnEntry. When updating exit blocks, it is safe to assume that we can first insert phis in the blocks merging two clones, then add additional phis in the IDF of the blocks that received phis. This no longer holds true if one of the clones being merged comes from an unreachable block. We'd conservatively need to add all phis before filling in their incoming definitions. In practice this restriction can be relaxed if we clean up trivial phis after the first round of insertion.
    Reviewers: george.burgess.iv
    Subscribers: jlebar, Prazek, llvm-commits
    Tags: #llvm
    Differential Revision: https://reviews.llvm.org/D63354
    llvm-svn: 363880
* [ConstantFolding] Add constant folding for smul.fix and smul.fix.sat | Bjorn Pettersson | 2019-06-19 | 2 | -0/+244
    Summary: This patch teaches ConstantFolding to constant fold both scalar and vector variants of llvm.smul.fix and llvm.smul.fix.sat.
    As described in the LangRef, rounding is unspecified for these intrinsics. If the result cannot be represented exactly, the default behavior in ConstantFolding is to round down towards negative infinity. If a target has a preferred rounding that is different, some kind of target hook would be needed (same strategy as used by the SelectionDAG legalizer).
    Reviewers: nikic, leonardchan, RKSimon
    Reviewed By: leonardchan
    Subscribers: hiraditya, llvm-commits
    Tags: #llvm
    Differential Revision: https://reviews.llvm.org/D63385
    llvm-svn: 363811
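The rounding rule described above can be illustrated on plain integers. This is a sketch under stated assumptions rather than the ConstantFolding code itself: operands are signed fixed-point values with `scale` fractional bits, the helper names `smul_fix` and `smul_fix_sat` are made up for illustration, and the non-saturating helper does not model what happens when the result overflows the type.

```python
def smul_fix(a, b, scale):
    """Multiply two fixed-point values that each carry `scale` fractional bits.

    The full-precision product has 2*scale fractional bits. Python's >> on
    ints is a floor shift, so dropping `scale` bits rounds toward negative
    infinity, matching the default rounding described in the commit message.
    """
    return (a * b) >> scale


def smul_fix_sat(a, b, scale, bits):
    """Same product, clamped to the signed range of a `bits`-wide integer."""
    lo = -(1 << (bits - 1))
    hi = (1 << (bits - 1)) - 1
    return max(lo, min(hi, smul_fix(a, b, scale)))


# With scale=1 the raw value 3 encodes 1.5 and -1 encodes -0.5.
# The exact product is -0.75, which is not representable; rounding
# toward negative infinity gives -1.0, encoded as -2.
print(smul_fix(3, -1, 1))
print(smul_fix_sat(127, 127, 0, 8))
```

Rounding toward zero would instead give -0.5 (raw -1) for the first case, which is why the choice of rounding is observable in folded constants.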
* [MemorySSA] Don't use template when the clone is a simplified instruction. | Alina Sbirlea | 2019-06-17 | 1 | -0/+27
    Summary: LoopRotate doesn't create a faithful clone of an instruction; it may simplify it beforehand. Hence the clone of an instruction that has a MemoryDef associated may not be a definition, but a use or not a memory-altering instruction. Don't rely on the template when the clone may be simplified.
    Reviewers: george.burgess.iv
    Subscribers: jlebar, Prazek, llvm-commits
    Tags: #llvm
    Differential Revision: https://reviews.llvm.org/D63355
    llvm-svn: 363597
* [MemorySSA] Add all MemoryPhis before filling their values. | Alina Sbirlea | 2019-06-17 | 1 | -0/+51
    Summary: Add all MemoryPhis in IDF before filling in their incoming values. Otherwise, a new Phi can be added that needs to become the incoming value of another Phi. Test fails the verification in verifyPrevDefInPhis.
    Reviewers: george.burgess.iv
    Subscribers: jlebar, Prazek, zzheng, llvm-commits
    Tags: #llvm
    Differential Revision: https://reviews.llvm.org/D63353
    llvm-svn: 363590
* [lit] Delete empty lines at the end of lit.local.cfg NFC | Fangrui Song | 2019-06-17 | 4 | -4/+0
    llvm-svn: 363538
* [SCEV] Use unsigned/signed intersection type in SCEV | Nikita Popov | 2019-06-15 | 5 | -11/+11
    Based on D59959, this switches SCEV to use unsigned/signed range intersection based on the sign hint. This will prefer non-wrapping ranges in the relevant domain.
    I've left the one intersection in getRangeForAffineAR() to use the smallest intersection heuristic, as there doesn't seem to be any obvious preference there.
    Differential Revision: https://reviews.llvm.org/D60035
    llvm-svn: 363490
* [AMDGPU] ImmArg and SourceOfDivergence for permlane/dpp | Stanislav Mekhanoshin | 2019-06-13 | 1 | -0/+40
    Added missing ImmArg and SourceOfDivergence to the crosslane intrinsics.
    Differential Revision: https://reviews.llvm.org/D63216
    llvm-svn: 363276
* Improve reduction intrinsics by overloading result value. | Sander de Smalen | 2019-06-13 | 19 | -4144/+4144
    This patch uses the mechanism from D62995 to strengthen the definitions of the reduction intrinsics by letting the scalar result/accumulator type be overloaded from the vector element type.
    For example:
      ; The LLVM LangRef specifies that the scalar result must equal the
      ; vector element type, but this is not checked/enforced by LLVM.
      declare i32 @llvm.experimental.vector.reduce.or.i32.v4i32(<4 x i32> %a)
    This patch changes that into:
      declare i32 @llvm.experimental.vector.reduce.or.v4i32(<4 x i32> %a)
    This makes the type constraint more explicit and causes LLVM to check the result type against the vector element type.
    Reviewers: RKSimon, arsenm, rnk, greened, aemerson
    Reviewed By: arsenm
    Differential Revision: https://reviews.llvm.org/D62996
    llvm-svn: 363240
* LoopDistribute/LAA: Respect convergent | Matt Arsenault | 2019-06-12 | 1 | -0/+73
    This case is slightly tricky, because loop distribution should be allowed in some cases, and not others. As long as runtime dependency checks don't need to be introduced, this should be OK. This is further complicated by the fact that LoopDistribute partially ignores if LAA says that vectorization is safe, and then does its own runtime pointer legality checks.
    Note this pass still does not handle noduplicate correctly, as this should always be forbidden with it. I'm not going to bother trying to fix it, as it would require more effort and I think noduplicate should be removed.
    https://reviews.llvm.org/D62607
    llvm-svn: 363160
* [MemorySSA] When applying updates, clean unnecessary Phis. | Alina Sbirlea | 2019-06-11 | 1 | -0/+78
    Summary: After applying a set of insert updates, there may be trivial Phis left over. Clean them up.
    Reviewers: george.burgess.iv
    Subscribers: jlebar, Prazek, llvm-commits
    Tags: #llvm
    Differential Revision: https://reviews.llvm.org/D63033
    llvm-svn: 363094
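For readers unfamiliar with the term, a phi is usually called trivial when all of its incoming values, ignoring references to itself, are one and the same value, so the phi can be replaced by that value. The sketch below shows that general SSA notion on a toy dictionary; it is not MemorySSA's update code, and `trivial_phi_replacement`, `remove_trivial_phis`, and the string names are hypothetical.

```python
def trivial_phi_replacement(phi, incoming):
    """Return the single value a trivial phi stands for, or None."""
    unique = {value for value in incoming if value != phi}
    return next(iter(unique)) if len(unique) == 1 else None


def remove_trivial_phis(phis):
    """Remove trivial phis until none remain.

    phis maps a phi name to its list of incoming value names.
    Returns (remaining_phis, replacements).
    """
    phis = {name: list(values) for name, values in phis.items()}
    replacements = {}
    changed = True
    while changed:
        changed = False
        for phi in list(phis):
            same = trivial_phi_replacement(phi, phis[phi])
            if same is None:
                continue
            del phis[phi]
            # Keep earlier replacements pointing at a live value.
            replacements = {k: (same if v == phi else v)
                            for k, v in replacements.items()}
            replacements[phi] = same
            # Rewriting uses may make other phis trivial in turn.
            for other in phis:
                phis[other] = [same if v == phi else v for v in phis[other]]
            changed = True
    return phis, replacements


remaining, replaced = remove_trivial_phis({
    "phi1": ["def1", "def1"],
    "phi2": ["phi1", "def1"],
    "phi3": ["def1", "def2"],
})
print(remaining, replaced)
```

Here `phi2` only becomes trivial after `phi1` is replaced, which is why the cleanup iterates to a fixed point.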
* [DA] Add an option to control delinearization validity checks | Whitney Tsang | 2019-06-10 | 1 | -0/+237
    Summary: Dependence Analysis performs static checks to confirm validity of delinearization. These checks often fail for 64-bit targets due to type conversions and integer wrapping that prevent simplification of the SCEV expressions. These checks would also fail at compile-time if the lower bounds of the loops are compile-time unknown.
    Author: bmahjour
    Reviewer: Meinersbur, jdoerfert, kbarton, dmgreen, fhahn
    Reviewed By: Meinersbur, jdoerfert, dmgreen
    Subscribers: fhahn, hiraditya, javed.absar, llvm-commits, Whitney, etiotto
    Tag: LLVM
    Differential Revision: https://reviews.llvm.org/D62610
    llvm-svn: 362952
* [ARM] Adjust isLegalT1AddressImmediate for non-legal types | David Green | 2019-06-08 | 1 | -17/+17
    Types such as float and i64 do not have legal loads in Thumb1, but will still be loaded with an LDR (or potentially multiple LDRs). As such we can treat the cost of addressing mode calculations the same as an i32 and get some optimisation benefits.
    Differential Revision: https://reviews.llvm.org/D62968
    llvm-svn: 362874
* [ARM] Add MVE addressing to isLegalT2AddressImmediate | David Green | 2019-06-08 | 1 | -30/+30
    Now with MVE being added, we can add the vector addressing mode costs for it. These are generally imm7 multiplied by the size of the type being loaded / stored.
    Differential Revision: https://reviews.llvm.org/D62967
    llvm-svn: 362873
* [ARM] Add fp16 addressing to isLegalT2AddressImmediate | David Green | 2019-06-08 | 1 | -11/+11
    The fp16 version of VLDR takes an imm8 multiplied by 2. This updates the costs to account for those, and adds extra testing. It is dependent upon hasFPRegs16 as this is what the load/store instructions require.
    Differential Revision: https://reviews.llvm.org/D62966
    llvm-svn: 362872
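Both this commit and the MVE one describe offsets of the form "N-bit immediate multiplied by a scale". A minimal sketch of such a legality check follows; it is not the body of isLegalT2AddressImmediate. The function name `is_legal_scaled_imm` is hypothetical, and the check assumes the offset is encoded as a magnitude plus an add/subtract flag, so positive and negative offsets have the same range.

```python
def is_legal_scaled_imm(offset, imm_bits, scale):
    """Check whether a byte offset fits an imm_bits-wide immediate times scale.

    Assumption (illustration only): the sign is encoded separately, so the
    magnitude must be a multiple of `scale` and offset/scale must fit in
    imm_bits unsigned bits.
    """
    if offset % scale != 0:
        return False
    return abs(offset) // scale < (1 << imm_bits)


# fp16 VLDR as described above: imm8 multiplied by 2.
print(is_legal_scaled_imm(510, 8, 2))   # largest magnitude that fits
print(is_legal_scaled_imm(512, 8, 2))   # one step too far
print(is_legal_scaled_imm(3, 8, 2))     # not a multiple of the scale

# MVE-style imm7 multiplied by the element size, here 4 bytes.
print(is_legal_scaled_imm(-508, 7, 4))
```

Under these assumptions the fp16 case accepts even offsets from -510 to 510, and the 4-byte imm7 case accepts multiples of 4 from -508 to 508.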
* [ARM] Add extra gep costmodel tests for MVE and half float. NFC | David Green | 2019-06-08 | 1 | -73/+553
    llvm-svn: 362871
* [CFLGraph] Add support for unary fneg instruction. | Craig Topper | 2019-06-06 | 1 | -1/+2
    Differential Revision: https://reviews.llvm.org/D62791
    llvm-svn: 362737
* [RISCV] Disable test/Analysis/CostModel/RISCV tests if RISCV backend not built | Luis Marques | 2019-06-06 | 1 | -0/+3
    Adds missing lit.local.cfg. Fixes rL362691.
    llvm-svn: 362693
* [RISCV] Add CostModel GEP tests | Luis Marques | 2019-06-06 | 1 | -0/+189
    Differential Revision: https://reviews.llvm.org/D61185
    llvm-svn: 362691
* TTI: Improve default costs for addrspacecast | Matt Arsenault | 2019-06-03 | 1 | -6/+27
    For some reason multiple places need to do this, and the variant the loop unroller and inliner use was not handling it.
    Also, introduce a new wrapper to be slightly more precise, since on AMDGPU some addrspacecasts are free, but not no-ops.
    llvm-svn: 362436
* [CostModel][X86] Improve masked load/store AVX1/AVX2 costs | Simon Pilgrim | 2019-06-02 | 2 | -76/+76
    A mixture of internal tests and review of the scheduler models indicates we're overestimating the cost of a masked load, which we're estimating at 4x regular memory ops; more realistic values indicate that it's closer to 2x.
    Masked store costs are a lot more diverse, but 8x is roughly in the middle of the range.
    e.g. SandyBridge
      defm : X86WriteRes<WriteFMaskedLoad, [SBPort23,SBPort05], 8, [1,2], 3>;
      defm : X86WriteRes<WriteFMaskedLoadY, [SBPort23,SBPort05], 9, [1,2], 3>;
      defm : X86WriteRes<WriteFMaskedStore, [SBPort4,SBPort01,SBPort23], 5, [1,1,1], 3>;
      defm : X86WriteRes<WriteFMaskedStoreY, [SBPort4,SBPort01,SBPort23], 5, [1,1,1], 3>;
    e.g. Btver2
      defm : X86WriteRes<WriteFMaskedLoad, [JLAGU, JFPU01, JFPX], 6, [1, 2, 2], 1>;
      defm : X86WriteRes<WriteFMaskedLoadY, [JLAGU, JFPU01, JFPX], 6, [2, 4, 4], 2>;
      defm : X86WriteRes<WriteFMaskedStore, [JSAGU, JFPU01, JFPX], 6, [1, 1, 4], 1>;
      defm : X86WriteRes<WriteFMaskedStoreY, [JSAGU, JFPU01, JFPX], 6, [2, 2, 4], 2>;
    Differential Revision: https://reviews.llvm.org/D61257
    llvm-svn: 362338
* [CostModel][X86] Add bool vector and/or/xor cost tests | Simon Pilgrim | 2019-05-30 | 1 | -0/+192
    llvm-svn: 362083
* [CostModel] Add really basic support for being able to query the cost of the FNeg instruction. | Craig Topper | 2019-05-28 | 1 | -34/+78
    Summary: This reuses the getArithmeticInstrCost, but passes dummy values of the second operand flags.
    The X86 costs are wrong and can be improved in a follow up. I just wanted to stop it from reporting an unknown cost first.
    Reviewers: RKSimon, spatel, andrew.w.kaylor, cameron.mcinally
    Reviewed By: spatel
    Subscribers: hiraditya, llvm-commits
    Tags: #llvm
    Differential Revision: https://reviews.llvm.org/D62444
    llvm-svn: 361788
* [MustExecute] Improve MustExecute to correctly handle loop nest | Xing Xue | 2019-05-27 | 1 | -5/+3
    Summary:
      for.outer:
        br for.inner
      for.inner:
        LI <loop invariant load instruction>
      for.inner.latch:
        br for.inner, for.outer.latch
      for.outer.latch:
        br for.outer, for.outer.exit
    LI is a loop invariant load instruction that post-dominates for.outer, so LI should be able to move out of the loop nest. However, there is a bug in allLoopPathsLeadToBlock().
    Current algorithm of allLoopPathsLeadToBlock():
      1. get all the transitive predecessors of the basic block LI belongs to (for.inner) ==> for.outer, for.inner.latch
      2. if any successors of any of the predecessors are not for.inner or for.inner's predecessors, then return false
      3. return true
    Although for.inner.latch is for.inner's predecessor, for.inner dominates for.inner.latch, which means if for.inner.latch is ever executed, for.inner should be as well. It should not return false for cases like this.
    Author: Whitney (committed by xingxue)
    Reviewers: kbarton, jdoerfert, Meinersbur, hfinkel, fhahn
    Reviewed By: jdoerfert
    Subscribers: hiraditya, jsji, llvm-commits, etiotto, bmahjour
    Tags: #LLVM
    Differential Revision: https://reviews.llvm.org/D62418
    llvm-svn: 361762
* [X86] Add test cases for D62444. NFC | Craig Topper | 2019-05-27 | 1 | -0/+171
    llvm-svn: 361745
* [SimplifyCFG] Added condition assumption for unreachable blocks | David Bolvansky | 2019-05-25 | 1 | -0/+2
    Summary: PR41688
    Reviewers: spatel, efriedma, craig.topper, hfinkel, reames
    Reviewed By: hfinkel
    Subscribers: javed.absar, dmgreen, fhahn, hfinkel, reames, nikic, lebedev.ri, llvm-commits
    Tags: #llvm
    Differential Revision: https://reviews.llvm.org/D61409
    llvm-svn: 361707
* [NFC] Update test checks | David Bolvansky | 2019-05-25 | 1 | -0/+1
    llvm-svn: 361695
* [CodeMetrics] Don't let extends of i1 be free. | Jonas Paulsson | 2019-05-17 | 1 | -0/+53
    getUserCost() currently returns TCC_Free for any extend of a compare (i1) result. It seems this is only true in a limited number of cases where for example two compares are chained. Even in those types of cases it seems unlikely that they are generally free, while they may be in some cases.
    This patch therefore removes this special handling of cast of i1. No tests are failing because of this. If some target wants the old behavior, it could override getUserCost().
    Review: Hal Finkel, Chandler Carruth, Evgeny Astigeevich, Simon Pilgrim, Ulrich Weigand
    https://reviews.llvm.org/D54742/new/
    llvm-svn: 360970
* [MemorySSA] LoopSimplify preserves MemorySSA only when flag is flipped. | Alina Sbirlea | 2019-05-14 | 1 | -0/+16
    LoopSimplify can preserve MemorySSA after r360270. But the MemorySSA analysis is retrieved and preserved only when the EnableMSSALoopDependency is set to true. Use the same conditional to mark the pass as preserved, otherwise subsequent passes will get an invalid analysis.
    Resolves PR41853.
    llvm-svn: 360697
* [CostModel][X86] Add min/max reduction costs for all SSE targets | Simon Pilgrim | 2019-05-11 | 8 | -559/+559
    The original costs stopped at SSE42, I've added conservative estimates for everything down to SSE1/SSE2 and moved some of the SSE42 costs to SSE41 (really only the addition of PCMPGT makes any difference).
    I've also added missing vXi8 costs (we use PHMINPOSUW for i8/i16 for scarily quick results) and 256-bit vector costs for AVX1.
    llvm-svn: 360528
* [MemorySSA] Fix CHECKs in test. [NFC] | Alina Sbirlea | 2019-05-07 | 1 | -2/+5
    llvm-svn: 360201
* [SCEV] Add explicit representations of umin/smin | Keno Fischer | 2019-05-07 | 8 | -9/+59
    Summary: Currently we express umin as `~umax(~x, ~y)`. However, this becomes a problem for operands in non-integral pointer spaces, because `~x` is not something we can compute for `x` non-integral. However, since comparisons are generally still allowed, we are actually able to express `umin(x, y)` directly as long as we don't try to express it as a umax. Support this by adding an explicit umin/smin representation to SCEV. We do this by factoring the existing getUMax/getSMax functions into a new function that does all four. The previous two functions were largely identical.
    Reviewed By: sanjoy
    Differential Revision: https://reviews.llvm.org/D50167
    llvm-svn: 360159
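The identity `umin(x, y) = ~umax(~x, ~y)` quoted above holds for fixed-width unsigned integers because bitwise NOT reverses their order. A small sketch that checks it exhaustively for a narrow width is shown here; `umin_via_umax` is a hypothetical helper, and Python's unbounded integers are masked to `bits` bits to emulate a fixed-width type.

```python
def umin_via_umax(x, y, bits):
    """Compute unsigned min through the ~umax(~x, ~y) identity."""
    mask = (1 << bits) - 1
    not_x = ~x & mask  # fixed-width bitwise NOT, equal to mask - x
    not_y = ~y & mask
    return ~max(not_x, not_y) & mask


# Exhaustive check over all 4-bit operand pairs.
BITS = 4
all_match = all(
    umin_via_umax(x, y, BITS) == min(x, y)
    for x in range(1 << BITS)
    for y in range(1 << BITS)
)
print(all_match)
```

The sketch also makes the commit's concern visible: the identity needs `~x` to be computable, which is exactly what is unavailable for non-integral pointers, whereas a direct comparison is not.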
* Add FNeg support to InstructionSimplify | Cameron McInally | 2019-05-06 | 1 | -11/+0
    Differential Revision: https://reviews.llvm.org/D61573
    llvm-svn: 360053
* Precommit an FNeg InstructionSimplify test. | Cameron McInally | 2019-05-05 | 1 | -0/+11
    llvm-svn: 359990
* Add FNeg IR constant folding support | Cameron McInally | 2019-05-05 | 1 | -0/+42
    llvm-svn: 359982
* [MemorySSA] Check that block is reachable when adding phis. | Alina Sbirlea | 2019-05-02 | 1 | -0/+103
    Summary: Originally the insertDef method was only used when building MemorySSA, and was limiting the number of Phi nodes that it created. Now it's used for updates as well, and it can create additional Phis needed for correctness. Make sure no Phis are created in unreachable blocks (condition met during MSSA build), otherwise the renamePass will find a null DTNode.
    Resolves PR41640.
    Reviewers: george.burgess.iv
    Subscribers: jlebar, Prazek, llvm-commits
    Tags: #llvm
    Differential Revision: https://reviews.llvm.org/D61410
    llvm-svn: 359845
* Revert "[llvm] r359313 - [PowerPC] Update P9 vector costs for insert/extract element" | David L. Jones | 2019-05-01 | 1 | -24/+24
    This causes segfaults during optimized builds. More details, including a reproducer, are on the llvm-commits thread for r359313.
    llvm-svn: 359648
* [MemorySSA] Invalidate MemorySSA if AA or DT are invalidated. | Alina Sbirlea | 2019-04-30 | 1 | -0/+50
    Summary: MemorySSA keeps internal pointers of AA and DT. If these get invalidated, so should MemorySSA.
    Reviewers: george.burgess.iv, chandlerc
    Subscribers: jlebar, Prazek, llvm-commits
    Tags: LLVM
    Differential Revision: https://reviews.llvm.org/D61043
    llvm-svn: 359627