path: root/llvm/lib/Transforms
Commit message | Author | Age | Files | Lines
* [InstCombine] Handle -(X-Y) --> (Y-X) for unary fneg when NSZ | Cameron McInally | 2019-06-11 | 1 | -1/+10
  Differential Revision: https://reviews.llvm.org/D62612
  llvm-svn: 363082
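Why the fold needs NSZ (no signed zeros) can be seen with plain IEEE-754 doubles. The following is a minimal Python sketch, not LLVM code; the function names are made up for illustration. When X equals Y, the two forms produce zeros of opposite sign.

```python
import math


def neg_of_sub(x, y):
    """Original form: -(X - Y)."""
    return -(x - y)


def swapped_sub(x, y):
    """Folded form: Y - X."""
    return y - x


def sign_bit(value):
    """True if the IEEE-754 sign bit is set (distinguishes -0.0 from +0.0)."""
    return math.copysign(1.0, value) < 0


# With X == Y the results compare equal but differ in the sign of zero:
# -(1.0 - 1.0) is -0.0, while 1.0 - 1.0 is +0.0.
a = neg_of_sub(1.0, 1.0)
b = swapped_sub(1.0, 1.0)
```

For operands that differ, both forms give the same value, so only the sign of a zero result is at stake.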
* [InstCombine] Update (fptrunc (fneg x)) -> (fneg (fptrunc x)) for unary FNeg | Cameron McInally | 2019-06-11 | 1 | -4/+12
  Differential Revision: https://reviews.llvm.org/D62629
  llvm-svn: 363080
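The fold relies on negation commuting with truncation to a narrower float type. A small Python model, assuming round-to-nearest conversion from binary64 to binary32 (the helper `fptrunc_f32` is hypothetical, not an LLVM API), illustrates this on a few sample values:

```python
import struct


def fptrunc_f32(x):
    """Round a Python float (binary64) to the nearest binary32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


samples = [0.1, -2.5, 1.0 / 3.0, 123456.789, 0.0]

# Pair up fptrunc(fneg x) with fneg(fptrunc x) for each sample.
results = [(fptrunc_f32(-x), -fptrunc_f32(x)) for x in samples]
```

Each pair holds two identical values, since round-to-nearest is symmetric around zero.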
* [DebugInfo@O2][LoopVectorize] pr39024: Vectorized code linenos step through loop even after completion | Orlando Cazalet-Hyams | 2019-06-11 | 2 | -7/+21
  Summary:
  Bug: https://bugs.llvm.org/show_bug.cgi?id=39024
  The bug reports that a vectorized loop is stepped through 4 times and each step through the loop seemed to show a different path. I found two problems here:
  A) An incorrect line number on a preheader block (for.body.preheader) instruction causes a step into the loop before it begins.
  B) Instructions in the middle block have different line numbers which give the impression of another iteration.
  In this patch I give all of the middle block instructions the line number of the scalar loop latch terminator branch. This seems to provide the smoothest debugging experience because the vectorized loops will always end on this line before dropping into the scalar loop. To solve problem A I have altered llvm::SplitBlockPredecessors to accommodate loop header blocks.
  I have set up a separate review D61933 for a fix which is required for this patch.
  Reviewers: samsonov, vsk, aprantl, probinson, anemet, hfinkel, jmorse
  Reviewed By: hfinkel, jmorse
  Subscribers: jmorse, javed.absar, eraman, kcc, bjope, jmellorcrummey, hfinkel, gbedwell, hiraditya, zzheng, llvm-commits
  Tags: #llvm, #debug-info
  Differential Revision: https://reviews.llvm.org/D60831
  llvm-svn: 363046
* Change semantics of fadd/fmul vector reductions. | Sander de Smalen | 2019-06-11 | 1 | -8/+4
  This patch changes how LLVM handles the accumulator/start value in the reduction, by never ignoring it regardless of the presence of fast-math flags on callsites. This change introduces the following new intrinsics to replace the existing ones:
    llvm.experimental.vector.reduce.fadd -> llvm.experimental.vector.reduce.v2.fadd
    llvm.experimental.vector.reduce.fmul -> llvm.experimental.vector.reduce.v2.fmul
  and adds functionality to auto-upgrade existing LLVM IR and bitcode.
  Reviewers: RKSimon, greened, dmgreen, nikic, simoll, aemerson
  Reviewed By: nikic
  Differential Revision: https://reviews.llvm.org/D60261
  llvm-svn: 363035
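The semantic difference can be modelled in a few lines. This is a rough Python sketch under the assumption, taken from the commit message, that the old intrinsic could ignore the start value when fast-math flags were present; the function names are invented for illustration and do not correspond to LLVM APIs.

```python
from functools import reduce


def reduce_fadd_v2(start, vec):
    """New semantics: the start value is always folded into the result."""
    return reduce(lambda acc, x: acc + x, vec, start)


def reduce_fadd_old_with_fmf(start, vec):
    """Model of the old behaviour with fast-math flags: start is ignored."""
    return reduce(lambda acc, x: acc + x, vec, 0.0)


new_result = reduce_fadd_v2(10.0, [1.0, 2.0, 3.0])
old_result = reduce_fadd_old_with_fmf(10.0, [1.0, 2.0, 3.0])
```

Under this model the two results differ by exactly the start value, which is why existing IR has to be auto-upgraded rather than simply renamed.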
* [PGO] Fix the buildbot failure in r362995 | Rong Xu | 2019-06-10 | 1 | -5/+2
  Fixed one unused variable warning.
  llvm-svn: 363004
* [PGO] Handle cases of non-instrument BBs | Rong Xu | 2019-06-10 | 1 | -38/+85
  As shown in PR41279, some basic blocks (such as catchswitch) cannot be instrumented. This patch filters out these BBs in PGO instrumentation. It also sets the profile count to the fail-to-instrument edge, so that we can propagate the counts in the CFG.
  Differential Revision: https://reviews.llvm.org/D62700
  llvm-svn: 362995
* [LFTR] Use recomputed BE count | Philip Reames | 2019-06-10 | 1 | -9/+1
  This was discussed as part of D62880. The basic thought is that computing BE taken count after widening should produce (on average) an equally good backedge taken count as the one before widening. Since there's only one test in the suite which is impacted by this change, and it's essentially equivalent codegen, that seems to be a reasonable assertion.
  This change was separated from r362971 so that if this turns out to be problematic, the triggering piece is obvious and easily revertable.
  For the nestedIV example from elim-extend.ll, we end up with the following BE counts:
    BEFORE: (-2 + (-1 * %innercount) + %limit)
    AFTER: (-1 + (sext i32 (-1 + %limit) to i64) + (-1 * (sext i32 %innercount to i64))<nsw>)
  Note that before is an i32 type, and the after is an i64. Truncating the i64 produces the i32.
  llvm-svn: 362975
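The claim that truncating the i64 count yields the i32 count can be checked numerically. The sketch below is a Python model rather than LLVM code: the helper names (`sext32`, `be_before`, `be_after`) are made up for illustration, and it ignores the nsw flag by treating all arithmetic as wrapping.

```python
MASK32 = (1 << 32) - 1
MASK64 = (1 << 64) - 1


def sext32(value):
    """Interpret the low 32 bits of value as a signed i32."""
    value &= MASK32
    return value - (1 << 32) if value & (1 << 31) else value


def be_before(innercount, limit):
    """(-2 + (-1 * %innercount) + %limit), computed in i32."""
    return (-2 + (-1 * innercount) + limit) & MASK32


def be_after(innercount, limit):
    """(-1 + sext(-1 + %limit) + (-1 * sext(%innercount))), computed in i64."""
    limit_minus_one = sext32((-1 + limit) & MASK32)
    return (-1 + limit_minus_one + (-1 * sext32(innercount))) & MASK64


def truncated_counts_match(innercount, limit):
    """True if truncating the i64 count to 32 bits gives the i32 count."""
    return (be_after(innercount, limit) & MASK32) == be_before(innercount, limit)
```

The match holds for any inputs because sign extension preserves a value modulo 2^32, so both expressions reduce to `limit - innercount - 2` in the low 32 bits.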
* Prepare for multi-exit LFTR [NFC] | Philip Reames | 2019-06-10 | 1 | -65/+77
  This change does the plumbing to wire an ExitingBB parameter through the LFTR implementation, and reorganizes the code to work in terms of a set of individual loop exits. Most of it is fairly obvious, but there's one key complexity which makes it worthy of consideration. The actual multi-exit LFTR patch is in D62625 for context.
  Specifically, it turns out the existing code uses the backedge taken count from before an IV is widened. Oddly, we can end up with a different (more expensive, but semantically equivalent) BE count for the loop when requerying after widening. For the nestedIV example from elim-extend, we end up with the following BE counts:
    BEFORE: (-2 + (-1 * %innercount) + %limit)
    AFTER: (-1 + (sext i32 (-1 + %limit) to i64) + (-1 * (sext i32 %innercount to i64))<nsw>)
  This is the only test in tree which seems sensitive to this difference. The actual result of using the wider BETC on this example is that we actually produce slightly better code. :)
  In review, we decided to accept that test change. This patch is structured to preserve the old behavior, but a separate change will immediately follow with the behavior change. (I wanted it separate for problem attribution purposes.)
  Differential Revision: https://reviews.llvm.org/D62880
  llvm-svn: 362971
* [InstCombine] allow unordered preds when canonicalizing to fabs() | Sanjay Patel | 2019-06-10 | 1 | -2/+2
  We have a known-never-nan value via 'nnan', so an unordered predicate is the same as its ordered sibling.
  Similar to: rL362937
  llvm-svn: 362954
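The equivalence of ordered and unordered predicates on NaN-free inputs follows directly from their definitions. A small Python model of `fcmp olt` and `fcmp ult` (the function names are chosen here for illustration) makes the point:

```python
import math


def fcmp_olt(a, b):
    """Ordered less-than: false if either operand is NaN."""
    return not (math.isnan(a) or math.isnan(b)) and a < b


def fcmp_ult(a, b):
    """Unordered less-than: true if either operand is NaN."""
    return math.isnan(a) or math.isnan(b) or a < b


values = [-1.5, -0.0, 0.0, 2.0, math.inf, -math.inf]

# Without NaN operands the two predicates agree on every pair.
agree_without_nan = all(
    fcmp_olt(a, b) == fcmp_ult(a, b) for a in values for b in values
)
```

The predicates only diverge once a NaN operand appears, which is the case 'nnan' rules out.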
* [InstCombine] fix bug in canonicalization to fabs() | Sanjay Patel | 2019-06-10 | 1 | -2/+4
  Forgot to translate the predicate clauses in rL362943.
  llvm-svn: 362945
* [InstCombine] change canonicalization to fabs() to use FMF on fsub | Sanjay Patel | 2019-06-10 | 1 | -19/+21
  Similar to rL362909:
  This isn't the ideal fix (use FMF on the select), but it's still an improvement until we have better FMF propagation to selects and other FP math operators.
  I don't think there's much risk of regression from this change by not including the FMF on the fcmp any more. The nsz/nnan FMF should be the same on the fcmp and the fsub because they have the same operand.
  llvm-svn: 362943
* [InstCombine] allow unordered preds when canonicalizing to fabs() | Sanjay Patel | 2019-06-10 | 1 | -2/+4
  PR42179: https://bugs.llvm.org/show_bug.cgi?id=42179
  llvm-svn: 362937
* Do not derive no-recurse attribute if function does not have exact definition. | Vivek Pandya | 2019-06-10 | 1 | -1/+1
  This is a fix for https://bugs.llvm.org/show_bug.cgi?id=41336
  Reviewers: jdoerfert
  Reviewed by: jdoerfert
  Differential Revision: https://reviews.llvm.org/D63045
  llvm-svn: 362918
* [InstCombine] foldICmpWithLowBitMaskedVal(): 'icmp sgt/sle': avoid miscompiles | Roman Lebedev | 2019-06-09 | 1 | -0/+8
  A precondition 'x != 0' was forgotten by me:
  https://rise4fun.com/Alive/JFNP
  https://rise4fun.com/Alive/jHvL
  These 4 folds with non-constants could be re-enabled, but for now let's go for the simplest solution.
  https://bugs.llvm.org/show_bug.cgi?id=42198
  llvm-svn: 362911
* [InstCombine] change canonicalization to fabs() to use FMF on fneg | Sanjay Patel | 2019-06-09 | 1 | -13/+25
  This isn't the ideal fix (use FMF on the select), but it's still an improvement until we have better FMF propagation to selects and other FP math operators.
  I don't think there's much risk of regression from this change by not including the FMF on the fcmp any more. The nsz/nnan FMF should be the same on the fcmp and the fneg (fsub) because they have the same operand.
  This works around the most glaring FMF logical inconsistency cited in PR38086: https://bugs.llvm.org/show_bug.cgi?id=38086
  llvm-svn: 362909
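The role of the nsz flag in this canonicalization shows up with a negative zero input. The Python sketch below assumes the matched pattern has the shape "select (x < 0.0), -x, x"; the exact patterns InstCombine matches are not spelled out in the commit message, so treat this as an illustration only.

```python
import math


def select_form(x):
    """Assumed source pattern: (x < 0.0) ? -x : x."""
    return -x if x < 0.0 else x


def sign_bit(value):
    """True if the IEEE-754 sign bit is set."""
    return math.copysign(1.0, value) < 0


# -0.0 < 0.0 is false, so the select returns -0.0 unchanged,
# while fabs(-0.0) is +0.0.
from_select = select_form(-0.0)
from_fabs = math.fabs(-0.0)
```

The two results compare equal and differ only in the sign of zero, which is the difference nsz permits the optimizer to ignore.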
* [GVN] non-functional code movement | Keno Fischer | 2019-06-07 | 2 | -16/+16
  Summary: Move some code around, in preparation for later fixes to the non-integral addrspace handling (D59661)
  Patch By Jameson Nash <jameson@juliacomputing.com>
  Reviewed By: reames, loladiro
  Differential Revision: https://reviews.llvm.org/D59729
  llvm-svn: 362853
* [DomTreeUpdater] Add all insert before all delete updates to reduce compile time. | Alina Sbirlea | 2019-06-07 | 1 | -4/+10
  Summary:
  The cleanup in D62751 introduced a compile-time regression due to the way DT updates are performed. Add all insert edges then all delete edges in DTU to match the previous compile time.
  Compile time on the test provided by @mstorsjo before and after this patch on my machine: 113.046s vs 35.649s
  Repro: clang -target x86_64-w64-mingw32 -c -O3 glew-preproc.c; on https://martin.st/temp/glew-preproc.c.
  Reviewers: kuhar, NutshellySima, mstorsjo
  Subscribers: jlebar, mstorsjo, dmgreen, llvm-commits
  Tags: #llvm
  Differential Revision: https://reviews.llvm.org/D62981
  llvm-svn: 362839
* test-commit | Stefan Stipanovic | 2019-06-07 | 1 | -1/+0
  llvm-svn: 362802
* [LV] Fix -Wunused-function after r362736 | Fangrui Song | 2019-06-07 | 1 | -0/+2
  llvm-svn: 362762
* [LV] Wrap LV illegality reporting in a function. NFC. | Renato Golin | 2019-06-06 | 1 | -100/+120
  A function for loop vectorization illegality reporting has been introduced:
    void LoopVectorizationLegality::reportVectorizationFailure(
        const StringRef DebugMsg, const StringRef OREMsg,
        const StringRef ORETag, Instruction * const I) const;
  The function prints a debug message when the debug for the compilation unit is enabled as well as invokes the optimization report emitter to generate a message with a specified tag. The function doesn't cover any complicated logic when a custom lambda should be passed to the emitter, only generating a message with a tag is supported.
  The function always prints the instruction `I` after the debug message whenever the instruction is specified, otherwise the debug message ends with a dot: 'LV: Not vectorizing: Disabled/already vectorized.'
  Patch by Pavel Samolysov <samolisov@gmail.com>
  llvm-svn: 362736
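The message-formatting rule described above (append the instruction if one is given, otherwise end with a dot) can be sketched as follows. This is a Python model, not the LLVM implementation: the function name, the assumption that the "LV: Not vectorizing: " prefix is added by the helper, and the single space before the instruction are all guesses made for illustration.

```python
def format_debug_message(debug_msg, instruction=None):
    """Build the debug line; instruction is a printed form of `I`, if any."""
    text = "LV: Not vectorizing: " + debug_msg
    if instruction is not None:
        # Assumed layout: instruction follows the message after one space.
        text += " " + instruction
    else:
        # No instruction given: the message ends with a dot.
        text += "."
    return text


without_instruction = format_debug_message("Disabled/already vectorized")
```

With no instruction, the sketch reproduces the example string quoted in the commit message.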
* [LoopPred] Fix a bug in unconditional latch bailout introduced in r362284 | Philip Reames | 2019-06-06 | 1 | -2/+2
  This is a really silly bug that even a simple test w/an unconditional latch would have caught. I tried to guard against the case, but put it in the wrong if check. Oops.
  llvm-svn: 362727
* [MSAN] Add unary FNeg visitor to the MemorySanitizer | Cameron McInally | 2019-06-05 | 1 | -0/+2
  Differential Revision: https://reviews.llvm.org/D62909
  llvm-svn: 362664
* [CallSite removal] Refactoring llvm::InlineFunction APIs | Mircea Trofin | 2019-06-05 | 1 | -8/+2
  Summary:
  This change only unifies the previous API pair accepting CallInst and InvokeInst, thus making it easier to refactor inliner pass code to CallBase. The implementation of the unified API still relies on the CallSite implementation.
  Reviewers: eraman, chandlerc, jdoerfert
  Reviewed By: jdoerfert
  Subscribers: jdoerfert, hiraditya, llvm-commits
  Tags: #llvm
  Differential Revision: https://reviews.llvm.org/D62283
  llvm-svn: 362656
* [InstCombine] simplify code for bitcast of insertelement; NFC | Sanjay Patel | 2019-06-05 | 1 | -5/+4
  llvm-svn: 362655
* NewGVN: Handle addrspacecast | Matt Arsenault | 2019-06-05 | 1 | -2/+3
  The AllConstant check needs to be moved out of the if/else if chain to avoid a test regression. The "there is no SimplifyZExt" comment puzzles me, since there is SimplifyCastInst. Additionally, the Simplify* calls seem to not see the operand as constant, so this needs to be tried if the simplify failed.
  llvm-svn: 362653
* InstCombine: correctly change byval type attribute alongside call args. | Tim Northover | 2019-06-05 | 1 | -4/+20
  When the byval attribute has a type, it must match the pointee type of any parameter; but InstCombine was not updating the attribute when folding casts of various kinds away.
  llvm-svn: 362643
* [SLP] Fix regression in broadcasts caused by operand reordering patch D59973. | Dinar Temirbulatov | 2019-06-05 | 1 | -5/+35
  This patch fixes a regression caused by the operand reordering refactoring patch https://reviews.llvm.org/D59973 . The fix changes the strategy to Splat instead of Opcode, if broadcast opportunities are found. Please see the lit test for some examples.
  Committed on behalf of @vporpo (Vasileios Porpodas)
  Differential Revision: https://reviews.llvm.org/D62427
  llvm-svn: 362613
* [LoopUtils][SLPVectorizer] clean up management of fast-math-flags | Sanjay Patel | 2019-06-05 | 2 | -34/+30
  Instead of passing around fast-math-flags as a parameter, we can set those using an IRBuilder guard object. This is no-functional-change-intended.
  The motivation is to eventually fix the vectorizers to use and set the correct fast-math-flags for reductions. Examples of that not behaving as expected are:
  https://bugs.llvm.org/show_bug.cgi?id=23116 (should be able to reduce with less than 'fast')
  https://bugs.llvm.org/show_bug.cgi?id=35538 (possible miscompile for -0.0)
  D61802 (should be able to reduce with IR-level FMF)
  Differential Revision: https://reviews.llvm.org/D62272
  llvm-svn: 362612
* [IPO] Disabled 'default only' switch statements to fix MSVC warnings. | Simon Pilgrim | 2019-06-05 | 1 | -8/+8
  @jdoerfert Looks like these are placeholders for incoming abstract attributes patches so I've just commented the code out, even though this is usually frowned upon.
  llvm-svn: 362592
* Resubmit "[CorrelatedValuePropagation] Fix prof branch_weights metadata handling for SwitchInst" | Yevgeny Rouban | 2019-06-05 | 1 | -56/+61
  This reverts commit 5b32f60ec31ce136edac6f693538aeb6039f4ad0. The fix is in commit 4f9e68148bd0dada2d6997625432385918ac2e2c.
  This patch fixes the CorrelatedValuePropagation pass to keep prof branch_weights metadata of SwitchInst consistent. It makes use of SwitchInstProfUpdateWrapper. New tests are added.
  Reviewed By: nikic
  Differential Revision: https://reviews.llvm.org/D62126
  llvm-svn: 362583
* Suppress false-positive GCC -Wreturn-type warning. | Michael Liao | 2019-06-05 | 1 | -0/+1
  llvm-svn: 362582
* [Attributor] Pass infrastructure and fixpoint framework | Johannes Doerfert | 2019-06-05 | 4 | -1/+541
  NOTE: No attributes are derived yet. This patch will not go in alone but only with others that derive attributes. The framework is split for review purposes.
  This commit introduces the Attributor pass infrastructure and fixpoint iteration framework. Further patches will introduce abstract attributes into this framework.
  In a nutshell, the Attributor will update instances of abstract arguments until a fixpoint, or a "timeout", is reached. Communication between the Attributor and the abstract attributes that are derived is restricted to the AbstractState and AbstractAttribute interfaces.
  Please see the file comment in Attributor.h for detailed information including design decisions and typical use case. Also consider the class documentation for Attributor, AbstractState, and AbstractAttribute.
  Reviewers: chandlerc, homerdin, hfinkel, fedor.sergeev, sanjoy, spatel, nlopes, nicholas, reames
  Subscribers: mehdi_amini, mgorny, hiraditya, bollu, steven_wu, dexonsmith, dang, llvm-commits
  Tags: #llvm
  Differential Revision: https://reviews.llvm.org/D59918
  llvm-svn: 362578
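The "update until a fixpoint or a timeout" loop described above follows a common pattern, sketched here in Python. This is a generic illustration and not the Attributor's actual code: `ToyAttribute`, `run_to_fixpoint` and the `max_iterations` parameter are hypothetical names, and the real interfaces are the ones documented in Attributor.h.

```python
class ToyAttribute:
    """Stand-in for an abstract attribute whose state changes a few times."""

    def __init__(self, limit):
        self.value = 0
        self.limit = limit

    def update(self):
        """Advance the state by one step; return True if it changed."""
        if self.value < self.limit:
            self.value += 1
            return True
        return False


def run_to_fixpoint(attributes, max_iterations=32):
    """Update all attributes until nothing changes or the budget runs out.

    Returns (reached_fixpoint, iterations_used).
    """
    for iteration in range(1, max_iterations + 1):
        changed = False
        for attribute in attributes:
            changed = attribute.update() or changed
        if not changed:
            return True, iteration
    return False, max_iterations
```

In this sketch the iteration budget plays the role of the "timeout": when it is exhausted, the caller knows the states may not be final and can fall back to a conservative answer.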
* [Scalarizer] Add UnaryOperator visitor to scalarization pass | Cameron McInally | 2019-06-04 | 1 | -0/+38
  Differential Revision: https://reviews.llvm.org/D62858
  llvm-svn: 362558
* [Utils] Clean another duplicated util method. | Alina Sbirlea | 2019-06-04 | 3 | -62/+13
  Summary:
  Following the cleanup in D48202, method foldBlockIntoPredecessor has the same behavior. Replace its uses with MergeBlockIntoPredecessor. Remove foldBlockIntoPredecessor.
  Reviewers: chandlerc, dmgreen
  Subscribers: jlebar, javed.absar, zzheng, llvm-commits
  Tags: #llvm
  Differential Revision: https://reviews.llvm.org/D62751
  llvm-svn: 362538
* [SCCP] Add UnaryOperator visitor to SCCP for unary FNeg | Cameron McInally | 2019-06-03 | 1 | -0/+26
  Differential Revision: https://reviews.llvm.org/D62819
  llvm-svn: 362449
* Fix a crash when the default of a switch is removed | Andrew Kaylor | 2019-06-03 | 1 | -0/+5
  This patch fixes a problem that occurs in LowerSwitch when a switch statement has a PHI node as its condition, and the PHI node only has two incoming blocks, and one of those incoming blocks is through an unreachable default in the switch statement. When this condition occurs, LowerSwitch holds a pointer to the condition value, but removes the switch block as a predecessor of the PHI block, causing the PHI node to be replaced. LowerSwitch then tries to use its stale pointer to the original condition value, causing a crash.
  Differential Revision: https://reviews.llvm.org/D62560
  llvm-svn: 362427
* [LoopPred] Convert a second member function to a static helper [NFC] | Philip Reames | 2019-06-03 | 1 | -14/+15
  (And remember to actually mark the first one static.)
  llvm-svn: 362415
* [LoopPred] Convert member function to free helper function [NFC] | Philip Reames | 2019-06-03 | 1 | -43/+47
  llvm-svn: 362411
* [SimplifyIndVar] Refactor overflow check elimination code; NFC | Nikita Popov | 2019-06-01 | 1 | -97/+43
  Extract a willNotOverflow() helper function that is shared between eliminateOverflowIntrinsic() and strengthenOverflowingOperation(). Use WithOverflowInst for the former. We'll be able to reuse the same code for saturating intrinsics as well.
  llvm-svn: 362305
* [IndVarSimplify] Fixup nowrap flags during LFTR (PR31181) | Nikita Popov | 2019-06-01 | 1 | -0/+21
  Fix for https://bugs.llvm.org/show_bug.cgi?id=31181 and partial fix for LFTR poison handling issues in general.
  When LFTR moves a condition from pre-inc to post-inc, it may now depend on a value that is poison due to nowrap flags. To avoid this, we clear any nowrap flag that SCEV cannot prove for the post-inc addrec.
  Additionally, LFTR may switch to a different IV that is dynamically dead and as such may be arbitrarily poison. This patch will correct nowrap flags in some but not all cases where this happens. This is related to the adoption of IR nowrap flags for the pre-inc addrec. (See some of the switch_to_different_iv tests, where flags are not dropped or insufficiently dropped.)
  Finally, there are likely similar issues with the handling of GEP inbounds, but we don't have a test case for this yet.
  Differential Revision: https://reviews.llvm.org/D60935
  llvm-svn: 362292
* Inline variable into assert to fix unused variable warning. | Richard Trieu | 2019-06-01 | 1 | -3/+3
  llvm-svn: 362285
* [LoopPred] Eliminate a redundant/confusing cover function [NFC] | Philip Reames | 2019-06-01 | 1 | -19/+20
  llvm-svn: 362284
* [LoopPred] Handle a subset of NE comparison based latches | Philip Reames | 2019-06-01 | 1 | -21/+33
  At the moment, LoopPredication completely bails out if it sees a latch of the form:
    %cmp = icmp ne %iv, %N
    br i1 %cmp, label %loop, label %exit
  OR
    %cmp = icmp ne %iv.next, %NPlus1
    br i1 %cmp, label %loop, label %exit
  This is unfortunate since this is exactly the form that LFTR likes to produce. So, go ahead and recognize simple cases where we can.
  For pre-increment loops, we leverage the fact that LFTR likes canonical counters (i.e. those starting at zero) and a (presumed) range fact on RHS to discharge the check trivially.
  For post-increment forms, the key insight is in remembering that LFTR had to insert a (N+1) for the RHS. CVP can hopefully prove that add nsw/nuw (if there's appropriate range on N to start with). This leaves us both with the post-inc IV and the RHS involving an nsw/nuw add, and SCEV can discharge that with no problem.
  This does still need to be extended to handle non-one steps, or other harder patterns of variable (but range restricted) starting values. That'll come later.
  Differential Revision: https://reviews.llvm.org/D62748
  llvm-svn: 362282
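The pre-increment case rests on a simple observation: for a canonical counter that starts at zero and steps by one, an `ne` comparison against a non-negative bound behaves like a less-than comparison on every iteration up to the exit. The Python sketch below only illustrates that observation; it is not how LoopPredication or SCEV establish it, and the function names are invented.

```python
def latch_ne(iv, bound):
    """Latch condition in the form LFTR produces: icmp ne %iv, %N."""
    return iv != bound


def latch_ult(iv, bound):
    """Range-style condition: icmp ult %iv, %N (operands non-negative here)."""
    return iv < bound


def agree_until_exit(bound):
    """Check both conditions on every value a 0-based, step-1 counter takes."""
    return all(
        latch_ne(iv, bound) == latch_ult(iv, bound) for iv in range(0, bound + 1)
    )
```

The agreement breaks down if the counter can start beyond the bound, which is why a range fact on the right-hand side is needed.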
* [SimplifyLibCalls] Fold more fortified functions into non-fortified variants | Erik Pilkington | 2019-05-31 | 2 | -15/+203
  When the object size argument is -1, no checking can be done, so calling the _chk variant is unnecessary. We already did this for a bunch of these functions.
  rdar://50797197
  Differential revision: https://reviews.llvm.org/D62358
  llvm-svn: 362272
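Why an object size of -1 makes the check pointless: as an unsigned size it is the largest representable value, so no length can exceed it. The following Python model assumes a 64-bit `size_t` and a check of the form "trap if length > object size"; both the function names and that exact check are illustrative assumptions rather than details from the commit.

```python
SIZE_MAX = (1 << 64) - 1  # assumes a 64-bit size_t; -1 converts to this value


def chk_would_trap(length, object_size):
    """Model of a fortified check: trap when the access exceeds the object."""
    return length > object_size


def can_use_plain_variant(object_size):
    """With an unknown object size (-1), the check can never fire."""
    return object_size == SIZE_MAX


lengths = [0, 1, 4096, SIZE_MAX - 1, SIZE_MAX]
never_traps = not any(chk_would_trap(n, SIZE_MAX) for n in lengths)
```

With a known, smaller object size the check can still fire, so the fortified call has to stay in that case.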
* NFC: Pull out a function to reduce some duplication | Erik Pilkington | 2019-05-31 | 2 | -119/+70
  Part of https://reviews.llvm.org/D62358
  llvm-svn: 362271
* Reapply [CVP] Simplify non-overflowing saturating add/sub | Nikita Popov | 2019-05-31 | 1 | -1/+24
  If we can determine that a saturating add/sub will not overflow based on range analysis, convert it into a simple binary operation. This is a sibling transform to the existing with.overflow handling.
  Reapplying this with an additional check that the saturating intrinsic has integer type, as LVI currently does not support vector types.
  Differential Revision: https://reviews.llvm.org/D62703
  llvm-svn: 362263
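The idea behind the transform can be modelled with closed integer ranges. This Python sketch assumes signed 32-bit operands and represents a range as a `(low, high)` pair; it stands in for the range analysis only loosely, and the helper names are made up for illustration.

```python
INT32_MIN = -(1 << 31)
INT32_MAX = (1 << 31) - 1


def sadd_sat(a, b):
    """Signed saturating add: clamp the exact sum to the i32 range."""
    return max(INT32_MIN, min(INT32_MAX, a + b))


def add_cannot_overflow(a_range, b_range):
    """True if every sum of values drawn from the two ranges fits in i32."""
    lowest = a_range[0] + b_range[0]
    highest = a_range[1] + b_range[1]
    return INT32_MIN <= lowest and highest <= INT32_MAX


def simplified_add(a, b, a_range, b_range):
    """Use a plain add when the ranges prove saturation can never happen."""
    if add_cannot_overflow(a_range, b_range):
        return a + b
    return sadd_sat(a, b)
```

When the ranges rule out overflow, the plain add and the saturating add agree on every input in those ranges, so the replacement is safe.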
* [CVP] Fix assertion failure on vector with.overflow | Nikita Popov | 2019-05-31 | 1 | -1/+1
  Noticed on D62703. LVI only handles plain integers, not vectors of integers. This was previously not an issue, because vector support for with.overflow is only a relatively recent addition.
  llvm-svn: 362261
* Revert "[CVP] Simplify non-overflowing saturating add/sub" | Nikita Popov | 2019-05-31 | 1 | -24/+1
  This reverts commit 1e692d1777ae34dcb93524b5798651a29defae09. Causes assertion failure in builtins-wasm.c clang test.
  llvm-svn: 362254
* [CVP] Simplify non-overflowing saturating add/sub | Nikita Popov | 2019-05-31 | 1 | -1/+24
  If we can determine that a saturating add/sub will not overflow based on range analysis, convert it into a simple binary operation. This is a sibling transform to the existing with.overflow handling.
  Differential Revision: https://reviews.llvm.org/D62703
  llvm-svn: 362242
* [InstCombine] 'C-(C2-X) --> X+(C-C2)' constant-fold | Roman Lebedev | 2019-05-31 | 1 | -1/+6
  It looks like this fold was already partially happening, indirectly via some other folds, but with a one-use limitation. No other fold here has that restriction.
  https://rise4fun.com/Alive/ftR
  llvm-svn: 362217
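The identity behind this fold holds in wrapping integer arithmetic, which a quick Python model can confirm. The sketch assumes 32-bit integers and uses invented function names; it checks the algebra only and says nothing about the one-use conditions in InstCombine.

```python
MASK32 = (1 << 32) - 1


def original(c, c2, x):
    """C - (C2 - X), with each step wrapped to 32 bits."""
    return (c - ((c2 - x) & MASK32)) & MASK32


def folded(c, c2, x):
    """X + (C - C2); the constant (C - C2) would be folded at compile time."""
    constant = (c - c2) & MASK32
    return (x + constant) & MASK32


cases = [(0, 0, 0), (10, 3, 5), (3, 10, 5), (MASK32, 1, MASK32), (0, MASK32, 7)]
all_equal = all(original(c, c2, x) == folded(c, c2, x) for c, c2, x in cases)
```

The folded form replaces two subtractions with a single add against a precomputed constant.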