path: root/llvm/lib/Analysis
Commit message | Author | Age | Files | Lines
...
* [LoopAccesses] Split out LoopAccessReport from VectorizerReport | Adam Nemet | 2015-02-18 | 1 | -18/+18
| | | | | | | | | | | | The only difference between these two is that VectorizerReport adds a vectorizer-specific prefix to its messages. When LAA is used in the vectorizer context the prefix is added when we promote the LoopAccessReport into a VectorizerReport via one of the constructors. This is part of the patchset that converts LoopAccessAnalysis into an actual analysis pass. llvm-svn: 229632
* [LoopAccesses] Add missing const to APIs in VectorizationReport | Adam Nemet | 2015-02-18 | 1 | -2/+2
| | | | | | | | | | When I split out LoopAccessReport from this, I need to create some temps so constness becomes necessary. This is part of the patchset that converts LoopAccessAnalysis into an actual analysis pass. llvm-svn: 229631
* [LoopAccesses] Add canAnalyzeLoop | Adam Nemet | 2015-02-18 | 1 | -1/+51
| | | | | | | | | | | This allows the analysis to be attempted with any loop. This feature will be used with -analysis. (LV only requests the analysis on loops that have already satisfied these tests.) This is part of the patchset that converts LoopAccessAnalysis into an actual analysis pass. llvm-svn: 229630
* [LoopAccesses] Factor out RuntimePointerCheck::needsChecking | Adam Nemet | 2015-02-18 | 1 | -9/+18
| | | | | | | | | Will be used by the new RuntimePointerCheck::print. This is part of the patchset that converts LoopAccessAnalysis into an actual analysis pass. llvm-svn: 229629
* [LoopAccesses] Change debug messages from LV to LAA | Adam Nemet | 2015-02-18 | 1 | -38/+39
| | | | | | | | | Also add pass name as an argument to VectorizationReport::emitAnalysis. This is part of the patchset that converts LoopAccessAnalysis into an actual analysis pass. llvm-svn: 229628
* [LoopAccesses] Create the analysis pass | Adam Nemet | 2015-02-18 | 1 | -0/+61
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is a function pass that runs the analysis on demand. The analysis can be initiated by querying the loop access info via LAA::getInfo. It either returns the cached info or runs the analysis. Symbolic stride information continues to reside outside of this analysis pass. We may move it inside later but it's not a priority for me right now. The idea is that Loop Distribution won't support run-time stride checking at least initially. This means that when querying the analysis, symbolic stride information can be provided optionally. Whether stride information is used can invalidate the cache entry and rerun the analysis. Note that if the loop does not have any symbolic stride, the entry should be preserved across Loop Distribution and LV. Since currently the only user of the pass is LV, I just check that the symbolic stride information didn't change when using a cached result. On the LV side, LoopVectorizationLegality requests the info object corresponding to the loop from the analysis pass. A large chunk of the diff is due to LAI becoming a pointer from a reference. A test will be added as part of the -analyze patch. Also tested that with AVX, we generate identical assembly output for the testsuite (including the external testsuite) before and after. This is part of the patchset that converts LoopAccessAnalysis into an actual analysis pass. llvm-svn: 229626
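The on-demand behaviour described in this message can be sketched as a small cache. This is a hypothetical, simplified model in standard C++, not the LLVM classes: `LoopId`, `StrideMap`, `Info`, and `AnalysisCache` are names invented for illustration, and the "analysis" only counts how often it ran. The sketch reruns the analysis when the stride information differs, following the message's remark that stride information can invalidate the entry; the commit itself is described as only checking that the strides did not change.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-ins; none of these are LLVM types.
using LoopId = int;
using StrideMap = std::map<std::string, int>; // symbolic stride info

struct Info {
  StrideMap StridesUsed; // the stride info this result was computed with
  int RunNumber;         // which analysis run produced this result
};

class AnalysisCache {
  std::map<LoopId, std::unique_ptr<Info>> Cache;
  int Runs = 0;

public:
  // Return the cached result for L, or run the (fake) analysis on demand.
  // A cached entry is reused only if it was computed with the same strides.
  const Info &getInfo(LoopId L, const StrideMap &Strides) {
    auto It = Cache.find(L);
    if (It != Cache.end() && It->second->StridesUsed == Strides)
      return *It->second;
    auto Fresh = std::make_unique<Info>(Info{Strides, ++Runs});
    return *(Cache[L] = std::move(Fresh));
  }

  // Number of times the (fake) analysis has actually run.
  int runs() const { return Runs; }
};
```

Querying the same loop twice with the same (possibly empty) stride map returns the same object without a second run; passing a different stride map replaces the entry.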
* [LoopAccesses] Make blockNeedsPredication static | Adam Nemet | 2015-02-18 | 1 | -3/+4
| | | | | | | | | | | | | | blockNeedsPredication is in LoopAccess in order to share it with the vectorizer. It's a utility needed by LoopAccess not strictly provided by it but it's a good place to share it. This makes the function static so that it is no longer required to create a LoopAccessInfo instance in order to access it from LV. This was actually causing problems because it would have required creating LAI much earlier than LV::canVectorizeMemory(). This is part of the patchset that converts LoopAccessAnalysis into an actual analysis pass. llvm-svn: 229625
* [LoopAccesses] Cache the result of canVectorizeMemory | Adam Nemet | 2015-02-18 | 1 | -13/+20
| | | | | | | | | | | | LAA will be an on-demand analysis pass, so we need to cache the result of the analysis. canVectorizeMemory is renamed to analyzeLoop which computes the result. canVectorizeMemory becomes the query function for the cached result. This is part of the patchset that converts LoopAccessAnalysis into an actual analysis pass. llvm-svn: 229624
* [LoopAccesses] Stash the report from the analysis rather than emitting it | Adam Nemet | 2015-02-18 | 1 | -1/+2
| | | | | | | | | | | The transformation passes will query this and then emit it as part of their own report. The only current user, LV, is modified to do just that. This is part of the patchset that converts LoopAccessAnalysis into an actual analysis pass. llvm-svn: 229623
* [LoopAccesses] Make VectorizerParams global | Adam Nemet | 2015-02-18 | 1 | -16/+16
| | | | | | | | | | | | | | As LAA is becoming a pass, we can no longer pass the params to its constructor. This changes the command line flags to have external storage. These can now be accessed both from LV and LAA. VectorizerParams is moved out of LoopAccessInfo in order to shorten the code to access it. This is part of the patchset that converts LoopAccessAnalysis into an actual analysis pass. llvm-svn: 229622
* [LoopAccesses] Rename LoopAccessAnalysis to LoopAccessInfo | Adam Nemet | 2015-02-18 | 1 | -15/+14
| | | | | | | | | LoopAccessAnalysis will be used as the name of the pass. This is part of the patchset that converts LoopAccessAnalysis into an actual analysis pass. llvm-svn: 229621
* Generalize getExtendAddRecStart to work with both sign and zero | Sanjoy Das | 2015-02-18 | 1 | -143/+218
| | | | | | | | | | | extensions. This change also removes `DEBUG(dbgs() << "SCEV: untested prestart overflow check\n");` because that case has a unit test now. Differential Revision: http://reviews.llvm.org/D7645 llvm-svn: 229600
* Bugfix: SCEV incorrectly marks certain expressions as nsw | Sanjoy Das | 2015-02-18 | 1 | -7/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | I could not come up with a test case for this one; but I don't think `getPreStartForSignExtend` can assume `AR` is `nsw` -- there is one place in scalar evolution that calls `getSignExtendAddRecStart(AR, ...)` without proving that `AR` is `nsw` (line 1564) OperandExtendedAdd = getAddExpr(WideStart, getMulExpr(WideMaxBECount, getZeroExtendExpr(Step, WideTy))); if (SAdd == OperandExtendedAdd) { // If AR wraps around then // // abs(Step) * MaxBECount > unsigned-max(AR->getType()) // => SAdd != OperandExtendedAdd // // Thus (AR is not NW => SAdd != OperandExtendedAdd) <=> // (SAdd == OperandExtendedAdd => AR is NW) const_cast<SCEVAddRecExpr *>(AR)->setNoWrapFlags(SCEV::FlagNW); // Return the expression with the addrec on the outside. return getAddRecExpr(getSignExtendAddRecStart(AR, Ty, this), getZeroExtendExpr(Step, Ty), L, AR->getNoWrapFlags()); } Differential Revision: http://reviews.llvm.org/D7640 llvm-svn: 229594
* Prefer SmallVector::append/insert over push_back loops. | Benjamin Kramer | 2015-02-17 | 1 | -2/+1
| | | | | | Same functionality, but hoists the vector growth out of the loop. llvm-svn: 229500
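The difference between the two styles can be sketched as follows. The example uses `std::vector` as a stand-in for `SmallVector` (an assumption made so the snippet needs only the standard library), and the function names are hypothetical.

```cpp
#include <cassert>
#include <vector>

// Element-by-element copy: the vector may have to grow several times
// inside the loop.
std::vector<int> copyWithLoop(const std::vector<int> &Src) {
  std::vector<int> Dst;
  for (int V : Src)
    Dst.push_back(V);
  return Dst;
}

// Range insertion: the vector sees the whole range at once, so it can
// make room for all elements before copying any of them.
std::vector<int> copyWithRange(const std::vector<int> &Src) {
  std::vector<int> Dst;
  Dst.insert(Dst.end(), Src.begin(), Src.end());
  return Dst;
}
```

Both functions produce the same contents; only the number of potential reallocations differs.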
* Revert 229175 | Philip Reames | 2015-02-15 | 1 | -3/+1
| | | | | | This change is a logical suspect in 22587 and 22590. Given it's of minimal importance and I can't get clang to build on my home machine, I'm reverting so that I can deal with this next week. llvm-svn: 229322
* Unify the two EH personality classification routines I wrote | Reid Kleckner | 2015-02-14 | 1 | -2/+2
| | | | | | We only need one. llvm-svn: 229193
* Analysis: Canonicalize access to function attributes, NFC | Duncan P. N. Exon Smith | 2015-02-14 | 2 | -10/+6
| | | | | | | | | | | | Canonicalize access to function attributes to use the simpler API. getAttributes().getAttribute(AttributeSet::FunctionIndex, Kind) => getFnAttribute(Kind) getAttributes().hasAttribute(AttributeSet::FunctionIndex, Kind) => hasFnAttribute(Kind) llvm-svn: 229192
* Minor tweak to MDA | Philip Reames | 2015-02-13 | 1 | -1/+3
| | | | | | | | Two minor tweaks I noticed when reading through the code: - No need to recompute begin() on every iteration. We're not modifying the instructions in this loop. - We can ignore PHINodes and Dbg intrinsics. The current code does this anyways, but it will spend slightly more time doing so and will count towards the limit of instructions in the block. It seems really silly to give up due to the presence of PHIs... Differential Revision: http://reviews.llvm.org/D7624 llvm-svn: 229175
* [PM] Remove the old 'PassManager.h' header file at the top level of | Chandler Carruth | 2015-02-13 | 1 | -3/+3
| | | | | | | | | | | | | | | | | | | | LLVM's include tree and the use of using declarations to hide the 'legacy' namespace for the old pass manager. This undoes the primary modules-hostile change I made to keep out-of-tree targets building. I sent an email inquiring about whether this would be reasonable to do at this phase and people seemed fine with it, so making it a reality. This should allow us to start bootstrapping with modules to a certain extent along with making it easier to mix and match headers in general. The updates to any code for users of LLVM are very mechanical. Switch from including "llvm/PassManager.h" to "llvm/IR/LegacyPassManager.h". Qualify the types which now produce compile errors with "legacy::". The most common ones are "PassManager", "PassManagerBase", and "FunctionPassManager". llvm-svn: 229094
* Re-sort #include lines using my handy dandy ./utils/sort_includes.py | Chandler Carruth | 2015-02-13 | 1 | -1/+1
| | | | | | script. This is in preparation for changes to lots of include lines. llvm-svn: 229088
* InstCombine: cleanup redundant dyn_cast<> (NFC) | Mehdi Amini | 2015-02-13 | 1 | -44/+43
| | | | | From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 229075
* Fix a crash in the assumption cache when inlining indirect function calls | Bjorn Steinbrink | 2015-02-12 | 1 | -6/+6
| | | | | | | | | | | | | | | | | Summary: Instances of the AssumptionCache are per function, so we can't re-use the same AssumptionCache instance when recursing in the CallAnalyzer to analyze a different function. Instead we have to pass the AssumptionCacheTracker to the CallAnalyzer so it can get the right AssumptionCache on demand. Reviewers: hfinkel Subscribers: llvm-commits, hans Differential Revision: http://reviews.llvm.org/D7533 llvm-svn: 228957
* Fixed a bug where CFLAA would crash the compiler. | George Burgess IV | 2015-02-12 | 1 | -6/+13
| | | | | | | | We would crash if we couldn't locate a Function that either Location's Value belonged to. Now we just print out a debug message and return conservatively. llvm-svn: 228901
* Use ADDITIONAL_HEADER_DIRS in all LLVM CMake projects. | Zachary Turner | 2015-02-11 | 1 | -0/+3
| | | | | | | | | | This allows IDEs to recognize the entire set of header files for each of the core LLVM projects. Differential Revision: http://reviews.llvm.org/D7526 Reviewed By: Chris Bieneman llvm-svn: 228798
* Don't promote asynch EH invokes of nounwind functions to calls | Reid Kleckner | 2015-02-11 | 1 | -1/+24
| | | | | | | | | | | If the landingpad of the invoke is using a personality function that catches asynch exceptions, then it can catch a trap. Also add some landingpads to invalid LLVM IR test cases that lack them. Over-the-shoulder reviewed by David Majnemer. llvm-svn: 228782
* Adding support for llvm.eh.begincatch and llvm.eh.endcatch intrinsics and | Andrew Kaylor | 2015-02-10 | 1 | -0/+192
| | | | | | | | beginning the documentation of native Windows exception handling. Differential Revision: http://reviews.llvm.org/D7398 llvm-svn: 228733
* MemDerefPrinter: Require DataLayoutPass for higher accuracy | Ramkumar Ramachandra | 2015-02-09 | 1 | -3/+12
| | | | | | | | Without a valid data layout, dereferenceable(N) doesn't get parsed or propagated. Since this is the key item we are testing, add a dependency on the pass. Differential Revision: http://reviews.llvm.org/D7508 llvm-svn: 228611
* MemDepPrinter: cleanup a few loops (NFC) | Ramkumar Ramachandra | 2015-02-09 | 1 | -9/+8
| | | | | | | | | Make use of the newly introduced inst_range to clean up two loops. Clean up a third one while at it. Differential Revision: http://reviews.llvm.org/D7455 llvm-svn: 228596
* Bugfix: SCEV incorrectly marks certain add recurrences as nsw | Sanjoy Das | 2015-02-09 | 1 | -2/+10
| | | | | | | | | | | | | | When creating a scev for sext({X,+,Y}), scev checks if the expression is equivalent to {sext X,+,zext Y}. If it can prove that, it also tags the original {X,+,Y} as <nsw>, which is not correct. In the test case I run `-scalar-evolution` twice because the bug manifests only once SCEV has run through and seen the `sext` expressions (and then does an in-place mutation on {X,+,Y}). Differential Revision: http://reviews.llvm.org/D7495 llvm-svn: 228586
* Allow ScalarEvolution to catch more min/max cases | Johannes Doerfert | 2015-02-09 | 1 | -23/+25
| | | | | | | | | | | | For the attached test case different types are used in the ICmpInst and SelectInst that represent the min/max expressions. However, if the ICmpInst type is smaller a comparison with the sign/zero extended operands would have yielded the same result. This situation might arise after the instruction combination pass was applied. Differential Revision: http://reviews.llvm.org/D7338 llvm-svn: 228572
* Bugfix: ScalarEvolution incorrectly assumes that the start of certain | Sanjoy Das | 2015-02-08 | 1 | -1/+18
| | | | | | | | | | | | | add recurrences don't overflow. This change makes the optimization more restrictive. It still assumes that an overflowing `add nsw` is undefined behavior; and this change will need revisiting once we have a consistent semantics for poison values. Differential Revision: http://reviews.llvm.org/D7331 llvm-svn: 228552
* Correctly combine alias.scope metadata by a union instead of intersecting | Bjorn Steinbrink | 2015-02-08 | 1 | -2/+2
| | | | | | | | | | | | | | | | | Summary: The alias.scope metadata represents sets of things an instruction might alias with. When generically combining the metadata from two instructions the result must be the union of the original sets, because the new instruction might alias with anything any of the original instructions aliased with. Reviewers: hfinkel Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D7490 llvm-svn: 228525
* ValueTracking: Make isBytewiseValue simpler and more powerful at the same time. | Benjamin Kramer | 2015-02-07 | 1 | -19/+9
| | | | | | | Turns out there is a simpler way of checking that all bytes in a word are equal than binary decomposition. llvm-svn: 228503
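One way to test that all bytes of a word are equal without decomposing it step by step is to compare the word with itself rotated by one byte. The following is a minimal sketch on a plain 32-bit integer; the helper names are hypothetical, and this is not claimed to be the exact code the commit introduced.

```cpp
#include <cassert>
#include <cstdint>

// Rotate a 32-bit word left by 8 bits.
inline std::uint32_t rotateLeftByByte(std::uint32_t V) {
  return (V << 8) | (V >> 24);
}

// True if every byte of V holds the same value. Rotating by one byte maps
// byte i onto byte i+1 (wrapping around), so V is unchanged exactly when
// all four bytes are equal.
inline bool allBytesEqual(std::uint32_t V) {
  return V == rotateLeftByByte(V);
}
```

A word such as 0xABABABAB passes the check, while 0x12341234 does not, even though its two halves are equal.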
* [BasicAA] Try to disambiguate GEPs through arrays of structs into | Ahmed Bougacha | 2015-02-07 | 1 | -0/+104
| | | | | | | | | | | | | | | | | | | | | | | | | | | different fields. We can show that two GEPs off of the same (possibly multidimensional) array of structs, into different fields, can't alias. Quoting: For two GEPOperators GEP1 and GEP2, if we find that: - both GEPs begin indexing from the exact same pointer; - the last indices in both GEPs are constants, indexing into a struct; - said indices are different, hence, the pointed-to fields are different; - and both GEPs only index through arrays prior to that; this lets us determine that the struct that GEP1 indexes into and the struct that GEP2 indexes into must either precisely overlap or be completely disjoint. Because they cannot partially overlap, indexing into different non-overlapping fields of the struct will never alias. The other BasicAA::aliasGEP rules worked in some cases, but not all (for example, the i32x3 struct in the testcase). We can add this simple ad-hoc rule to complement them. rdar://19717375 Differential Revision: http://reviews.llvm.org/D7453 llvm-svn: 228498
* SCEV: Compress disposition pairs. | Benjamin Kramer | 2015-02-07 | 1 | -18/+18
| | | | | | | Composing DenseMaps and SmallVectors is still somewhat suboptimal, but this at least halves the size of the vector elements. NFC. llvm-svn: 228497
* [InstSimplify] Add SimplifyFPBinOp function. | Michael Zolotukhin | 2015-02-06 | 2 | -1/+37
| | | | | | | | | | | | | | | | | | | | | | | | | | It is a variation of SimplifyBinOp, but it takes into account FastMathFlags. It is needed in inliner and loop-unroller to accurately predict the transformation's outcome (previously we dropped the flags and were too conservative in some cases). Example: float foo(float *a, float b) { float r; if (a[1] * b) r = /* a lot of expensive computations */; else r = 1; return r; } float boo(float *a) { return foo(a, 0.0); } Without this patch, we don't inline 'foo' into 'boo'. llvm-svn: 228432
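Why the flags matter for an expression like `a[1] * b` with `b == 0.0` can be seen from strict IEEE arithmetic: the product is not always positive zero, so folding it to 0.0 is only sound when NaNs and signed zeros may be ignored. The sketch below illustrates this with plain `double` values; the helper names are hypothetical and the snippet does not use any LLVM API.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Under strict IEEE semantics X * 0.0 is not always +0.0.
inline double timesZero(double X) { return X * 0.0; }

// True if V is exactly positive zero (not -0.0 and not NaN).
inline bool isPositiveZero(double V) {
  return V == 0.0 && !std::signbit(V);
}
```

For a positive finite input the product is +0.0, but a negative input yields -0.0 and a NaN or infinite input yields NaN, which is why a simplifier that ignores the fast-math flags has to stay conservative.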
* [LV] Move addRuntimeCheck to LoopAccessAnalysis | Adam Nemet | 2015-02-06 | 1 | -0/+106
| | | | | | | | | | | | | This will allow it to be shared with the new Loop Distribution pass. getFirstInst is currently duplicated across LoopVectorize.cpp and LoopAccessAnalysis.cpp. This is a short-term work-around until we figure out a better solution. NFC. (The code moved is adjusted a bit for the name of the Loop member and that PtrRtCheck is now a reference rather than a pointer.) llvm-svn: 228418
* Whitespace. | Chad Rosier | 2015-02-06 | 1 | -2/+0
| | | | llvm-svn: 228397
* Introduce print-memderefs to test isDereferenceablePointer | Ramkumar Ramachandra | 2015-02-06 | 3 | -0/+63
| | | | | | | | | | Since testing the function indirectly is tricky, introduce a direct print-memderefs pass, in the same spirit as print-memdeps, which prints dereferenceability information matched by FileCheck. Differential Revision: http://reviews.llvm.org/D7075 llvm-svn: 228369
* Value soft float calls as more expensive in the inliner. | Cameron Esfahani | 2015-02-05 | 2 | -0/+23
| | | | | | | | | | | | | | Summary: When evaluating floating point instructions in the inliner, ask the TTI whether it is an expensive operation. By default, it's not an expensive operation. This keeps the default behavior the same as before. The ARM TTI has been updated to return TCC_Expensive for targets which don't have hardware floating point. Reviewers: chandlerc, echristo Reviewed By: echristo Subscribers: t.p.northover, aemerson, llvm-commits Differential Revision: http://reviews.llvm.org/D6936 llvm-svn: 228263
* ValueTracking: Make isSafeToSpeculativelyExecute a little cleaner | David Majnemer | 2015-02-01 | 1 | -14/+14
| | | | | | No functional change intended. llvm-svn: 227760
* [LoopVectorize] Move LoopAccessAnalysis to its own module | Adam Nemet | 2015-02-01 | 2 | -0/+1085
| | | | | | | | | | | | | | | | | | | | Other than moving code and adding the boilerplate for the new files, the code being moved is unchanged. There are a few global functions that are shared with the rest of the LoopVectorizer. I moved these to the new module as well (emitLoopAnalysis, stripIntegerCast, replaceSymbolicStrideSCEV) along with the Report class used by emitLoopAnalysis. There is probably room for further improvement in this area. I kept DEBUG_TYPE "loop-vectorize" because it's used as the PassName with emitOptimizationRemarkAnalysis. This will obviously have to change. NFC. This is part of the patchset that splits out the memory dependence logic from LoopVectorizationLegality into a new class LoopAccessAnalysis. LoopAccessAnalysis will be used by the new Loop Distribution pass. llvm-svn: 227756
* [multiversion] Kill FunctionTargetTransformInfo, TTI itself is now | Chandler Carruth | 2015-02-01 | 2 | -51/+0
| | | | | | per-function and supports the exact desired interface. llvm-svn: 227743
* [multiversion] Remove the function parameter from the unrolling | Chandler Carruth | 2015-02-01 | 1 | -2/+2
| | | | | | preferences interface on TTI now that all of TTI is per-function. llvm-svn: 227741
* [multiversion] Implement the old pass manager's TTI wrapper pass in | Chandler Carruth | 2015-02-01 | 1 | -5/+10
| | | | | | | | | | | | | | | | | | | | | | | terms of the new pass manager's TargetIRAnalysis. Yep, this is one of the nicer bits of the new pass manager's design. Passes can in many cases operate in a vacuum and so we can just nest things when convenient. This is particularly convenient here as I can now consolidate all of the TargetMachine logic on this analysis. The most important change here is that this pushes the function we need TTI for all the way into the TargetMachine, and re-creates the TTI object for each function rather than re-using it for each function. We're now prepared to teach the targets to produce function-specific TTI objects with specific subtargets cached, etc. One piece of feedback I'd love here is whether it's worth renaming any of this stuff. None of the names really seem that awesome to me at this point, but TargetTransformInfoWrapperPass is particularly ... odd. TargetIRAnalysisWrapper might make more sense. I would want to do that rename separately anyways, but let me know what you think. llvm-svn: 227731
* [multiversion] Thread a function argument through all the callers of the | Chandler Carruth | 2015-02-01 | 3 | -4/+4
| | | | | | | | | | | | | | getTTI method used to get an actual TTI object. No functionality changed. This just threads the argument and ensures code like the inliner can correctly look up the callee's TTI rather than using a fixed one. The next change will use this to implement per-function subtarget usage by TTI. The changes after that should eliminate the need for FTTI as that will have become the default. llvm-svn: 227730
* [PM] Port TTI to the new pass manager, introducing a TargetIRAnalysis to | Chandler Carruth | 2015-02-01 | 1 | -0/+17
| | | | | | | | | | | | | | produce it. This adds a function to the TargetMachine that produces this analysis via a callback for each function. This in turn paves the way to produce a *different* TTI per-function with the correct subtarget cached. I've also done the necessary wiring in the opt tool to thread the target machine down and make it available to the pass registry so that we can construct this analysis from a target machine when available. llvm-svn: 227721
* [PM] Switch the TargetMachine interface from accepting a pass manager | Chandler Carruth | 2015-01-31 | 1 | -13/+17
| | | | | | | | | | | | | | | | | | | | | | | base which it adds a single analysis pass to, to instead return the type erased TargetTransformInfo object constructed for that TargetMachine. This removes all of the pass variants for TTI. There is now a single TTI *pass* in the Analysis layer. All of the Analysis <-> Target communication is through the TTI's type erased interface itself. While the diff is large here, it is nothing more than code motion to make types available in a header file for use in a different source file within each target. I've tried to keep all the doxygen comments and file boilerplate in line with this move, but let me know if I missed anything. With this in place, the next step to making TTI work with the new pass manager is to introduce a really simple new-style analysis that produces a TTI object via a callback into this routine on the target machine. Once we have that, we'll have the building blocks necessary to accept a function argument as well. llvm-svn: 227685
* [PM] Change the core design of the TTI analysis to use a polymorphic | Chandler Carruth | 2015-01-31 | 5 | -527/+117
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | type erased interface and a single analysis pass rather than an extremely complex analysis group. The end result is that the TTI analysis can contain a type erased implementation that supports the polymorphic TTI interface. We can build one from a target-specific implementation or from a dummy one in the IR. I've also factored all of the code into "mix-in"-able base classes, including CRTP base classes to facilitate calling back up to the most specialized form when delegating horizontally across the surface. These aren't as clean as I would like and I'm planning to work on cleaning some of this up, but I wanted to start by putting it into the right form. There are a number of reasons for this change, and this particular design. The first and foremost reason is that an analysis group is complete overkill, and the chaining delegation strategy was so opaque, confusing, and high overhead that TTI was suffering greatly for it. Several of the TTI functions had failed to be implemented in all places because of the chaining-based delegation making there be no checking of this. A few other functions were implemented with incorrect delegation. The message to me was very clear working on this -- the delegation and analysis group structure was too confusing to be useful here. The other reason of course is that this is a *much* more natural fit for the new pass manager. This will lay the ground work for a type-erased per-function info object that can look up the correct subtarget and even cache it. Yet another benefit is that this will significantly simplify the interaction of the pass managers and the TargetMachine. See the future work below. The downside of this change is that it is very, very verbose. I'm going to work to improve that, but it is somewhat an implementation necessity in C++ to do type erasure.
=/ I discussed this design really extensively with Eric and Hal prior to going down this path, and afterward showed them the result. No one was really thrilled with it, but there doesn't seem to be a substantially better alternative. Using a base class and virtual method dispatch would make the code much shorter, but as discussed in the update to the programmer's manual and elsewhere, a polymorphic interface feels like the more principled approach even if this is perhaps the least compelling example of it. ;] Ultimately, there is still a lot more to be done here, but this was the huge chunk that I couldn't really split things out of because this was the interface change to TTI. I've tried to minimize all the other parts of this. The follow up work should include at least: 1) Improving the TargetMachine interface by having it directly return a TTI object. Because we have a non-pass object with value semantics and an internal type erasure mechanism, we can narrow the interface of the TargetMachine to *just* do what we need: build and return a TTI object that we can then insert into the pass pipeline. 2) Make the TTI object be fully specialized for a particular function. This will include splitting off a minimal form of it which is sufficient for the inliner and the old pass manager. 3) Add a new pass manager analysis which produces TTI objects from the target machine for each function. This may actually be done as part of #2 in order to use the new analysis to implement #2. 4) Work on narrowing the API between TTI and the targets so that it is easier to understand and less verbose to type erase. 5) Work on narrowing the API between TTI and its clients so that it is easier to understand and less verbose to forward. 6) Try to improve the CRTP-based delegation. I feel like this code is just a bit messy and exacerbating the complexity of implementing the TTI in each target. Many thanks to Eric and Hal for their help here. 
I ended up blocked on this somewhat more abruptly than I expected, and so I appreciate getting it sorted out very quickly. Differential Revision: http://reviews.llvm.org/D7293 llvm-svn: 227669
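The type-erasure pattern this message refers to can be sketched in a few lines of standard C++. The example is a generic illustration under stated assumptions, not the real TargetTransformInfo interface: `CostInfo`, `DummyCosts`, `SoftFloatCosts`, and `getOperationCost` with an integer opcode are hypothetical names chosen for the sketch.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical cost-model wrapper; not the real TargetTransformInfo API.
class CostInfo {
  // Abstract interface, hidden from clients.
  struct Concept {
    virtual ~Concept() = default;
    virtual int getOperationCost(int Opcode) const = 0;
  };

  // Adapter that forwards to any type T with a matching member function.
  template <typename T> struct Model final : Concept {
    T Impl;
    explicit Model(T I) : Impl(std::move(I)) {}
    int getOperationCost(int Opcode) const override {
      return Impl.getOperationCost(Opcode);
    }
  };

  std::unique_ptr<Concept> Impl;

public:
  // Wrap any implementation type; the concrete type is erased here.
  template <typename T>
  explicit CostInfo(T I) : Impl(std::make_unique<Model<T>>(std::move(I))) {}

  int getOperationCost(int Opcode) const {
    return Impl->getOperationCost(Opcode);
  }
};

// Two unrelated implementations; neither inherits from a common base.
struct DummyCosts {
  int getOperationCost(int) const { return 1; }
};
struct SoftFloatCosts {
  int getOperationCost(int Opcode) const { return Opcode == 0 ? 4 : 1; }
};
```

Clients hold a `CostInfo` by value and never see the virtual interface. The verbosity the message complains about shows up in `Model`, which needs one forwarding function per method of the interface.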
* Fold fcmp in cases where value is provably non-negative. By Arch Robison. | Elena Demikhovsky | 2015-01-28 | 2 | -0/+67
| | | | | | | | | | | | This patch folds fcmp in some cases of interest in Julia. The patch adds a function CannotBeOrderedLessThanZero that returns true if a value is provably not less than zero. I.e. the function returns true if the value is provably -0, +0, positive, or a NaN. The patch extends InstructionSimplify.cpp to fold instances of fcmp where: - the predicate is olt or uge - the first operand is provably not less than zero - the second operand is zero The motivation for handling these cases is optimizing away domain checks for sqrt in Julia for common idioms such as sqrt(x*x+y*y). http://reviews.llvm.org/D6972 llvm-svn: 227298
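The two folds can be checked numerically on the `x*x+y*y` idiom. The sketch below models the `olt` and `uge` predicates against zero on plain `double` values; the function names are hypothetical and nothing here uses the LLVM API.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// x*x + y*y is never ordered-less-than zero: it is +0, positive, or NaN.
inline double sumOfSquares(double X, double Y) { return X * X + Y * Y; }

// fcmp olt V, 0.0: true only if neither operand is NaN and V < 0.
inline bool fcmpOLT0(double V) { return V < 0.0; }

// fcmp uge V, 0.0: true if an operand is NaN or V >= 0.
inline bool fcmpUGE0(double V) { return std::isnan(V) || V >= 0.0; }
```

For any value that cannot be ordered-less-than zero, including -0.0 and NaN, `fcmpOLT0` is false and `fcmpUGE0` is true, so both comparisons can be replaced by constants.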