summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Transforms
Commit message (Collapse)AuthorAgeFilesLines
* Factor out UnrollAnalyzer to Analysis, and add unit tests for it.Michael Zolotukhin2016-02-081-239/+1
| | | | | | | | | | | | | | | | Summary: Unrolling Analyzer is already pretty complicated, and it becomes harder and harder to exercise it with usual IR tests, as with them we can only check the final decision: whether the loop is unrolled or not. This change factors this framework out from LoopUnrollPass to analyses, which allows to use unit tests. The change itself is supposed to be NFC, except adding a couple of tests. I plan to add more tests as I add new functionality and find/fix bugs. Reviewers: chandlerc, hfinkel, sanjoy Subscribers: zzheng, sanjoy, llvm-commits Differential Revision: http://reviews.llvm.org/D16623 llvm-svn: 260169
* rangify; NFCSanjay Patel2016-02-081-7/+5
| | | | llvm-svn: 260151
* [PGO] Differentiate Clang instrumentation and IR level instrumentation profilesRong Xu2016-02-081-0/+22
| | | | | | | | | | | | | | This patch uses one bit in profile version to differentiate Clang instrumentation and IR level instrumentation profiles. PGOInstrumenation generates a COMDAT variable __llvm_profile_raw_version so that the compiler runtime can set the right profile kind. PGOInstrumenation now checks this bit to make sure it's an IR level instrumentation profile. Differential Revision: http://reviews.llvm.org/D15540 llvm-svn: 260146
* fix typos; NFCSanjay Patel2016-02-081-17/+17
| | | | llvm-svn: 260130
* [PGO] Enable compression in pgo instrumentationXinliang David Li2016-02-081-9/+59
| | | | | | | | | | | | This reduces sizes of instrumented object files, final binaries, process images, and raw profile data. The format of the indexed profile data remain the same. Differential Revision: http://reviews.llvm.org/D16388 llvm-svn: 260117
* [SCEV][LAA] Re-commit r260085 and r260086, this time with a fix for the memorySilviu Baranga2016-02-082-2/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | sanitizer issue. The PredicatedScalarEvolution's copy constructor wasn't copying the Generation value, and was leaving it un-initialized. Original commit message: [SCEV][LAA] Add no wrap SCEV predicates and use use them to improve strided pointer detection Summary: This change adds no wrap SCEV predicates with: - support for runtime checking - support for expression rewriting: (sext ({x,+,y}) -> {sext(x),+,sext(y)} (zext ({x,+,y}) -> {zext(x),+,sext(y)} Note that we are sign extending the increment of the SCEV, even for the zext case. This is needed to cover the fairly common case where y would be a (small) negative integer. In order to do this, this change adds two new flags: nusw and nssw that are applicable to AddRecExprs and permit the transformations above. We also change isStridedPtr in LAA to be able to make use of these predicates. With this feature we should now always be able to work around overflow issues in the dependence analysis. Reviewers: mzolotukhin, sanjoy, anemet Subscribers: mzolotukhin, sanjoy, llvm-commits, rengolin, jmolloy, hfinkel Differential Revision: http://reviews.llvm.org/D15412 llvm-svn: 260112
* [JumpThreading] Change a return of ComputeValueKnownInPredecessors()Haicheng Wu2016-02-081-1/+1
| | | | | | | | | Change a return statement of ComputeValueKnownInPredecessors() to be the same as the rest return statements of the function. Otherwise, it might return true with an empty Result when the current basic block has no predecessors and trigger the first assert of JumpThreading::ProcessThreadableEdges(). llvm-svn: 260110
* [SLP] Fix placement of debug statement (NFC)Igor Breger2016-02-081-7/+7
| | | | | | | | By Ayal Zaks (ayal.zaks@intel.com) Differential Revision: http://reviews.llvm.org/D16976 llvm-svn: 260094
* Revert r260086 and r260085. They have broken the memorySilviu Baranga2016-02-082-1/+2
| | | | | | sanitizer bots. llvm-svn: 260087
* [LoopVersioning] Don't assert when there are no memchecksSilviu Baranga2016-02-081-1/+0
| | | | | | | | | | | | We shouldn't assert when there are no memchecks, since we can have SCEV checks. There is already an assert covering the case where there are no SCEV checks or memchecks. This also changes the LAA pointer wrapping versioning test to use the loop versioning pass (this was how I managed to trigger the assert in the loop versioning pass). llvm-svn: 260086
* [SCEV][LAA] Add no wrap SCEV predicates and use use them to improve strided ↵Silviu Baranga2016-02-081-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | pointer detection Summary: This change adds no wrap SCEV predicates with: - support for runtime checking - support for expression rewriting: (sext ({x,+,y}) -> {sext(x),+,sext(y)} (zext ({x,+,y}) -> {zext(x),+,sext(y)} Note that we are sign extending the increment of the SCEV, even for the zext case. This is needed to cover the fairly common case where y would be a (small) negative integer. In order to do this, this change adds two new flags: nusw and nssw that are applicable to AddRecExprs and permit the transformations above. We also change isStridedPtr in LAA to be able to make use of these predicates. With this feature we should now always be able to work around overflow issues in the dependence analysis. Reviewers: mzolotukhin, sanjoy, anemet Subscribers: mzolotukhin, sanjoy, llvm-commits, rengolin, jmolloy, hfinkel Differential Revision: http://reviews.llvm.org/D15412 llvm-svn: 260085
* [asan] Introduce new hidden -asan-use-private-alias option.Maxim Ostapenko2016-02-081-6/+44
| | | | | | | | | | | | | | | | As discussed in https://github.com/google/sanitizers/issues/398, with current implementation of poisoning globals we can have some CHECK failures or false positives in case of mixing instrumented and non-instrumented code due to ASan poisons innocent globals from non-sanitized binary/library. We can use private aliases to avoid such errors. In addition, to preserve ODR violation detection, we introduce new __odr_asan_gen_XXX symbol for each instrumented global that indicates if this global was already registered. To detect ODR violation in runtime, we should only check the value of indicator and report an error if it isn't equal to zero. Differential Revision: http://reviews.llvm.org/D15642 llvm-svn: 260075
* [X86][AVX512] add intrinsics of Scalar FP to integer conversion with ↵Asaf Badouh2016-02-071-4/+4
| | | | | | | | rounding mode Differential Revision: http://reviews.llvm.org/D16629 llvm-svn: 260033
* Don't use module context here. It's unnecessary and makes it harder to write ↵Daniel Berlin2016-02-071-2/+2
| | | | | | unittests llvm-svn: 260015
* Compute live-in for MemorySSADaniel Berlin2016-02-071-1/+41
| | | | llvm-svn: 260014
* Only insert into definingblocks once per blockDaniel Berlin2016-02-071-1/+4
| | | | llvm-svn: 260013
* New Loop Versioning LICM PassAshutosh Nema2016-02-064-0/+636
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: When alias analysis is uncertain about the aliasing between any two accesses, it will return MayAlias. This uncertainty from alias analysis restricts LICM from proceeding further. In cases where alias analysis is uncertain we might use loop versioning as an alternative. Loop Versioning will create a version of the loop with aggressive aliasing assumptions in addition to the original with conservative (default) aliasing assumptions. The version of the loop making aggressive aliasing assumptions will have all the memory accesses marked as no-alias. These two versions of loop will be preceded by a memory runtime check. This runtime check consists of bound checks for all unique memory accessed in loop, and it ensures the lack of memory aliasing. The result of the runtime check determines which of the loop versions is executed: If the runtime check detects any memory aliasing, then the original loop is executed. Otherwise, the version with aggressive aliasing assumptions is used. The pass is off by default and can be enabled with command line option -enable-loop-versioning-licm. Reviewers: hfinkel, anemet, chatur01, reames Subscribers: MatzeB, grosser, joker.eph, sanjoy, javed.absar, sbaranga, llvm-commits Differential Revision: http://reviews.llvm.org/D9151 llvm-svn: 259986
* [LoopUnrolling] Try harder to avoid rebuilding LCSSA when possible.Michael Zolotukhin2016-02-051-7/+56
| | | | | | | | | | | | | In r255133 (reapplied r253126) we started to avoid redundant recomputation of LCSSA after loop-unrolling. This patch moves one step further in this direction - now we can avoid it for much wider range of loops, as we start to look at IR and try to figure out if the transformation actually breaks LCSSA phis or makes it necessary to insert new ones. Differential Revision: http://reviews.llvm.org/D16838 llvm-svn: 259869
* [RS4GC] Pass DenseMap by reference, NFCJoseph Tremoulet2016-02-051-5/+4
| | | | | | | | | | | | | | | Summary: Passing the rematerialized values map to insertRematerializationStores by value looks to be a simple oversight; update it to pass by reference. Reviewers: reames, sanjoy Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D16911 llvm-svn: 259867
* [LoopLoadElim] Don't allow versioning when optForSizeAdam Nemet2016-02-051-2/+9
| | | | | | This was requested in the review of D16300. llvm-svn: 259861
* [SCEV] Try to reuse existing value during SCEV expansionWei Mi2016-02-041-4/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Current SCEV expansion will expand SCEV as a sequence of operations and doesn't utilize the value already existed. This will introduce redundent computation which may not be cleaned up throughly by following optimizations. This patch introduces an ExprValueMap which is a map from SCEV to the set of equal values with the same SCEV. When a SCEV is expanded, the set of values is checked and reused whenever possible before generating a sequence of operations. The original commit triggered regressions in Polly tests. The regressions exposed two problems which have been fixed in current version. 1. Polly will generate a new function based on the old one. To generate an instruction for the new function, it builds SCEV for the old instruction, applies some tranformation on the SCEV generated, then expands the transformed SCEV and insert the expanded value into new function. Because SCEV expansion may reuse value cached in ExprValueMap, the value in old function may be inserted into new function, which is wrong. In SCEVExpander::expand, there is a logic to check the cached value to be used should dominate the insertion point. However, for the above case, the check always passes. That is because the insertion point is in a new function, which is unreachable from the old function. However for unreachable node, DominatorTreeBase::dominates thinks it will be dominated by any other node. The fix is to simply add a check that the cached value to be used in expansion should be in the same function as the insertion point instruction. 2. When the SCEV is of scConstant type, expanding it directly is cheaper than reusing a normal value cached. Although in the cached value set in ExprValueMap, there is a Constant type value, but it is not easy to find it out -- the cached Value set is not sorted according to the potential cost. Existing reuse logic in SCEVExpander::expand simply chooses the first legal element from the cached value set. The fix is that when the SCEV is of scConstant type, don't try the reuse logic. simply expand it. Differential Revision: http://reviews.llvm.org/D12090 llvm-svn: 259736
* [SimplifyCFG] Fix for "endless" loop after dead code removal (Alternative toGerolf Hoflehner2016-02-031-2/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | D16251) Summary: This is a simpler fix to the problem than the dominator approach in http://reviews.llvm.org/D16251. It adds only values into the gather() while loop that have been seen before. The actual endless loop is in the constant compare gather() routine in Utils/SimplifyCFG.cpp. The same value ret.0.off0.i is pushed back into the queue: %.ret.0.off0.i = or i1 %.ret.0.off0.i, %cmp10.i Here is what happens at the IR level: for.cond.i: ; preds = %if.end6.i, %if.end.i54 %ix.0.i = phi i32 [ 0, %if.end.i54 ], [ %inc.i55, %if.end6.i ] %ret.0.off0.i = phi i1 [false, %if.end.i54], [%.ret.0.off0.i, %if.end6.i] <<< %cmp2.i = icmp ult i32 %ix.0.i, %11 br i1 %cmp2.i, label %for.body.i, label %LBJ_TmpSimpleNeedExt.exit if.end6.i: ; preds = %for.body.i %cmp10.i = icmp ugt i32 %conv.i, %add9.i %.ret.0.off0.i = or i1 %ret.0.off0.i, %cmp10.i <<< When if.end.i54 gets eliminated which removes the definition of ret.0.off0.i. The result is the expression %.ret.0.off0.i = or i1 %.ret.0.off0.i, %cmp10.i (Note the first ‘or’ operand is now %.ret.0.off0.i, and *NOT* %ret.0.off0.i). And now there is use of .ret.0.off0.i before a definition which triggers the “endless” loop in gather(): while(!DFT.empty()) { V = DFT.pop_back_val(); // V is .ret.0.off0.i if (Instruction *I = dyn_cast<Instruction>(V)) { // If it is a || (or && depending on isEQ), process the operands. if (I->getOpcode() == (isEQ ? Instruction::Or : Instruction::And)) { DFT.push_back(I->getOperand(1)); // This is now .ret.0.off0.i also DFT.push_back(I->getOperand(0)); continue; // “endless loop” for .ret.0.off0.i } Reviewers: reames, ahatanak Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D16839 llvm-svn: 259730
* [InstrProfiling] Fix a comment (NFC)Vedant Kumar2016-02-031-1/+1
| | | | llvm-svn: 259727
* Minor code cleanups. NFC.Junmo Park2016-02-031-18/+18
| | | | llvm-svn: 259725
* [LoopStrengthReduce] Don't rewrite PHIs with incoming values from CatchSwitchesDavid Majnemer2016-02-031-0/+11
| | | | | | | | | | Bail out if we have a PHI on an EHPad that gets a value from a CatchSwitchInst. Because the CatchSwitchInst cannot be split, there is no good place to stick any instructions. This fixes PR26373. llvm-svn: 259702
* Revert r259662, which caused regressions on polly tests.Wei Mi2016-02-031-16/+4
| | | | llvm-svn: 259675
* [InstCombine] Revert r238452: Fold IntToPtr and PtrToInt into preceding loads.Quentin Colombet2016-02-031-10/+5
| | | | | | | | | | | | | | | | | | | | | | | | | According to git bisect, this is the root cause of a miscompile for Regex in libLLVMSupport. I am still working on reducing a test case. The actual bug may be elsewhere and this commit just exposed it. Anyway, at the moment, to reproduce, follow these steps: 1. Build clang and libLTO in release mode. 2. Create a new build directory <stage2> and cd into it. 3. Use clang and libLTO from #1 to build llvm-extract in Release mode + asserts using -O2 -flto 4. Run llvm-extract -ralias '.*bar' -S test/Other/extract-alias.ll Result: program doesn't contain global named '.*bar'! Expected result: @a0a0bar = alias void ()* @bar @a0bar = alias void ()* @bar declare void @bar() Note: In step #3, if you don't use lto or asserts, the miscompile disappears. llvm-svn: 259674
* [SCEV] Try to reuse existing value during SCEV expansionWei Mi2016-02-031-4/+16
| | | | | | | | | | | | | | | | Current SCEV expansion will expand SCEV as a sequence of operations and doesn't utilize the value already existed. This will introduce redundent computation which may not be cleaned up throughly by following optimizations. This patch introduces an ExprValueMap which is a map from SCEV to the set of equal values with the same SCEV. When a SCEV is expanded, the set of values is checked and reused whenever possible before generating a sequence of operations. Differential Revision: http://reviews.llvm.org/D12090 llvm-svn: 259662
* LowerBitSets: Don't bother to do any work if the llvm.bitset.test intrinsic ↵Peter Collingbourne2016-02-031-1/+1
| | | | | | is unused. llvm-svn: 259625
* Add #include "llvm/Support/raw_ostream.h" to fix Windows build.Peter Collingbourne2016-02-031-0/+1
| | | | llvm-svn: 259623
* Transforms: Move GlobalOpt's Evaluator to Utils where it can be reused.Peter Collingbourne2016-02-033-655/+598
| | | | llvm-svn: 259621
* [LoopVersioning] Expose loop versioning as a pass tooAdam Nemet2016-02-032-0/+75
| | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: LoopVersioning is a transform utility that transform passes can use to run-time disambiguate may-aliasing accesses. I'd like to also expose as pass to allow it to be unit-tested. I am planning to add support for non-aliasing annotation in LoopVersioning and I'd like to be able to write tests directly using this pass. (After that feature is done, the pass could also be used to look for optimization opportunities that are hidden behind incomplete alias information at compile time.) The pass drives LoopVersioning in its default way which is to fully disambiguate may-aliasing accesses no matter how many checks are required. Reviewers: hfinkel, ashutosh.nema, sbaranga Subscribers: zzheng, mssimpso, llvm-commits, sanjoy Differential Revision: http://reviews.llvm.org/D16612 llvm-svn: 259610
* Attempt #2 to unbreak r259595.George Burgess IV2016-02-021-4/+4
| | | | llvm-svn: 259602
* Attempt to fix builds broken by r259595.George Burgess IV2016-02-021-1/+1
| | | | llvm-svn: 259599
* This patch adds MemorySSA to LLVM.George Burgess IV2016-02-023-0/+942
| | | | | | | | | Please see include/llvm/Transforms/Utils/MemorySSA.h for a description of MemorySSA, and what it does. Differential Revision: http://reviews.llvm.org/D7864 llvm-svn: 259595
* [asan] Add iOS support to AddressSanitzierAnna Zaks2016-02-021-3/+11
| | | | | | Differential Revision: http://reviews.llvm.org/D15625 llvm-svn: 259586
* Fix Clang-tidy readability-redundant-control-flow warnings; other minor fixes.Eugene Zelenko2016-02-022-3/+0
| | | | | | Differential revision: http://reviews.llvm.org/D16793 llvm-svn: 259539
* function names start with a lowercase letter; NFCSanjay Patel2016-02-0114-324/+324
| | | | llvm-svn: 259425
* [InstCombine] simplify masked scatter/gather intrinsics with zero masksSanjay Patel2016-02-011-4/+22
| | | | | | | | | | | A masked scatter with a zero mask means there's no store. A masked gather with a zero mask means the passthru arg is returned. This is a continuation of: http://reviews.llvm.org/rL259369 http://reviews.llvm.org/rL259392 llvm-svn: 259421
* [InstCombine] simplify masked store intrinsics with all ones or zeros masksSanjay Patel2016-02-011-1/+21
| | | | | | | | | | A masked store with a zero mask means there's no store. A masked store with an allOnes mask means it's a normal vector store. This is a continuation of: http://reviews.llvm.org/rL259369 llvm-svn: 259392
* [InstCombine] Don't transform (X+INT_MAX)>=(Y+INT_MAX) -> (X<=Y)David Majnemer2016-02-011-1/+1
| | | | | | | | | This miscompile came about because we tried to use a transform which was only appropriate for xor operators when addition was present. This fixes PR26407. llvm-svn: 259375
* [InstCombine] simplify masked load intrinsics with all ones or zeros masksSanjay Patel2016-02-011-0/+30
| | | | | | | | | A masked load with a zero mask means there's no load. A masked load with an allOnes mask means it's a normal vector load. Differential Revision: http://reviews.llvm.org/D16691 llvm-svn: 259369
* [LV] Rename RdxPHIsToFix to PHIsToFix (NFC)Matthew Simpson2016-02-011-39/+32
| | | | | | | | | | In the future, we will vectorize recurrences other than reductions. This patch renames a few variables and updates their associated comments to enable them to be reused for non-reduction PHI nodes. This change was requested in the review for D16197. llvm-svn: 259364
* Reapply commit r258404 with fix.Matthew Simpson2016-02-011-11/+231
| | | | | | | The previous patch caused PR26364. The fix is to ensure that we don't enter a cycle when iterating over use-def chains. llvm-svn: 259357
* add helper function for minnum/maxnum ; NFCSanjay Patel2016-01-311-74/+80
| | | | llvm-svn: 259326
* use range-based for loop; NFCSanjay Patel2016-01-311-3/+3
| | | | llvm-svn: 259325
* fix formatting; NFCSanjay Patel2016-01-311-13/+13
| | | | llvm-svn: 259324
* simplify; NFCSanjay Patel2016-01-311-8/+5
| | | | llvm-svn: 259323
* Convert int to Twine instead of using utostr since it was already being ↵Craig Topper2016-01-311-1/+1
| | | | | | added to a Twine. NFC llvm-svn: 259308
* InstCombine: fabs(x) * fabs(x) -> x * xMatt Arsenault2016-01-301-4/+15
| | | | llvm-svn: 259295
OpenPOWER on IntegriCloud