summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Transforms
Commit message (Collapse)AuthorAgeFilesLines
* StructurizeCFG: Initialize SkipUniformRegions in the default constructorTom Stellard2016-02-101-1/+1
| | | | | | This should fix some random bot failures caused by r260336. llvm-svn: 260342
* StructurizeCFG: Add an option for skipping regions with only uniform branchesTom Stellard2016-02-101-3/+38
| | | | | | | | | | | | | | Summary: Tests for this will be added once the AMDGPU backend enables this option. Reviewers: arsenm Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D16602 llvm-svn: 260336
* Add convergent-removing bits to FunctionAttrs pass.Justin Lebar2016-02-091-0/+63
| | | | | | | | | | | | | | | | | Summary: Remove the convergent attribute on any functions which provably do not contain or invoke any convergent functions. After this change, we'll be able to modify clang to conservatively add 'convergent' to all functions when compiling CUDA. Reviewers: jingyue, joker.eph Subscribers: llvm-commits, tra, jhen, hfinkel, resistor, chandlerc, arsenm Differential Revision: http://reviews.llvm.org/D17013 llvm-svn: 260319
* Fix GCC build.Peter Collingbourne2016-02-091-0/+4
| | | | llvm-svn: 260317
* WholeProgramDevirt: introduce.Peter Collingbourne2016-02-094-0/+739
| | | | | | | | | | | | | | | | | | | | | | | | | | | | This pass implements whole program optimization of virtual calls in cases where we know (via bitset information) that the list of callees is fixed. This includes the following: - Single implementation devirtualization: if a virtual call has a single possible callee, replace all calls with a direct call to that callee. - Virtual constant propagation: if the virtual function's return type is an integer <=64 bits and all possible callees are readnone, for each class and each list of constant arguments: evaluate the function, store the return value alongside the virtual table, and rewrite each virtual call as a load from the virtual table. - Uniform return value optimization: if the conditions for virtual constant propagation hold and each function returns the same constant value, replace each virtual call with that constant. - Unique return value optimization for i1 return values: if the conditions for virtual constant propagation hold and a single vtable's function returns 0, or a single vtable's function returns 1, replace each virtual call with a comparison of the vptr against that vtable's address. Differential Revision: http://reviews.llvm.org/D16795 llvm-svn: 260312
* [InstCombine][GC] Handle gc.relocations of vector typePhilip Reames2016-02-091-25/+22
| | | | | | | | We introduced gc.relocates of vector-of-pointer types a couple of weeks back. Somehow, I missed updating the InstCombine rule to account for this. If we hit this code path with a vector-of-pointers gc.relocate, we'd crash on a cast<PointerType>. I also took the chance to do a bit of code style cleanup. llvm-svn: 260279
* [FunctionAttrs] Fix SCC logic around operand bundlesSanjoy Das2016-02-091-2/+6
| | | | | | | | | | | FunctionAttrs does an "optimistic" analysis of SCCs as a unit, which means normally it is able to disregard calls from an SCC into itself. However, calls and invokes with operand bundles are allowed to have memory effects not fully described by the memory effects on the call target, so we can't be optimistic around operand-bundled calls from an SCC into itself. llvm-svn: 260244
* Add an "addUsedAAAnalyses" helper functionSanjoy Das2016-02-093-0/+3
| | | | | | | | | | | | | | | | | | | Summary: Passes that call `getAnalysisIfAvailable<T>` also need to call `addUsedIfAvailable<T>` in `getAnalysisUsage` to indicate to the legacy pass manager that it uses `T`. This contract was being violated by passes that used `createLegacyPMAAResults`. This change fixes this by exposing a helper in AliasAnalysis.h, `addUsedAAAnalyses`, that is complementary to createLegacyPMAAResults and does the right thing when called from `getAnalysisUsage`. Reviewers: chandlerc Subscribers: mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D17010 llvm-svn: 260183
* [PGO] Revert r260146 as it breaks Darwin platforms.Rong Xu2016-02-081-22/+0
| | | | | | | r260146 | xur | 2016-02-08 13:07:46 -0800 (Mon, 08 Feb 2016) | 13 lines [PGO] Differentiate Clang instrumentation and IR level instrumentation profiles llvm-svn: 260170
* Factor out UnrollAnalyzer to Analysis, and add unit tests for it.Michael Zolotukhin2016-02-081-239/+1
| | | | | | | | | | | | | | | | Summary: Unrolling Analyzer is already pretty complicated, and it becomes harder and harder to exercise it with usual IR tests, as with them we can only check the final decision: whether the loop is unrolled or not. This change factors this framework out from LoopUnrollPass to analyses, which allows to use unit tests. The change itself is supposed to be NFC, except adding a couple of tests. I plan to add more tests as I add new functionality and find/fix bugs. Reviewers: chandlerc, hfinkel, sanjoy Subscribers: zzheng, sanjoy, llvm-commits Differential Revision: http://reviews.llvm.org/D16623 llvm-svn: 260169
* rangify; NFCSanjay Patel2016-02-081-7/+5
| | | | llvm-svn: 260151
* [PGO] Differentiate Clang instrumentation and IR level instrumentation profilesRong Xu2016-02-081-0/+22
| | | | | | | | | | | | | | This patch uses one bit in profile version to differentiate Clang instrumentation and IR level instrumentation profiles. PGOInstrumenation generates a COMDAT variable __llvm_profile_raw_version so that the compiler runtime can set the right profile kind. PGOInstrumenation now checks this bit to make sure it's an IR level instrumentation profile. Differential Revision: http://reviews.llvm.org/D15540 llvm-svn: 260146
* fix typos; NFCSanjay Patel2016-02-081-17/+17
| | | | llvm-svn: 260130
* [PGO] Enable compression in pgo instrumentationXinliang David Li2016-02-081-9/+59
| | | | | | | | | | | | This reduces sizes of instrumented object files, final binaries, process images, and raw profile data. The format of the indexed profile data remain the same. Differential Revision: http://reviews.llvm.org/D16388 llvm-svn: 260117
* [SCEV][LAA] Re-commit r260085 and r260086, this time with a fix for the memorySilviu Baranga2016-02-082-2/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | sanitizer issue. The PredicatedScalarEvolution's copy constructor wasn't copying the Generation value, and was leaving it un-initialized. Original commit message: [SCEV][LAA] Add no wrap SCEV predicates and use use them to improve strided pointer detection Summary: This change adds no wrap SCEV predicates with: - support for runtime checking - support for expression rewriting: (sext ({x,+,y}) -> {sext(x),+,sext(y)} (zext ({x,+,y}) -> {zext(x),+,sext(y)} Note that we are sign extending the increment of the SCEV, even for the zext case. This is needed to cover the fairly common case where y would be a (small) negative integer. In order to do this, this change adds two new flags: nusw and nssw that are applicable to AddRecExprs and permit the transformations above. We also change isStridedPtr in LAA to be able to make use of these predicates. With this feature we should now always be able to work around overflow issues in the dependence analysis. Reviewers: mzolotukhin, sanjoy, anemet Subscribers: mzolotukhin, sanjoy, llvm-commits, rengolin, jmolloy, hfinkel Differential Revision: http://reviews.llvm.org/D15412 llvm-svn: 260112
* [JumpThreading] Change a return of ComputeValueKnownInPredecessors()Haicheng Wu2016-02-081-1/+1
| | | | | | | | | Change a return statement of ComputeValueKnownInPredecessors() to be the same as the rest return statements of the function. Otherwise, it might return true with an empty Result when the current basic block has no predecessors and trigger the first assert of JumpThreading::ProcessThreadableEdges(). llvm-svn: 260110
* [SLP] Fix placement of debug statement (NFC)Igor Breger2016-02-081-7/+7
| | | | | | | | By Ayal Zaks (ayal.zaks@intel.com) Differential Revision: http://reviews.llvm.org/D16976 llvm-svn: 260094
* Revert r260086 and r260085. They have broken the memorySilviu Baranga2016-02-082-1/+2
| | | | | | sanitizer bots. llvm-svn: 260087
* [LoopVersioning] Don't assert when there are no memchecksSilviu Baranga2016-02-081-1/+0
| | | | | | | | | | | | We shouldn't assert when there are no memchecks, since we can have SCEV checks. There is already an assert covering the case where there are no SCEV checks or memchecks. This also changes the LAA pointer wrapping versioning test to use the loop versioning pass (this was how I managed to trigger the assert in the loop versioning pass). llvm-svn: 260086
* [SCEV][LAA] Add no wrap SCEV predicates and use use them to improve strided ↵Silviu Baranga2016-02-081-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | pointer detection Summary: This change adds no wrap SCEV predicates with: - support for runtime checking - support for expression rewriting: (sext ({x,+,y}) -> {sext(x),+,sext(y)} (zext ({x,+,y}) -> {zext(x),+,sext(y)} Note that we are sign extending the increment of the SCEV, even for the zext case. This is needed to cover the fairly common case where y would be a (small) negative integer. In order to do this, this change adds two new flags: nusw and nssw that are applicable to AddRecExprs and permit the transformations above. We also change isStridedPtr in LAA to be able to make use of these predicates. With this feature we should now always be able to work around overflow issues in the dependence analysis. Reviewers: mzolotukhin, sanjoy, anemet Subscribers: mzolotukhin, sanjoy, llvm-commits, rengolin, jmolloy, hfinkel Differential Revision: http://reviews.llvm.org/D15412 llvm-svn: 260085
* [asan] Introduce new hidden -asan-use-private-alias option.Maxim Ostapenko2016-02-081-6/+44
| | | | | | | | | | | | | | | | As discussed in https://github.com/google/sanitizers/issues/398, with current implementation of poisoning globals we can have some CHECK failures or false positives in case of mixing instrumented and non-instrumented code due to ASan poisons innocent globals from non-sanitized binary/library. We can use private aliases to avoid such errors. In addition, to preserve ODR violation detection, we introduce new __odr_asan_gen_XXX symbol for each instrumented global that indicates if this global was already registered. To detect ODR violation in runtime, we should only check the value of indicator and report an error if it isn't equal to zero. Differential Revision: http://reviews.llvm.org/D15642 llvm-svn: 260075
* [X86][AVX512] add intrinsics of Scalar FP to integer conversion with ↵Asaf Badouh2016-02-071-4/+4
| | | | | | | | rounding mode Differential Revision: http://reviews.llvm.org/D16629 llvm-svn: 260033
* Don't use module context here. It's unnecessary and makes it harder to write ↵Daniel Berlin2016-02-071-2/+2
| | | | | | unittests llvm-svn: 260015
* Compute live-in for MemorySSADaniel Berlin2016-02-071-1/+41
| | | | llvm-svn: 260014
* Only insert into definingblocks once per blockDaniel Berlin2016-02-071-1/+4
| | | | llvm-svn: 260013
* New Loop Versioning LICM PassAshutosh Nema2016-02-064-0/+636
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: When alias analysis is uncertain about the aliasing between any two accesses, it will return MayAlias. This uncertainty from alias analysis restricts LICM from proceeding further. In cases where alias analysis is uncertain we might use loop versioning as an alternative. Loop Versioning will create a version of the loop with aggressive aliasing assumptions in addition to the original with conservative (default) aliasing assumptions. The version of the loop making aggressive aliasing assumptions will have all the memory accesses marked as no-alias. These two versions of loop will be preceded by a memory runtime check. This runtime check consists of bound checks for all unique memory accessed in loop, and it ensures the lack of memory aliasing. The result of the runtime check determines which of the loop versions is executed: If the runtime check detects any memory aliasing, then the original loop is executed. Otherwise, the version with aggressive aliasing assumptions is used. The pass is off by default and can be enabled with command line option -enable-loop-versioning-licm. Reviewers: hfinkel, anemet, chatur01, reames Subscribers: MatzeB, grosser, joker.eph, sanjoy, javed.absar, sbaranga, llvm-commits Differential Revision: http://reviews.llvm.org/D9151 llvm-svn: 259986
* [LoopUnrolling] Try harder to avoid rebuilding LCSSA when possible.Michael Zolotukhin2016-02-051-7/+56
| | | | | | | | | | | | | In r255133 (reapplied r253126) we started to avoid redundant recomputation of LCSSA after loop-unrolling. This patch moves one step further in this direction - now we can avoid it for much wider range of loops, as we start to look at IR and try to figure out if the transformation actually breaks LCSSA phis or makes it necessary to insert new ones. Differential Revision: http://reviews.llvm.org/D16838 llvm-svn: 259869
* [RS4GC] Pass DenseMap by reference, NFCJoseph Tremoulet2016-02-051-5/+4
| | | | | | | | | | | | | | | Summary: Passing the rematerialized values map to insertRematerializationStores by value looks to be a simple oversight; update it to pass by reference. Reviewers: reames, sanjoy Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D16911 llvm-svn: 259867
* [LoopLoadElim] Don't allow versioning when optForSizeAdam Nemet2016-02-051-2/+9
| | | | | | This was requested in the review of D16300. llvm-svn: 259861
* [SCEV] Try to reuse existing value during SCEV expansionWei Mi2016-02-041-4/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Current SCEV expansion will expand SCEV as a sequence of operations and doesn't utilize the value already existed. This will introduce redundent computation which may not be cleaned up throughly by following optimizations. This patch introduces an ExprValueMap which is a map from SCEV to the set of equal values with the same SCEV. When a SCEV is expanded, the set of values is checked and reused whenever possible before generating a sequence of operations. The original commit triggered regressions in Polly tests. The regressions exposed two problems which have been fixed in current version. 1. Polly will generate a new function based on the old one. To generate an instruction for the new function, it builds SCEV for the old instruction, applies some tranformation on the SCEV generated, then expands the transformed SCEV and insert the expanded value into new function. Because SCEV expansion may reuse value cached in ExprValueMap, the value in old function may be inserted into new function, which is wrong. In SCEVExpander::expand, there is a logic to check the cached value to be used should dominate the insertion point. However, for the above case, the check always passes. That is because the insertion point is in a new function, which is unreachable from the old function. However for unreachable node, DominatorTreeBase::dominates thinks it will be dominated by any other node. The fix is to simply add a check that the cached value to be used in expansion should be in the same function as the insertion point instruction. 2. When the SCEV is of scConstant type, expanding it directly is cheaper than reusing a normal value cached. Although in the cached value set in ExprValueMap, there is a Constant type value, but it is not easy to find it out -- the cached Value set is not sorted according to the potential cost. Existing reuse logic in SCEVExpander::expand simply chooses the first legal element from the cached value set. The fix is that when the SCEV is of scConstant type, don't try the reuse logic. simply expand it. Differential Revision: http://reviews.llvm.org/D12090 llvm-svn: 259736
* [SimplifyCFG] Fix for "endless" loop after dead code removal (Alternative toGerolf Hoflehner2016-02-031-2/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | D16251) Summary: This is a simpler fix to the problem than the dominator approach in http://reviews.llvm.org/D16251. It adds only values into the gather() while loop that have been seen before. The actual endless loop is in the constant compare gather() routine in Utils/SimplifyCFG.cpp. The same value ret.0.off0.i is pushed back into the queue: %.ret.0.off0.i = or i1 %.ret.0.off0.i, %cmp10.i Here is what happens at the IR level: for.cond.i: ; preds = %if.end6.i, %if.end.i54 %ix.0.i = phi i32 [ 0, %if.end.i54 ], [ %inc.i55, %if.end6.i ] %ret.0.off0.i = phi i1 [false, %if.end.i54], [%.ret.0.off0.i, %if.end6.i] <<< %cmp2.i = icmp ult i32 %ix.0.i, %11 br i1 %cmp2.i, label %for.body.i, label %LBJ_TmpSimpleNeedExt.exit if.end6.i: ; preds = %for.body.i %cmp10.i = icmp ugt i32 %conv.i, %add9.i %.ret.0.off0.i = or i1 %ret.0.off0.i, %cmp10.i <<< When if.end.i54 gets eliminated which removes the definition of ret.0.off0.i. The result is the expression %.ret.0.off0.i = or i1 %.ret.0.off0.i, %cmp10.i (Note the first ‘or’ operand is now %.ret.0.off0.i, and *NOT* %ret.0.off0.i). And now there is use of .ret.0.off0.i before a definition which triggers the “endless” loop in gather(): while(!DFT.empty()) { V = DFT.pop_back_val(); // V is .ret.0.off0.i if (Instruction *I = dyn_cast<Instruction>(V)) { // If it is a || (or && depending on isEQ), process the operands. if (I->getOpcode() == (isEQ ? Instruction::Or : Instruction::And)) { DFT.push_back(I->getOperand(1)); // This is now .ret.0.off0.i also DFT.push_back(I->getOperand(0)); continue; // “endless loop” for .ret.0.off0.i } Reviewers: reames, ahatanak Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D16839 llvm-svn: 259730
* [InstrProfiling] Fix a comment (NFC)Vedant Kumar2016-02-031-1/+1
| | | | llvm-svn: 259727
* Minor code cleanups. NFC.Junmo Park2016-02-031-18/+18
| | | | llvm-svn: 259725
* [LoopStrengthReduce] Don't rewrite PHIs with incoming values from CatchSwitchesDavid Majnemer2016-02-031-0/+11
| | | | | | | | | | Bail out if we have a PHI on an EHPad that gets a value from a CatchSwitchInst. Because the CatchSwitchInst cannot be split, there is no good place to stick any instructions. This fixes PR26373. llvm-svn: 259702
* Revert r259662, which caused regressions on polly tests.Wei Mi2016-02-031-16/+4
| | | | llvm-svn: 259675
* [InstCombine] Revert r238452: Fold IntToPtr and PtrToInt into preceding loads.Quentin Colombet2016-02-031-10/+5
| | | | | | | | | | | | | | | | | | | | | | | | | According to git bisect, this is the root cause of a miscompile for Regex in libLLVMSupport. I am still working on reducing a test case. The actual bug may be elsewhere and this commit just exposed it. Anyway, at the moment, to reproduce, follow these steps: 1. Build clang and libLTO in release mode. 2. Create a new build directory <stage2> and cd into it. 3. Use clang and libLTO from #1 to build llvm-extract in Release mode + asserts using -O2 -flto 4. Run llvm-extract -ralias '.*bar' -S test/Other/extract-alias.ll Result: program doesn't contain global named '.*bar'! Expected result: @a0a0bar = alias void ()* @bar @a0bar = alias void ()* @bar declare void @bar() Note: In step #3, if you don't use lto or asserts, the miscompile disappears. llvm-svn: 259674
* [SCEV] Try to reuse existing value during SCEV expansionWei Mi2016-02-031-4/+16
| | | | | | | | | | | | | | | | Current SCEV expansion will expand SCEV as a sequence of operations and doesn't utilize the value already existed. This will introduce redundent computation which may not be cleaned up throughly by following optimizations. This patch introduces an ExprValueMap which is a map from SCEV to the set of equal values with the same SCEV. When a SCEV is expanded, the set of values is checked and reused whenever possible before generating a sequence of operations. Differential Revision: http://reviews.llvm.org/D12090 llvm-svn: 259662
* LowerBitSets: Don't bother to do any work if the llvm.bitset.test intrinsic ↵Peter Collingbourne2016-02-031-1/+1
| | | | | | is unused. llvm-svn: 259625
* Add #include "llvm/Support/raw_ostream.h" to fix Windows build.Peter Collingbourne2016-02-031-0/+1
| | | | llvm-svn: 259623
* Transforms: Move GlobalOpt's Evaluator to Utils where it can be reused.Peter Collingbourne2016-02-033-655/+598
| | | | llvm-svn: 259621
* [LoopVersioning] Expose loop versioning as a pass tooAdam Nemet2016-02-032-0/+75
| | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: LoopVersioning is a transform utility that transform passes can use to run-time disambiguate may-aliasing accesses. I'd like to also expose as pass to allow it to be unit-tested. I am planning to add support for non-aliasing annotation in LoopVersioning and I'd like to be able to write tests directly using this pass. (After that feature is done, the pass could also be used to look for optimization opportunities that are hidden behind incomplete alias information at compile time.) The pass drives LoopVersioning in its default way which is to fully disambiguate may-aliasing accesses no matter how many checks are required. Reviewers: hfinkel, ashutosh.nema, sbaranga Subscribers: zzheng, mssimpso, llvm-commits, sanjoy Differential Revision: http://reviews.llvm.org/D16612 llvm-svn: 259610
* Attempt #2 to unbreak r259595.George Burgess IV2016-02-021-4/+4
| | | | llvm-svn: 259602
* Attempt to fix builds broken by r259595.George Burgess IV2016-02-021-1/+1
| | | | llvm-svn: 259599
* This patch adds MemorySSA to LLVM.George Burgess IV2016-02-023-0/+942
| | | | | | | | | Please see include/llvm/Transforms/Utils/MemorySSA.h for a description of MemorySSA, and what it does. Differential Revision: http://reviews.llvm.org/D7864 llvm-svn: 259595
* [asan] Add iOS support to AddressSanitzierAnna Zaks2016-02-021-3/+11
| | | | | | Differential Revision: http://reviews.llvm.org/D15625 llvm-svn: 259586
* Fix Clang-tidy readability-redundant-control-flow warnings; other minor fixes.Eugene Zelenko2016-02-022-3/+0
| | | | | | Differential revision: http://reviews.llvm.org/D16793 llvm-svn: 259539
* function names start with a lowercase letter; NFCSanjay Patel2016-02-0114-324/+324
| | | | llvm-svn: 259425
* [InstCombine] simplify masked scatter/gather intrinsics with zero masksSanjay Patel2016-02-011-4/+22
| | | | | | | | | | | A masked scatter with a zero mask means there's no store. A masked gather with a zero mask means the passthru arg is returned. This is a continuation of: http://reviews.llvm.org/rL259369 http://reviews.llvm.org/rL259392 llvm-svn: 259421
* [InstCombine] simplify masked store intrinsics with all ones or zeros masksSanjay Patel2016-02-011-1/+21
| | | | | | | | | | A masked store with a zero mask means there's no store. A masked store with an allOnes mask means it's a normal vector store. This is a continuation of: http://reviews.llvm.org/rL259369 llvm-svn: 259392
* [InstCombine] Don't transform (X+INT_MAX)>=(Y+INT_MAX) -> (X<=Y)David Majnemer2016-02-011-1/+1
| | | | | | | | | This miscompile came about because we tried to use a transform which was only appropriate for xor operators when addition was present. This fixes PR26407. llvm-svn: 259375
OpenPOWER on IntegriCloud