summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Transforms
Commit message (Collapse)AuthorAgeFilesLines
...
* [InstCombine] Use less bitwise operations to handle Instruction::SExt in ↵Craig Topper2017-05-241-19/+14
| | | | | | | | | | | | SimplifyDemandedUseBits. Other improvements. The current code created a NewBits mask and used it as a mask several times. One of them just before a call to trunc making it unnecessary. A call to getActiveBits can get us the same information for the case. We also ORed with this mask later when we should have just sign extended the known bits. We also called trunc on the guaranteed to be zero KnownZeros/Ones masks entering this code. Creating appropriately sized temporary APInts is probably better. Differential Revision: https://reviews.llvm.org/D32098 llvm-svn: 303779
* [ValueTracking] Convert most of the calls to computeKnownBits to use the ↵Craig Topper2017-05-249-48/+21
| | | | | | | | | | version that returns the KnownBits object. This continues the changes started when computeSignBit was replaced with this new version of computeKnowBits. Differential Revision: https://reviews.llvm.org/D33431 llvm-svn: 303773
* [LV] Update type in cost model for scalarizationMatthew Simpson2017-05-241-6/+15
| | | | | | | | | | | | | | For non-uniform instructions marked for scalarization, we should update `VectorTy` when computing instruction costs to reflect the scalar type. In addition to determining instruction costs, this type is also used to signal that all instructions in the loop will be scalarized. This currently affects memory instructions and non-pointer induction variables and their updates. (We also mark GEPs scalar after vectorization, but their cost is computed together with memory instructions.) For scalarized induction updates, this patch also scales the scalar cost by the vectorization factor, corresponding to each induction step. llvm-svn: 303763
* [LoopVectorizer] Let target prefer scalar addressing computations.Jonas Paulsson2017-05-241-0/+74
| | | | | | | | | | | | | | | | | | | | | | The loop vectorizer usually vectorizes any instruction it can and then extracts the elements for a scalarized use. On SystemZ, all elements containing addresses must be extracted into address registers (GRs). Since this extraction is not free, it is better to have the address in a suitable register to begin with. By forcing address arithmetic instructions and loads of addresses to be scalar after vectorization, two benefits result: * No need to extract the register * LSR optimizations trigger (LSR isn't handling vector addresses currently) Benchmarking show improvements on SystemZ with this new behaviour. Any other target could try this by returning false in the new hook prefersVectorizedAddressing(). Review: Renato Golin, Elena Demikhovsky, Ulrich Weigand https://reviews.llvm.org/D32422 llvm-svn: 303744
* [NewGVN] Update additionalUsers when we simplify to a value.Davide Italiano2017-05-241-0/+4
| | | | | | | | | | Otherwise we don't revisit an instruction that could be simplified, and when we verify, we discover there's something that changed, i.e. what we had wasn't a maximal fixpoint. Fixes PR32836. llvm-svn: 303715
* Revert "Disable coverage opt-out for strong postdominator blocks."George Karpenkov2017-05-241-2/+22
| | | | | | | This reverts commit 2ed06f05fc10869dd1239cff96fcdea2ee8bf4ef. Buildbots do not like this on Linux. llvm-svn: 303710
* [SCCP] Use the `hasAddressTaken()` version defined in `Function`.Davide Italiano2017-05-231-1/+2
| | | | | | | | | | Instead of using the SCCP homegrown one. We should eventually make the private SCCP version disappear, but that wont' be today. PR33143 tracks this issue. Add braces for consistency while here. No functional change intended. llvm-svn: 303706
* [LIR] Use the newly `getRecurrenceVar()` helper. NFCI.Davide Italiano2017-05-231-4/+4
| | | | llvm-svn: 303704
* [LIR] Strengthen the check for recurrence variable in popcnt/CTLZ.Davide Italiano2017-05-231-9/+16
| | | | | | | Fixes PR33114. Differential Revision: https://reviews.llvm.org/D33420 llvm-svn: 303700
* Disable coverage opt-out for strong postdominator blocks.George Karpenkov2017-05-231-22/+2
| | | | | | | | | | | | | | | | Coverage instrumentation has an optimization not to instrument extra blocks, if the pass is already "accounted for" by a successor/predecessor basic block. However (https://github.com/google/sanitizers/issues/783) this reasoning may become circular, which stops valid paths from having coverage. In the worst case this can cause fuzzing to stop working entirely. This change simplifies logic to something which trivially can not have such circular reasoning, as losing valid paths does not seem like a good trade-off for a ~15% decrease in the # of instrumented basic blocks. llvm-svn: 303698
* [InstCombine] allow icmp-xor folds for vectors (PR33138)Sanjay Patel2017-05-231-5/+9
| | | | | | | | | This fixes the first part of: https://bugs.llvm.org/show_bug.cgi?id=33138 More work is needed for the bitcasted variant. llvm-svn: 303660
* [IR] Switch AttributeList to use an array for O(1) accessReid Kleckner2017-05-231-4/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Before this change, AttributeLists stored a pair of index and AttributeSet. This is memory efficient if most arguments do not have attributes. However, it requires doing a search over the pairs to test an argument or function attribute. Profiling shows that this loop was 0.76% of the time in 'opt -O2' of sqlite3.c, because LLVM constantly tests values for nullability. This was worth about 2.5% of mid-level optimization cycles on the sqlite3 amalgamation. Here are the full perf results: https://reviews.llvm.org/P7995 Here are just the before and after cycle counts: ``` $ perf stat -r 5 ./opt_before -O2 sqlite3.bc -o /dev/null 13,274,181,184 cycles # 3.047 GHz ( +- 0.28% ) $ perf stat -r 5 ./opt_after -O2 sqlite3.bc -o /dev/null 12,906,927,263 cycles # 3.043 GHz ( +- 0.51% ) ``` This patch *does not* change the indices used to query attributes, as requested by reviewers. Tracking whether an index is usable for array indexing is a huge pain that affects many of the internal APIs, so it would be good to come back later and do a cleanup to remove this internal adjustment. Reviewers: pete, chandlerc Subscribers: javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D32819 llvm-svn: 303654
* [JumpThreading] Safely replace uses of conditionAnna Thomas2017-05-232-2/+57
| | | | | | | | | | | | | | | | | | | | | | This patch builds over https://reviews.llvm.org/rL303349 and replaces the use of the condition only if it is safe to do so. We should not blindly RAUW the condition if experimental.guard or assume is a use of that condition. This is because LVI may have used the guard/assume to identify the value of the condition, and RUAWing will fold the guard/assume and uses before the guards/assumes. Reviewers: sanjoy, reames, trentxintong, mkazantsev Reviewed by: sanjoy, reames Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D33257 llvm-svn: 303633
* [KnownBits] Use !hasConflict() in asserts in place of Zero & One == 0 or ↵Craig Topper2017-05-231-16/+16
| | | | | | similar. NFC llvm-svn: 303614
* [LV] Report multiple reasons for not vectorizing under allowExtraAnalysisAyal Zaks2017-05-231-20/+42
| | | | | | | | | | | | | | | | | | | | The default behavior of -Rpass-analysis=loop-vectorizer is to report only the first reason encountered for not vectorizing, if one is found, at which time the vectorizer aborts its handling of the loop. This patch allows multiple reasons for not vectorizing to be identified and reported, at the potential expense of additional compile-time, under allowExtraAnalysis which can currently be turned on by Clang's -fsave-optimization-record and opt's -pass-remarks-missed. Removed from LoopVectorizationLegality::canVectorize() the redundant checking and reporting if we CantComputeNumberOfIterations, as LAI::canAnalyzeLoop() also does that. This redundancy is caught by a lit test once multiple reasons are reported. Patch initially developed by Dror Barak. Differential Revision: https://reviews.llvm.org/D33396 llvm-svn: 303613
* Fix update VP metadata after inlining for instrumentation PGOTeresa Johnson2017-05-221-3/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: With instrumentation profiling, when updating the VP metadata after an inline, VP metadata on the inlined copy was inadvertantly having all counts zeroed out. This was causing indirect calls from code inlined during the call step to be marked as cold in the ThinLTO summaries and not imported. The CallerBFI needs to be passed down so that the CallSiteCount can be computed from the profile summary info. With Sample PGO this was working since the count is extracted from the branch weight metadata on the call being inlined (even before we stopped looking at metadata for non-sample PGO in r302844 this largely wasn't working for instrumentation PGO since only promoted indirect calls would be getting inlined and have the metadata). Added an instrumentation PGO test and renamed the sample PGO test. Reviewers: danielcdh, eraman Subscribers: mehdi_amini, llvm-commits Differential Revision: https://reviews.llvm.org/D33389 llvm-svn: 303574
* [PartialInlining] Add internal options to enable partial inlining in pass ↵Xinliang David Li2017-05-221-2/+8
| | | | | | | | | | | pipeline (off by default) 1. Legacy: -mllvm -enable-partial-inlining 2. New: -mllvm -enable-npm-partial-inlining -fexperimental-new-pass-manager Differential Revision: http://reviews.llvm.org/D33382 llvm-svn: 303567
* [LoopPredication] NFC. Add extra debug output in case we fail to parse the ↵Artur Pilipenko2017-05-221-1/+3
| | | | | | range check llvm-svn: 303544
* [LoopPredication] NFC. Move a nested struct declaration before the fields, ↵Artur Pilipenko2017-05-221-7/+9
| | | | | | | | clang-format a bit This will simplify the diff for an upcoming review. llvm-svn: 303543
* [InstCombine] Cleanup the interface for overflow checksCraig Topper2017-05-224-39/+50
| | | | | | | | | | | | | | | | | | Summary: Fix naming conventions and const correctness. This completes the changes made in rL303029. Patch by Yoav Ben-Shalom. Reviewers: craig.topper Reviewed By: craig.topper Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D33377 llvm-svn: 303529
* [SimplifyCFG] Prevent a few APInt copies on method calls that return const ↵Craig Topper2017-05-221-2/+2
| | | | | | reference. NFCI llvm-svn: 303523
* [KnownBits] Use isNegative/isNonNegative to shorten some code. NFCCraig Topper2017-05-221-2/+2
| | | | llvm-svn: 303522
* NewGVN: Fix PR 33116, the memoryphi version of bug 32838.Daniel Berlin2017-05-211-6/+5
| | | | llvm-svn: 303521
* NewGVN: Cleanup some repeated code using some templated helpersDaniel Berlin2017-05-211-40/+40
| | | | llvm-svn: 303520
* NewGVN: Fix printing of simplified expressionDaniel Berlin2017-05-211-1/+1
| | | | llvm-svn: 303519
* [InstCombine] Take in account the size in sext->lshr->trunc patterns.Davide Italiano2017-05-211-6/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | Otherwise we end up miscompiling, transforming: define i8 @tinky() { %sext = sext i1 1 to i16 %hibit = lshr i16 %sext, 15 %tr = trunc i16 %hibit to i8 ret i8 %tr } into: %sext = sext i1 1 to i8 ret i8 %sext and the first get folded to ret i8 1, while the second gets folded to ret i8 -1. Eventually we should get rid of this transform entirely, but for now, this at least fixes a know correctness bug. Differential Revision: https://reviews.llvm.org/D33338 llvm-svn: 303513
* Revert "Add pthread_self function prototype and make it speculatable."Xin Tong2017-05-211-12/+0
| | | | | | | | This reverts commit 143d7445b5dfa2f6d6c45bdbe0433d9fc531be21. Build breaking llvm-svn: 303496
* Add pthread_self function prototype and make it speculatable.Xin Tong2017-05-201-0/+12
| | | | | | | | | | | | | | Summary: This allows pthread_self to be pulled out of a loop by LICM. Reviewers: hfinkel, arsenm, davide Reviewed By: davide Subscribers: davide, wdng, llvm-commits Differential Revision: https://reviews.llvm.org/D32782 llvm-svn: 303495
* [NewGVN] Create a StoreExpression instead of a VariableExpression.Davide Italiano2017-05-201-1/+1
| | | | | | | | | | | | | In the case where we have an operand defined by a lod of the same memory location. Historically this was a VariableExpression because we wanted to make sure they ended up in the same class, but if we create the right expression, they end up in the same class anyway. Fixes PR32897. Thanks to Dan for the detailed discussion and the fix suggestion. llvm-svn: 303475
* [NewGVN] Get rid of an assertion.Davide Italiano2017-05-201-1/+0
| | | | | | | | | | This was here because we don't want to switch leaders too much, in order to avoid fixpoint(ing) issue, but it's not sure if it matters in practice. A first step towards fixing PR32897. llvm-svn: 303473
* Revert "ThinLTO: Verify bitcode before lauching the ThinLTOCodeGenerator."Adrian Prantl2017-05-191-4/+1
| | | | | | This reverts commit r303438 while deliberating buildbot breakage. llvm-svn: 303467
* SimplifyLibCalls: Optimize wcslenMatthias Braun2017-05-191-28/+54
| | | | | | | | | | | | | | | | | | | | | | | | | | Refactor the strlen optimization code to work for both strlen and wcslen. This especially helps with programs in the wild where people pass L"string"s to const std::wstring& function parameters and the wstring constructor gets inlined. This also fixes a lingerind API problem/bug in getConstantStringInfo() where zeroinitializers would always give you an empty string (without a length) back regardless of the actual length of the initializer which did not work well in the TrimAtNul==false causing the PR mentioned below. Note that the fixed getConstantStringInfo() needed fixes to SelectionDAG memcpy lowering and may lead to some cases for out-of-bounds zeroinitializer accesses not getting optimized anymore. So some code with UB may produce out of bound memory reads now instead of just producing zeros. The refactoring "accidentally" fixes http://llvm.org/PR32124 Differential Revision: https://reviews.llvm.org/D32839 llvm-svn: 303461
* NewGVN: Fix PR32838.Daniel Berlin2017-05-191-22/+53
| | | | | | | | | | This is a complicated bug involving two issues: 1. What do we do with phi nodes when we prove all arguments are not live? 2. When is it safe to use value leaders to determine if we can ignore an argumnet? llvm-svn: 303453
* Last of the major pieces to NewGVN - yay!Daniel Berlin2017-05-191-117/+525
| | | | | | | | | | | | | | | | | | | | | | Summary: NewGVN: Handle equivalence between phi of ops and op of phis. This makes our GVN mostly-complete. It would be complete, modulo some deliberate choices we make. This means it detects roughly all herband equivalences in polynomial time, including cases notoriously hard for other GVN's to detect. It also detects a very large swath of the cases we currently rely on instcombine to detect that involve folding upwards through phis. Fixes PR 31125, 31463, PR 31868 Reviewers: davide Subscribers: Prazek, llvm-commits Differential Revision: https://reviews.llvm.org/D32151 llvm-svn: 303444
* NewGVN: Get rid of most dominating leader checkDaniel Berlin2017-05-191-24/+2
| | | | llvm-svn: 303443
* [NFC][loopIdiom] Clang format change rL303434Anna Thomas2017-05-191-3/+5
| | | | llvm-svn: 303439
* ThinLTO: Verify bitcode before lauching the ThinLTOCodeGenerator.Adrian Prantl2017-05-191-1/+4
| | | | | | | | rdar://problem/31233625 Differential Revision: https://reviews.llvm.org/D33151 llvm-svn: 303438
* [LoopIdiom] Refactor return value of isLegalStore [NFC]Anna Thomas2017-05-191-31/+38
| | | | | | | | | | | | | | | | | | | | | | | Summary: This NFC simply refactors the return value of LoopIdiomRecognize::isLegalStore() from bool to an enumeration, and removes the return-through-parameter mechanism that the function was using. This function is constructed such that it will only ever recognize a single store idiom (memset, memset_pattern, or memcpy), and never a combination of these. As such it makes much more sense for the return value to be the single idiom that the store matches, rather than having a separate argument-return for each idiom -- it's cleaner, and makes it clearer that only a single idiom can be matched. Patch by Daniel Neilson! Reviewers: anna, sanjoy, davide, haicheng Reviewed By: anna, haicheng Subscribers: haicheng, mzolotukhin, llvm-commits Differential Revision: https://reviews.llvm.org/D33359 llvm-svn: 303434
* [LoopPredication] NFC. Extract LoopICmp struct and parseLoopICmp helperArtur Pilipenko2017-05-191-18/+44
| | | | llvm-svn: 303427
* [LoopPredication] NFC. Extract LoopPredication::expandCheck helperArtur Pilipenko2017-05-191-5/+16
| | | | llvm-svn: 303426
* [LoopPredication] NFC. Extract CanExpand helper lambdaArtur Pilipenko2017-05-191-2/+6
| | | | llvm-svn: 303425
* [LoopPredication] NFC. Add an early exit if there is no guards in the loopArtur Pilipenko2017-05-191-0/+3
| | | | llvm-svn: 303424
* Fix vector pass-through value being unused in IRBuilder::CreateMaskedGatherAmara Emerson2017-05-191-1/+1
| | | | | | Also s/0/nullptr in the call site in LV. llvm-svn: 303416
* [NewGVN] Delete the old store when we find congruent to a load.Davide Italiano2017-05-191-2/+2
| | | | | | | (or non-store, more in general). Fixes PR33086. Caught by the store verifier. llvm-svn: 303406
* [NewGVN] Break infinite recursion in singleReachablePHIPath().Davide Italiano2017-05-181-6/+20
| | | | | | | | | | | | We can have cycles between PHIs and this causes singleReachablePhi() to call itself indefintely (until we run out of stack). The proper solution would be that of computing SCCs, but it's not worth for now, so just keep a visited set and give up when we find a cycle. Thanks to Dan for the discussion/help with this. Fixes PR33014. llvm-svn: 303393
* [NewGVN] Replace predicate info leftovers.Davide Italiano2017-05-181-0/+6
| | | | | | | This time with an additional fix, i.e. we remove the dead @llvm.ssa.copy instruction. llvm-svn: 303385
* [InstCombine] add helper to foldXorOfICmps(); NFCISanjay Patel2017-05-182-41/+48
| | | | | | | | | | | Also, fix the old-style capitalization of the related functions and move them to the 'private' section of the class since they are just helpers of the visit* functions. As shown in the post-commit comments for D32143, we are missing folds for xor-of-icmps. llvm-svn: 303381
* [IR] De-virtualize ~Value to save a vptrReid Kleckner2017-05-189-10/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Implements PR889 Removing the virtual table pointer from Value saves 1% of RSS when doing LTO of llc on Linux. The impact on time was positive, but too noisy to conclusively say that performance improved. Here is a link to the spreadsheet with the original data: https://docs.google.com/spreadsheets/d/1F4FHir0qYnV0MEp2sYYp_BuvnJgWlWPhWOwZ6LbW7W4/edit?usp=sharing This change makes it invalid to directly delete a Value, User, or Instruction pointer. Instead, such code can be rewritten to a null check and a call Value::deleteValue(). Value objects tend to have their lifetimes managed through iplist, so for the most part, this isn't a big deal. However, there are some places where LLVM deletes values, and those places had to be migrated to deleteValue. I have also created llvm::unique_value, which has a custom deleter, so it can be used in place of std::unique_ptr<Value>. I had to add the "DerivedUser" Deleter escape hatch for MemorySSA, which derives from User outside of lib/IR. Code in IR cannot include MemorySSA headers or call the MemoryAccess object destructors without introducing a circular dependency, so we need some level of indirection. Unfortunately, no class derived from User may have any virtual methods, because adding a virtual method would break User::getHungOffOperands(), which assumes that it can find the use list immediately prior to the User object. I've added a static_assert to the appropriate OperandTraits templates to help people avoid this trap. Reviewers: chandlerc, mehdi_amini, pete, dberlin, george.burgess.iv Reviewed By: chandlerc Subscribers: krytarowski, eraman, george.burgess.iv, mzolotukhin, Prazek, nlewycky, hans, inglorion, pcc, tejohnson, dberlin, llvm-commits Differential Revision: https://reviews.llvm.org/D31261 llvm-svn: 303362
* [LSR] Call canonicalize after we generate a new Formula in ↵Wei Mi2017-05-181-0/+1
| | | | | | | | | | | | | | GenerateTruncates. Fix PR33077. The testcase in PR33077 generates a LSR Use Formula with two SCEVAddRecExprs for the same loop. Such uncommon formula will become non-canonical after GenerateTruncates adds sign extension to the ScaledReg of the Formula, and it will break the assertion that every Formula to be inserted is canonical. The fix is to call canonicalize for the raw Formula generated by GenerateTruncates before inserting it. llvm-svn: 303361
* [JumpThreading] Dont RAUW condition incorrectlyAnna Thomas2017-05-181-17/+14
| | | | | | | | | | | | | | | | | | | Summary: We have a bug when RAUWing the condition if experimental.guard or assumes is a use of that condition. This is because LazyValueInfo may have used the guards/assumes to identify the value of the condition at the end of the block. RAUW replaces the uses at the guard/assume as well as uses before the guard/assume. Both of these are incorrect. For now, disable RAUW for conditions and fix the logic as a next step: https://reviews.llvm.org/D33257 Reviewers: sanjoy, reames, trentxintong Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D33279 llvm-svn: 303349
OpenPOWER on IntegriCloud