summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Transforms
Commit message (Collapse)AuthorAgeFilesLines
* [NFC] Fix variables used only for assert in GVNMax Kazantsev2017-10-111-6/+6
| | | | llvm-svn: 315448
* [GVN] Prevent LoadPRE from hoisting across instructions that don't pass ↵Max Kazantsev2017-10-111-0/+77
| | | | | | | | | | | | | | | | | | control flow to successors This patch fixes the miscompile that happens when PRE hoists loads across guards and other instructions that don't always pass control flow to their successors. PRE is now prohibited to hoist across such instructions because there is no guarantee that the load standing after such instruction is still valid before such instruction. For example, a load from under a guard may be invalid before the guard in the following case: int array[LEN]; ... guard(0 <= index && index < LEN); use(array[index]); Differential Revision: https://reviews.llvm.org/D37460 llvm-svn: 315440
* [LICM] Disallow sinking of unordered atomic loads into loopsMax Kazantsev2017-10-111-4/+10
| | | | | | | | | | | | | | | | | | | | | | | | Sinking of unordered atomic load into loop must be disallowed because it turns a single load into multiple loads. The relevant section of the documentation is: http://llvm.org/docs/Atomics.html#unordered, specifically the Notes for Optimizers section. Here is the full text of this section: > Notes for optimizers > In terms of the optimizer, this **prohibits any transformation that > transforms a single load into multiple loads**, transforms a store into > multiple stores, narrows a store, or stores a value which would not be > stored otherwise. Some examples of unsafe optimizations are narrowing > an assignment into a bitfield, rematerializing a load, and turning loads > and stores into a memcpy call. Reordering unordered operations is safe, > though, and optimizers should take advantage of that because unordered > operations are common in languages that need them. Patch by Daniil Suchkov! Reviewed By: reames Differential Revision: https://reviews.llvm.org/D38392 llvm-svn: 315438
* [IRCE] Do not process empty safe rangesMax Kazantsev2017-10-111-3/+13
| | | | | | | | | | | | | | | IRCE should not apply when the safe iteration range is proved to be empty. In this case we do unneeded job creating pre/post loops and then never go to the main loop. This patch makes IRCE not apply to empty safe ranges, adds test for this situation and also modifies one of existing tests where it used to happen slightly. Reviewed By: anna Differential Revision: https://reviews.llvm.org/D38577 llvm-svn: 315437
* [GVN] Don't replace constants with constants.Davide Italiano2017-10-111-0/+5
| | | | | | | | This fixes PR34908. Patch by Alex Crichton! Differential Revision: https://reviews.llvm.org/D38765 llvm-svn: 315429
* [Transforms] Fix some Clang-tidy modernize and Include What You Use ↵Eugene Zelenko2017-10-107-120/+252
| | | | | | warnings; other minor fixes (NFC). llvm-svn: 315383
* Use the first instruction's count to estimate the funciton's entry frequency.Dehao Chen2017-10-101-10/+19
| | | | | | | | | | | | | | Summary: In the current implementation, we only have accurate profile count for standalone symbols. For inlined functions, we do not have entry count data because it's not available in LBR. In this patch, we use the first instruction's frequency to estimiate the function's entry count, especially for inlined functions. This may be inaccurate due to debug info in optimized code. However, this is a better estimate than the static 80/20 estimation we have in the current implementation. Reviewers: tejohnson, davidxl Reviewed By: tejohnson Subscribers: sanjoy, llvm-commits, aprantl Differential Revision: https://reviews.llvm.org/D38478 llvm-svn: 315369
* Revert "[SCCP] Propagate integer range info for parameters in IPSCCP."Bruno Cardoso Lopes2017-10-101-90/+8
| | | | | | | | | This reverts commit r315288. This is part of fixing segfault introduced in: http://green.lab.llvm.org/green/job/clang-stage2-configure-Rlto/21675/ llvm-svn: 315329
* Revert "[SCCP] Fix mem-sanitizer failure introduced by r315288."Bruno Cardoso Lopes2017-10-101-4/+2
| | | | | | | This reverts commit r315294. Part of fixing seg fault introduced in: http://green.lab.llvm.org/green/job/clang-stage2-configure-Rlto/21675/ llvm-svn: 315328
* [SCCP] Fix mem-sanitizer failure introduced by r315288.Florian Hahn2017-10-101-2/+4
| | | | llvm-svn: 315294
* [SCCP] Propagate integer range info for parameters in IPSCCP.Florian Hahn2017-10-101-8/+90
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This updates the SCCP solver to use of the ValueElement lattice for parameters, which provides integer range information. The range information is used to remove unneeded icmp instructions. For the following function, f() can be optimized to `ret i32 2` with this change source_filename = "sccp.c" target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128" target triple = "x86_64-unknown-linux-gnu" ; Function Attrs: norecurse nounwind readnone uwtable define i32 @main() local_unnamed_addr #0 { entry: %call = tail call fastcc i32 @f(i32 1) %call1 = tail call fastcc i32 @f(i32 47) %add3 = add nsw i32 %call, %call1 ret i32 %add3 } ; Function Attrs: noinline norecurse nounwind readnone uwtable define internal fastcc i32 @f(i32 %x) unnamed_addr #1 { entry: %c1 = icmp sle i32 %x, 100 %cmp = icmp sgt i32 %x, 300 %. = select i1 %cmp, i32 1, i32 2 ret i32 %. } attributes #1 = { noinline } Reviewers: davide, sanjoy, efriedma, dberlin Reviewed By: davide, dberlin Subscribers: mcrosier, gberry, mssimpso, dberlin, llvm-commits Differential Revision: https://reviews.llvm.org/D36656 llvm-svn: 315288
* Re-land "[MergeICmps] Disable mergeicmps if the target does not want to ↵Clement Courbet2017-10-101-12/+36
| | | | | | | | | | handle memcmp expansion." (fixed stability issues) This reverts commit d6492333d3b478a1d88163315002022f8d5e58dc. llvm-svn: 315281
* Renable r314928Xinliang David Li2017-10-102-0/+239
| | | | | | | | | | | Eliminate inttype phi with inttoptr/ptrtoint. This version fixed a bug in finding the matching phi -- the order of the incoming blocks may be different (triggered in self build on Windows). A new test case is added. llvm-svn: 315272
* Rename OptimizationDiagnosticInfo.* to OptimizationRemarkEmitter.*Adam Nemet2017-10-0919-19/+19
| | | | | | | Sync it up with the name of the class actually defined here. This has been bothering me for a while... llvm-svn: 315249
* [InstCombine] fix formatting; NFCSanjay Patel2017-10-091-9/+7
| | | | llvm-svn: 315223
* [InstCombine] use correct type when propagating constant condition in ↵Sanjay Patel2017-10-061-2/+3
| | | | | | simplifyDivRemOfSelectWithZeroOp (PR34856) llvm-svn: 315130
* [InstCombine] rename SimplifyDivRemOfSelect to be clearer, add comments, ↵Sanjay Patel2017-10-062-20/+20
| | | | | | | | | simplify code; NFCI There's at least one bug here - this code can fail with vector types (PR34856). It's also being called for FREM; I'm still trying to understand how that is valid. llvm-svn: 315127
* Revert "Roll forward r314928"Reid Kleckner2017-10-062-238/+0
| | | | | | | | | | | | | | | | | | | | This appears to be miscompiling Clang, as shown on two Windows bootstrap bots: http://lab.llvm.org:8011/builders/clang-x86-windows-msvc2015/builds/7611 http://lab.llvm.org:8011/builders/clang-x64-ninja-win7/builds/6870 Nothing else is in the blame list. Both emit errors on this valid code in the Windows ucrt headers: C:\...\ucrt\malloc.h:95:32: error: invalid operands to binary expression ('char *' and 'int') _Ptr = (char*)_Ptr + _ALLOCA_S_MARKER_SIZE; ~~~~~~~~~~~ ^ ~~~~~~~~~~~~~~~~~~~~~ I am attempting to reproduce this now. This reverts r315044 llvm-svn: 315108
* Directly return promoted direct call instead of rely on stripPointerCast.Dehao Chen2017-10-062-11/+15
| | | | | | | | | | | | | | Summary: stripPointerCast is not reliably returning the value that's being type-casted. Instead it may look further at function attributes to further propagate the value. Instead of relying on stripPOintercast, the more reliable solution is to directly use the pointer to the promoted direct call. Reviewers: tejohnson, davidxl Reviewed By: tejohnson Subscribers: llvm-commits, sanjoy Differential Revision: https://reviews.llvm.org/D38603 llvm-svn: 315077
* Revert "[MergeICmps] Disable mergeicmps if the target does not want to ↵Clement Courbet2017-10-061-36/+12
| | | | | | | | | | handle memcmp expansion." Still a few stability issues on windows. This reverts commit 67e3db9bc121ba244e20337aabc7cf341a62b545. llvm-svn: 315058
* Re-land "[MergeICmps] Disable mergeicmps if the target does not want to ↵Clement Courbet2017-10-061-12/+36
| | | | | | | | | | handle memcmp expansion." (fixed unit tests by making comparisons stable) This reverts commit 1b2d359ce256fd6737da4e93833346a0bd6d7583. llvm-svn: 315056
* Roll forward r314928Xinliang David Li2017-10-062-0/+238
| | | | | | | Fixed ThinLTO bootstrap failure : track new bitcast per incomingVal. Added new tests. llvm-svn: 315044
* [PM] Retire disable unit-at-a-time switch.Davide Italiano2017-10-061-33/+24
| | | | | | | | | | | | This is a vestige from the GCC-3 days, which disables IPO passes when set. I don't think anybody actually uses it as there are several IPO passes which still run with this flag set and nobody complained/noticed. This reduces the delta between current and new pass manager and allows us to easily review the difference when we decide to flip the switch (or audit which passes should run, FWIW). llvm-svn: 315043
* [CodeExtractor] Fix multiple bugs under certain shape of extracted regionJakub Kuderski2017-10-061-77/+31
| | | | | | | | | | | | | | | | Summary: If the extracted region has multiple exported data flows toward the same BB which is not included in the region, correct resotre instructions and PHI nodes won't be generated inside the exitStub. The solution is simply put the restore instructions right after the definition of output values instead of putting in exitStub. Unittest for this bug is included. Author: myhsu Reviewers: chandlerc, davide, lattner, silvas, davidxl, wmi, kuhar Subscribers: dberlin, kuhar, mgorny, llvm-commits Differential Revision: https://reviews.llvm.org/D37902 llvm-svn: 315041
* NewGVN: Factor out duplicate parts of OpIsSafeForPHIOfOpsDaniel Berlin2017-10-061-45/+31
| | | | llvm-svn: 315040
* ModuleUtils: Stop using comdat members to generate unique module ids.Peter Collingbourne2017-10-051-1/+1
| | | | | | | | | | | It is possible for two modules to define the same set of external symbols without causing a duplicate symbol error at link time, as long as each of the symbols is a comdat member. So we cannot use them as part of a unique id for the module. Differential Revision: https://reviews.llvm.org/D38602 llvm-svn: 315026
* [InstCombine] improve folds for icmp gt/lt (shr X, C1), C2Sanjay Patel2017-10-051-37/+40
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We can always eliminate the shift in: icmp gt/lt (shr X, C1), C2 --> icmp gt/lt X, C' This patch was supposed to just be an efficiency improvement because we were doing this 3-step process to fold: IC: Visiting: %c = icmp ugt i4 %s, 1 IC: ADD: %s = lshr i4 %x, 1 IC: ADD: %1 = udiv i4 %x, 2 IC: Old = %c = icmp ugt i4 %1, 1 New = <badref> = icmp uge i4 %x, 4 IC: ADD: %c = icmp uge i4 %x, 4 IC: ERASE %2 = icmp ugt i4 %1, 1 IC: Visiting: %c = icmp uge i4 %x, 4 IC: Old = %c = icmp uge i4 %x, 4 New = <badref> = icmp ugt i4 %x, 3 IC: ADD: %c = icmp ugt i4 %x, 3 IC: ERASE %2 = icmp uge i4 %x, 4 IC: Visiting: %c = icmp ugt i4 %x, 3 IC: DCE: %1 = udiv i4 %x, 2 IC: ERASE %1 = udiv i4 %x, 2 IC: DCE: %s = lshr i4 %x, 1 IC: ERASE %s = lshr i4 %x, 1 IC: Visiting: ret i1 %c When we could go directly to canonical icmp form: IC: Visiting: %c = icmp ugt i4 %s, 1 IC: Old = %c = icmp ugt i4 %s, 1 New = <badref> = icmp ugt i4 %x, 3 IC: ADD: %c = icmp ugt i4 %x, 3 IC: ERASE %1 = icmp ugt i4 %s, 1 IC: ADD: %s = lshr i4 %x, 1 IC: DCE: %s = lshr i4 %x, 1 IC: ERASE %s = lshr i4 %x, 1 IC: Visiting: %c = icmp ugt i4 %x, 3 ...but then I noticed that the folds were incomplete too: https://godbolt.org/g/aB2hLE Here are attempts to prove the logic with Alive: https://rise4fun.com/Alive/92o Name: lshr_ult Pre: ((C2 << C1) u>> C1) == C2 %sh = lshr i8 %x, C1 %r = icmp ult i8 %sh, C2 => %r = icmp ult i8 %x, (C2 << C1) Name: ashr_slt Pre: ((C2 << C1) >> C1) == C2 %sh = ashr i8 %x, C1 %r = icmp slt i8 %sh, C2 => %r = icmp slt i8 %x, (C2 << C1) Name: lshr_ugt Pre: (((C2+1) << C1) u>> C1) == (C2+1) %sh = lshr i8 %x, C1 %r = icmp ugt i8 %sh, C2 => %r = icmp ugt i8 %x, ((C2+1) << C1) - 1 Name: ashr_sgt Pre: (C2 != 127) && ((C2+1) << C1 != -128) && (((C2+1) << C1) >> C1) == (C2+1) %sh = ashr i8 %x, C1 %r = icmp sgt i8 %sh, C2 => %r = icmp sgt i8 %x, ((C2+1) << C1) - 1 Name: ashr_exact_sgt Pre: ((C2 << C1) >> C1) == C2 %sh = ashr exact i8 %x, C1 %r = icmp sgt i8 %sh, C2 => %r = icmp sgt i8 %x, (C2 << C1) Name: ashr_exact_slt Pre: ((C2 << C1) >> C1) == C2 %sh = ashr exact i8 %x, C1 %r = icmp slt i8 %sh, C2 => %r = icmp slt i8 %x, (C2 << C1) Name: lshr_exact_ugt Pre: ((C2 << C1) u>> C1) == C2 %sh = lshr exact i8 %x, C1 %r = icmp ugt i8 %sh, C2 => %r = icmp ugt i8 %x, (C2 << C1) Name: lshr_exact_ult Pre: ((C2 << C1) u>> C1) == C2 %sh = lshr exact i8 %x, C1 %r = icmp ult i8 %sh, C2 => %r = icmp ult i8 %x, (C2 << C1) We did something similar for 'shl' in D28406. Differential Revision: https://reviews.llvm.org/D38514 llvm-svn: 315021
* Annotate VP prof on indirect call if it is ICPed in the profiled binary.Dehao Chen2017-10-051-1/+3
| | | | | | | | | | | | | | Summary: In SamplePGO, when an indirect call is promoted in the profiled binary, before profile annotation, it will be promoted and inlined. For the original indirect call, the current implementation will not mark VP profile on it. This is an issue when profile becomes stale. This patch annotates VP prof on indirect calls during annotation. Reviewers: tejohnson Reviewed By: tejohnson Subscribers: sanjoy, llvm-commits Differential Revision: https://reviews.llvm.org/D38477 llvm-svn: 315016
* [PassManager] Improve the interaction between -O2 and ThinLTO.Davide Italiano2017-10-051-13/+11
| | | | | | | Run GDCE slightly later so that we don't have to repeat it twice when preparing for Thin. Thanks to Mehdi for the suggestion. llvm-svn: 314999
* [PassManager] Run global optimizations after the inliner.Davide Italiano2017-10-051-0/+14
| | | | | | | | | | | | | | | The inliner performs some kind of dead code elimination as it goes, but there are cases that are not really caught by it. We might at some point consider teaching the inliner about them, but it is OK for now to run GlobalOpt + GlobalDCE in tandem as their benefits generally outweight the cost, making the whole pipeline faster. This fixes PR34652. Differential Revision: https://reviews.llvm.org/D38154 llvm-svn: 314997
* [LV] Fix PR34743 - handle casts that sink after interleaved loadsAyal Zaks2017-10-051-1/+4
| | | | | | | | | | | When ignoring a load that participates in an interleaved group, make sure to move a cast that needs to sink after it. Testcase derived from reproducer of PR34743. Differential Revision: https://reviews.llvm.org/D38338 llvm-svn: 314986
* Revert "Re-land "[MergeICmps] Disable mergeicmps if the target does not want ↵Clement Courbet2017-10-051-25/+10
| | | | | | | | | | to handle memcmp expansion.""" broken test on windows This reverts commit c91479518344fd1fc071c5bd5848f6eb83e53dca. llvm-svn: 314985
* revert r314698 - [InstCombine] remove one-use restriction for icmp (shr ↵Sanjay Patel2017-10-051-6/+6
| | | | | | | | | | | exact X, C1), C2 --> icmp X, (C2<<C1) There is a bot failure that appears to be related to this change: http://lab.llvm.org:8011/builders/clang-cmake-armv7-a15-selfhost-neon/builds/2117 ...so reverting to confirm that and attempting to keep the bot green while investigating. llvm-svn: 314984
* [LV] Fix PR34711 - widen instruction ranges when sinking castsAyal Zaks2017-10-051-23/+22
| | | | | | | | | | | | | | | | Instead of trying to keep LastWidenRecipe updated after creating each recipe, have tryToWiden() retrieve the last recipe of the current VPBasicBlock and check if it's a VPWidenRecipe when attempting to extend its range. This ensures that such extensions, optimized to maintain the original instruction order, do so only when the instructions are to maintain their relative order. The latter does not always hold, e.g., when a cast needs to sink to unravel first order recurrence (r306884). Testcase derived from reproducer of PR34711. Differential Revision: https://reviews.llvm.org/D38339 llvm-svn: 314981
* Re-land "[MergeICmps] Disable mergeicmps if the target does not want to ↵Clement Courbet2017-10-051-10/+25
| | | | | | handle memcmp expansion."" llvm-svn: 314980
* Revert "[MergeICmps] Disable mergeicmps if the target does not want to ↵Clement Courbet2017-10-051-15/+6
| | | | | | | | | | | handle memcmp expansion." Breaks clang-stage1-cmake-RA-incremental/llvm/test/Transforms/MergeICmps/X86/tuple-four-int8.ll This reverts commit 3038c459d67f8898ffa295d54a013b280690abfa. llvm-svn: 314972
* [InstCombine] Fix a vector splat handling bug in transformZExtICmp.Craig Topper2017-10-051-3/+1
| | | | | | | | | | We were using an i1 type and then zero extending to a vector. Instead just create the 0/1 directly as a ConstantInt with the correct type. No need to ask ConstantExpr to zero extend for us. This bug is a bit tricky to hit because it requires us to visit a zext of an icmp that would normally be simplified to true/false, but that icmp hasnt' been visited yet. In the test case this zext and icmp were created by visiting a udiv and due to worklist ordering we got to the zext first. Fixes PR34841. llvm-svn: 314971
* [MergeICmps] Disable mergeicmps if the target does not want to handle memcmp ↵Clement Courbet2017-10-051-6/+15
| | | | | | | | | | | | | | expansion. Summary: This is to avoid e.g. merging two cheap icmps if the target is not going to expand to something nice later. Reviewers: dberlin, spatel Subscribers: davide, nemanjai Differential Revision: https://reviews.llvm.org/D38232 llvm-svn: 314970
* Revert r314928 to investigate thinLTO bootstrap failureXinliang David Li2017-10-052-234/+0
| | | | llvm-svn: 314961
* [InstCombine] Improve support for ashr in foldICmpAndShiftCraig Topper2017-10-041-9/+12
| | | | | | | | We can support ashr similar to lshr, if we know that none of the shifted in bits are used. In that case SimplifyDemandedBits would normally convert it to lshr. But that conversion doesn't happen if the shift has additional users. Differential Revision: https://reviews.llvm.org/D38521 llvm-svn: 314945
* Fix a -Wparentheses warning. NFC.Hans Wennborg2017-10-041-1/+1
| | | | llvm-svn: 314936
* [LoopDeletion] Move deleteDeadLoop to to LoopUtils. NFCMarcello Maggioni2017-10-042-134/+124
| | | | llvm-svn: 314934
* [SimplifyCFG] put the optional assumption cache pointer in the options ↵Sanjay Patel2017-10-042-52/+45
| | | | | | | | | | | | struct; NFCI This is a follow-up to https://reviews.llvm.org/D38138. I fixed the capitalization of some functions because we're changing those lines anyway and that helped verify that we weren't accidentally dropping any options by using default param values. llvm-svn: 314930
* Recommit r314561 after fixing msan build failureXinliang David Li2017-10-042-0/+234
| | | | | | | (trial 2) Incoming val defined by terminator instruction which also requires bitcasts can not be handled. llvm-svn: 314928
* Recommit : Use the basic cost if a GEP is not used as addressing modeJun Bum Lim2017-10-042-2/+2
| | | | | | | | | | | | | | Recommitting r314517 with the fix for handling ConstantExpr. Original commit message: Currently, getGEPCost() returns TCC_FREE whenever a GEP is a legal addressing mode in the target. However, since it doesn't check its actual users, it will return FREE even in cases where the GEP cannot be folded away as a part of actual addressing mode. For example, if an user of the GEP is a call instruction taking the GEP as a parameter, then the GEP may not be folded in isel. llvm-svn: 314923
* [NFC] clang-format lib/Transforms/Scalar/MergeICmps.cppClement Courbet2017-10-041-39/+21
| | | | llvm-svn: 314906
* [IRCE] Temporarily disable unsigned latch conditions by defaultMax Kazantsev2017-10-041-0/+21
| | | | | | | | | | | | We have found some corner cases connected to range intersection where IRCE makes a bad thing when the latch condition is unsigned. The fix for that will go as a follow up. This patch temporarily disables IRCE for unsigned latch conditions until the issue is fixed. The unsigned latch conditions were introduced to IRCE by rL310027. Differential Revision: https://reviews.llvm.org/D38529 llvm-svn: 314881
* [InstCombine] Use isSignBitCheck to simplify an if statement. Directly ↵Craig Topper2017-10-031-12/+8
| | | | | | | | create new sign bit compares instead of manipulating the constant. NFCI Since we no longer had the direct constant compares, manipulating the constant seemeded less clear. llvm-svn: 314830
* Revert r314806 "[SLP] Vectorize jumbled memory loads."Hans Wennborg2017-10-031-185/+84
| | | | | | | | | | | | | | | | | | | | | | | | | All the buildbots are red, e.g. http://lab.llvm.org:8011/builders/clang-cmake-aarch64-lld/builds/2436/ > Summary: > This patch tries to vectorize loads of consecutive memory accesses, accessed > in non-consecutive or jumbled way. An earlier attempt was made with patch D26905 > which was reverted back due to some basic issue with representing the 'use mask' of > jumbled accesses. > > This patch fixes the mask representation by recording the 'use mask' in the usertree entry. > > Change-Id: I9fe7f5045f065d84c126fa307ef6ebe0787296df > > Reviewers: mkuper, loladiro, Ayal, zvi, danielcdh > > Reviewed By: Ayal > > Subscribers: hans, mzolotukhin > > Differential Revision: https://reviews.llvm.org/D36130 llvm-svn: 314824
* Revert the change that accidentally went in r314806.Dehao Chen2017-10-031-4/+0
| | | | llvm-svn: 314807
OpenPOWER on IntegriCloud