summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Transforms
Commit message (Collapse)AuthorAgeFilesLines
* [InstCombine] Improve the expansion in SimplifyUsingDistributiveLaws to ↵Craig Topper2017-07-152-38/+56
| | | | | | | | | | | | | | | | | | | handle cases where one side doesn't simplify, but the other side resolves to an identity value Summary: If one side simplifies to the identity value for inner opcode, we can replace the value with just the operation that can't be simplified. I've removed a couple now unneeded special cases in visitAnd and visitOr. There are probably other cases I missed. Reviewers: spatel, majnemer, hfinkel, dberlin Reviewed By: spatel Subscribers: grandinj, llvm-commits, spatel Differential Revision: https://reviews.llvm.org/D35451 llvm-svn: 308111
* [InstCombine] improve (1 << x) & 1 --> zext(x == 0) foldingSanjay Patel2017-07-151-15/+13
| | | | | | | 1. Add a one-use check to prevent increasing instruction count. 2. Generalize the pattern matching to include vector types. llvm-svn: 308105
* [InstCombine] allow (0 - x) & 1 --> x & 1 for vectorsSanjay Patel2017-07-151-6/+5
| | | | llvm-svn: 308098
* [InstCombine] remove dead code/tests; NFCISanjay Patel2017-07-151-11/+0
| | | | | | | These patterns and tests were added to InstSimplify with: https://reviews.llvm.org/rL303004 llvm-svn: 308096
* Revert r308078 (and subsequent tweak in r308079) which introduces a testChandler Carruth2017-07-151-14/+4
| | | | | | | | | that appears to exhibit non-determinism and is flaking on the bots pretty consistently. r308078: [ThinLTO] Ensure we always select the same function copy to import r308079: Require asserts in new test that uses debug flag llvm-svn: 308095
* [LoopInterchange] Add some optimization remarks.Florian Hahn2017-07-151-9/+110
| | | | | | | | | | | | Reviewers: anemet, karthikthecool, blitz.opensource Reviewed By: anemet Subscribers: mzolotukhin, llvm-commits Differential Revision: https://reviews.llvm.org/D35122 llvm-svn: 308094
* [SLPVectorizer] Add an extra parameter to tryScheduleBundle function, NFCI.Dinar Temirbulatov2017-07-151-6/+6
| | | | llvm-svn: 308081
* [ThinLTO] Ensure we always select the same function copy to importTeresa Johnson2017-07-151-4/+14
| | | | | | | | | | | | | | | | | | | Summary: Check if the first eligible callee is under the instruction threshold. Checking this on the first eligible callee ensures that we don't end up selecting different callees to import when we invoke this routine with different thresholds due to reaching the callee via paths that are shallower or hotter (when there are multiple copies, i.e. with weak or linkonce linkage). We don't want to leave the decision of which copy to import up to the backend. Reviewers: mehdi_amini Subscribers: inglorion, fhahn, llvm-commits Differential Revision: https://reviews.llvm.org/D35436 llvm-svn: 308078
* [EarlyCSE] Handle calls with no MemorySSA info.Geoff Berry2017-07-141-1/+15
| | | | | | | | | | | | | | | | | | Summary: When checking for memory dependencies between calls using MemorySSA, handle cases where the calls have no MemoryAccess associated with them because the AA analysis being used has determined that the call does not read/write memory. Fixes PR33756 Reviewers: dberlin, davide Subscribers: mcrosier, llvm-commits, Prazek Differential Revision: https://reviews.llvm.org/D35317 llvm-svn: 308051
* [JumpThreading] Add a pattern to TryToUnfoldSelectInCurrBB()Haicheng Wu2017-07-141-32/+50
| | | | | | | | | | | | | | | | Add the following pattern to TryToUnfoldSelectInCurrBB() bb: %p = phi [0, %bb1], [1, %bb2], [0, %bb3], [1, %bb4], ... %c = cmp %p, 0 %s = select %c, trueval, falseval The Select in the above pattern will be unfolded and then jump-threaded. The current implementation does not allow CMP in the middle of PHI and Select. Differential Revision: https://reviews.llvm.org/D34762 llvm-svn: 308050
* [Dominators] Make IsPostDominator a template parameterJakub Kuderski2017-07-141-4/+7
| | | | | | | | | | | | | | | | | Summary: DominatorTreeBase used to have IsPostDominators (bool) member to indicate if the tree is a dominator or a postdominator tree. This made it possible to switch between the two 'modes' at runtime, but it isn't used in practice anywhere. This patch makes IsPostDominator a template argument. This way, it is easier to switch between different algorithms at compile-time based on this argument and design external utilities around it. It also makes it impossible to incidentally assign a postdominator tree to a dominator tree (and vice versa), and to further simplify template code in GenericDominatorTreeConstruction. Reviewers: dberlin, sanjoy, davide, grosser Reviewed By: dberlin Subscribers: mzolotukhin, llvm-commits Differential Revision: https://reviews.llvm.org/D35315 llvm-svn: 308040
* [InstCombine] convert bitwise (in)equality checks to logical ops (PR32401)Sanjay Patel2017-07-141-3/+15
| | | | | | | | | | | | | As discussed in: https://bugs.llvm.org/show_bug.cgi?id=32401 we have a backend transform to undo this: https://reviews.llvm.org/rL299542 when it's likely that the xor version leads to better codegen, but we want this form in IR for better analysis and simplification potential. llvm-svn: 308031
* [IRCE] Fix corner case with Start = INT_MAXMax Kazantsev2017-07-141-5/+9
| | | | | | | | | | | | | | | | | | | | When iterating through loop for (int i = INT_MAX; i > 0; i--) We fail to generate the pre-loop for it. It happens because we use the overflown value in a comparison predicate when identifying whether or not we need it. In old logic, we used SLE predicate against Greatest value which exceeds all seen values of the IV and might be overflown. Now we use the GreatestSeen value of this IV with SLT predicate. Also added a test that ensures that a pre-loop is generated for such loops. Differential Revision: https://reviews.llvm.org/D35347 llvm-svn: 308001
* [SLPVectorizer] Add an extra parameter to alreadyVectorized function, NFCI.Dinar Temirbulatov2017-07-141-8/+8
| | | | llvm-svn: 307996
* Fix unused variable warning on EXPENSIVE_CHECKS release builds. NFCI.Simon Pilgrim2017-07-131-1/+1
| | | | llvm-svn: 307929
* Reapply [GlobalOpt] Remove unreachable blocks before optimizing a function.Davide Italiano2017-07-131-0/+18
| | | | | | | This commit reapplies r307215 now that we found out and fixed the cause of the cfi test failure (in r307871). llvm-svn: 307920
* [RuntimeUnrolling] Update DomTree correctly when exit blocks have successorsAnna Thomas2017-07-131-2/+28
| | | | | | | | | | | | | | | | Summary: When we runtime unroll with multiple exit blocks, we also need to update the immediate dominators of the immediate successors of the exit blocks. Reviewers: reames, mkuper, mzolotukhin, apilipenko Reviewed by: mzolotukhin Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D35304 llvm-svn: 307909
* [PGO] Enhance pgo counter promotionXinliang David Li2017-07-121-42/+115
| | | | | | | | | | | | | | | | | | | | | | | This is an incremental change to the promotion feature. There are two problems with the current behavior: 1) loops with multiple exiting blocks are totally disabled 2) a counter update can only be promoted one level up in the loop nest -- which does help much for short trip count inner loops inside a high trip-count outer loops. Due to this limitation, we still saw very large profile count fluctuations from run to run for the affected loops which are usually very hot. This patch adds the support for promotion counters iteratively across the loop nest. It also turns on the promotion for loops with multiple exiting blocks (with a limit). For single-threaded applications, the performance impact is flat on average. For instance, dealII improves, but povray regresses. llvm-svn: 307863
* [LoopUnrollRuntime] NFC: Refactored safety checks of unrolling multi-exit loopAnna Thomas2017-07-121-47/+58
| | | | | | | | | | | Refactored the code and separated out a function `canSafelyUnrollMultiExitLoop` to reduce redundant checks and make it easier to add profitability heuristics later. Added tests to runtime unrolling to make sure that unrolling for multi-exit loops is not done unless the option -unroll-runtime-multi-exit is true. llvm-svn: 307843
* Remove unneeded use of #undef DEBUG_TYPE. NFCSam Clegg2017-07-122-6/+12
| | | | | | | | | | | Where is is needed (at the end of headers that define it), be consistent about its use. Also fix a few header guards that I found in the process. Differential Revision: https://reviews.llvm.org/D34916 llvm-svn: 307840
* [LV] Don't allow outside uses of IVs if the SCEV is predicated on loop ↵Michael Kuperstein2017-07-121-2/+7
| | | | | | | | | conditions. This fixes PR33706. Differential Revision: https://reviews.llvm.org/D35227 llvm-svn: 307837
* [LoopRotate] Fix DomTree update logic for unreachable nodes. Fix PR33701.Jakub Kuderski2017-07-121-4/+16
| | | | | | | | | | | | | | | | | | | Summary: LoopRotate manually updates the DoomTree by iterating over all predecessors of a basic block and computing the Nearest Common Dominator. When a predecessor happens to be unreachable, `DT.findNearestCommonDominator` returns nullptr. This patch teaches LoopRotate to handle this case and fixes [[ https://bugs.llvm.org/show_bug.cgi?id=33701 | PR33701 ]]. In the future, LoopRotate should be taught to use the new incremental API for updating the DomTree. Reviewers: dberlin, davide, uabelho, grosser Subscribers: efriedma, mzolotukhin Differential Revision: https://reviews.llvm.org/D35074 llvm-svn: 307828
* LowerTypeTests: When importing functions skip definitions where the summary ↵Peter Collingbourne2017-07-121-3/+8
| | | | | | | | | | | | | | contains a decl. This normally indicates mixed CFI + non-CFI compilation, and will result in us treating the function in the same way as a function defined outside of the LTO unit. Part of PR33752. Differential Revision: https://reviews.llvm.org/D35281 llvm-svn: 307744
* [IPO] Temporarily rollback r307215.Davide Italiano2017-07-111-18/+0
| | | | | | | | | [GlobalOpt] Remove unreachable blocks before optimizing a function. While the change is presumably correct, it exposes a latent bug in DI which breaks on of the CFI checks. I'll analyze it further and try to understand what's going on. llvm-svn: 307729
* Enhance synchscope representationKonstantin Zhuravlyov2017-07-117-26/+29
| | | | | | | | | | | | | | | | | | | | | | | | | | | OpenCL 2.0 introduces the notion of memory scopes in atomic operations to global and local memory. These scopes restrict how synchronization is achieved, which can result in improved performance. This change extends existing notion of synchronization scopes in LLVM to support arbitrary scopes expressed as target-specific strings, in addition to the already defined scopes (single thread, system). The LLVM IR and MIR syntax for expressing synchronization scopes has changed to use *syncscope("<scope>")*, where <scope> can be "singlethread" (this replaces *singlethread* keyword), or a target-specific name. As before, if the scope is not specified, it defaults to CrossThread/System scope. Implementation details: - Mapping from synchronization scope name/string to synchronization scope id is stored in LLVM context; - CrossThread/System and SingleThread scopes are pre-defined to efficiently check for known scopes without comparing strings; - Synchronization scope names are stored in SYNC_SCOPE_NAMES_BLOCK in the bitcode. Differential Revision: https://reviews.llvm.org/D21723 llvm-svn: 307722
* [LoopUnrollRuntime] NFC: Add some debugging trace messages for why loop ↵Anna Thomas2017-07-111-8/+30
| | | | | | wasn't unrolled. llvm-svn: 307705
* [NewGVN] Check for congruency of memory accesses.Davide Italiano2017-07-111-1/+2
| | | | | | | | | | This is fine as nothing in the code relies on leader and memory leader being the same for a given congruency class. Ack'ed by Dan. Fixes PR33720. llvm-svn: 307699
* [NewGVN] Fix an innocent typo I found while debugging PR33720.Davide Italiano2017-07-111-1/+1
| | | | llvm-svn: 307694
* [NewGVN] Clarify the function invariants formatting them properly.Davide Italiano2017-07-111-3/+4
| | | | llvm-svn: 307692
* [msan] Only check shadow memory for operands that are sized.Evgeniy Stepanov2017-07-111-2/+5
| | | | | | | | | | Fixes PR33347: https://bugs.llvm.org/show_bug.cgi?id=33347. Differential Revision: https://reviews.llvm.org/D35160 Patch by Matt Morehouse. llvm-svn: 307684
* [LoopUnrollRuntime] Avoid multi-exit nested loop with epilog generationAnna Thomas2017-07-111-2/+10
| | | | | | | | | | The loop structure for the outer loop does not contain the epilog preheader when we try to unroll inner loop with multiple exits and epilog code is generated. For now, we just bail out in such cases. Added a test case that shows the problem. Without this bailout, we would trip on assert saying LCSSA form is incorrect for outer loop. llvm-svn: 307676
* [SLPVectorizer] Revert change in cancelScheduling with referencing to ↵Dinar Temirbulatov2017-07-111-1/+1
| | | | | | FirstInBundle, NFCI. llvm-svn: 307667
* fix typos in comments; NFCHiroshi Inoue2017-07-112-2/+2
| | | | llvm-svn: 307626
* [PM/ThinLTO] Fix PR33536, a bug where the ThinLTO bitcode writer wasChandler Carruth2017-07-111-1/+2
| | | | | | | | | | | | | querying for analysis results on a function declaration rather than a definition. The only reason this worked previously is by chance -- because the way we got alias analysis results with the legacy PM, we happened to not compute a dominator tree and so we happened to not hit an assert even though it didn't make any real sense. Now we bail out before trying to compute alias analysis so that we don't hit these asserts. llvm-svn: 307625
* [ConstantHoisting] Remove dupliate logic in constant hoistingLeo Li2017-07-102-34/+11
| | | | | | | | | | | | | | | | | | | | | Summary: As metioned in https://reviews.llvm.org/D34576, checkings in `collectConstantCandidates` can be replaced by using `llvm::canReplaceOperandWithVariable`. The only special case is that `collectConstantCandidates` return false for all `IntrinsicInst` but it is safe for us to collect constant candidates from `IntrinsicInst`. Reviewers: pirama, efriedma, srhines Reviewed By: efriedma Subscribers: llvm-commits, javed.absar Differential Revision: https://reviews.llvm.org/D34921 llvm-svn: 307587
* [NewGVN] Simplify a lambda a little bit. NFCI.Davide Italiano2017-07-101-3/+1
| | | | llvm-svn: 307586
* Fix invalid cast in instcombine UMul/ZExt idiomSerge Guelton2017-07-101-6/+7
| | | | | | | | | | | Fixes https://bugs.llvm.org/show_bug.cgi?id=25454 Do not assume IRBuilder creates Instruction where it can create Value. Do not assume idiom operands are constant, leave generalisation ot the IRBuilder. Differential Revision: https://reviews.llvm.org/D35114 llvm-svn: 307554
* [LoopUnrollRuntime] Remove strict assert about VMap requirementAnna Thomas2017-07-101-4/+3
| | | | | | | | | | | | | When unrolling under multiple exits which is under off-by-default option, the assert that checks for VMap entry in loop exit values is too strong. (assert if VMap entry did not exist, the value should be a constant). However, values derived from constants or from values outside loop, does not have a VMap entry too. Removed the assert and added a testcase showcasing the property for non-constant values. llvm-svn: 307542
* [ArgumentPromotion] Change use of removed argument in llvm.dbg.value to undefMikael Holmen2017-07-101-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This solves PR33641. When removing a dead argument we must also handle possibly existing calls to llvm.dbg.value that use the removed argument. Now we change the use of the otherwise dead argument to an undef for some other pass to cleanup later. If the calls are left untouched, they will later on cause errors: "function-local metadata used in wrong function" since the ArgumentPromotion rewrites the code by creating a new function with the wanted signature, but the metadata is not recreated so the new function may then erroneously use metadata from the old function. Reviewers: mstorsjo, rnk, arsenm Reviewed By: rnk Subscribers: wdng, llvm-commits Differential Revision: https://reviews.llvm.org/D34874 llvm-svn: 307521
* [IR] Add Type::isIntOrIntVectorTy(unsigned) similar to the existing ↵Craig Topper2017-07-096-20/+18
| | | | | | isIntegerTy(unsigned), but also works for vectors. llvm-svn: 307492
* [IR] Make use of ↵Craig Topper2017-07-095-20/+17
| | | | | | Type::isPtrOrPtrVectorTy/isIntOrIntVectorTy/isFPOrFPVectorTy to shorten code. NFC llvm-svn: 307491
* fix trivial typos; NFCHiroshi Inoue2017-07-091-1/+1
| | | | | | sucessor -> successor llvm-svn: 307488
* [PM] Finish implementing and fix a chain of bugs uncovered by testingChandler Carruth2017-07-091-1/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | the invalidation propagation logic from an SCC to a Function. I wrote the infrastructure to test this but didn't actually use it in the unit test where it was designed to be used. =[ My bad. Once I actually added it to the test case I discovered that it also hadn't been properly implemented, so I've implemented it. The logic in the FAM proxy for an SCC pass to propagate invalidation follows the same ideas as the FAM proxy for a Module pass, but the implementation is a bit different to reflect the fact that it is forwarding just for an SCC. However, implementing this correctly uncovered a surprising "bug" (it was conservatively correct but relatively very expensive) in how we handle invalidation when splitting one SCC into multiple SCCs. We did an eager invalidation when in reality we should be deferring invaliadtion for the *current* SCC to the CGSCC pass manager and just invaliating the newly constructed SCCs. Otherwise we end up invalidating too much too soon. This was exposed by the inliner test case that I've updated. Now, we invalidate *just* the split off '(test1_f)' SCC when doing the CG update, and then the inliner finishes and invalidates the '(test1_g, test1_h)' SCC's analyses. The first few attempts at fixing this hit still more bugs, but all of those are covered by existing tests. For example, the inliner should also preserve the FAM proxy to avoid unnecesasry invalidation, and this is safe because the CG update routines it uses handle any necessary adjustments to the FAM proxy. Finally, the unittests for the CGSCC pass manager needed a bunch of updates where we weren't correctly preserving the FAM proxy because it hadn't been fully implemented and failing to preserve it didn't matter. Note that this doesn't yet fix the current crasher due to MemSSA finding a stale dominator tree, but without this the fix to that crasher doesn't really make any sense when testing because it relies on the proxy behavior. llvm-svn: 307487
* [InstCombine] Speculatively implement a fix for what might be the root cause ↵Craig Topper2017-07-091-1/+2
| | | | | | | | | | of PR33721 by making sure that we have integer types before doing select C, -1, 0 -> sext C to int I recently changed m_One and m_AllOnes to use Constant::isOneValue/isAllOnesValue which work on floating point values too. The original implementation looked specifically for ConstantInt scalars and splats. So I'm guessing we are accidentally trying to issue sext/zexts on floating point types now. Hopefully I figure out how to reproduce the failure from the PR soon. llvm-svn: 307486
* Re-enable "[IndVars] Canonicalize comparisons between non-negative values ↵Max Kazantsev2017-07-081-0/+11
| | | | | | | | | | | | | | and indvars" The patch was reverted due to a bug. The bug was that if the IV is the 2nd operand of the icmp instruction, then the "Pred" variable gets swapped and differs from the instruction's predicate. In this patch we use the original predicate to do the transformation. Also added a test case that exercises this situation. Differentian Revision: https://reviews.llvm.org/D35107 llvm-svn: 307477
* [InstCombine] Make InstCombine's IRBuilder be passed by reference everywhereCraig Topper2017-07-0714-793/+778
| | | | | | | | Previously the InstCombiner class contained a pointer to an IR builder that had been passed to the constructor. Sometimes this would be passed to helper functions as either a pointer or the pointer would be dereferenced to be passed by reference. This patch makes it a reference everywhere including the InstCombiner class itself so there is more inconsistency. This a large, but mechanical patch. I've done very minimal formatting changes on it despite what clang-format wanted to do. llvm-svn: 307451
* Increase the import-threshold for crtical functions.Dehao Chen2017-07-071-0/+8
| | | | | | | | | | | | | | Summary: For interative sample-pgo, if a hot call site is inlined in the profiling binary, we should inline it in before profile annotation in the backend. Before that, the compile phase first collects all GUIDs that needs to be imported and creates virtual "hot" call edge in the summary. However, "hot" is not good enough to guarantee the callsites get inlined. This patch introduces "critical" call edge, and assign much higher importing threshold for those edges. Reviewers: tejohnson Reviewed By: tejohnson Subscribers: sanjoy, mehdi_amini, llvm-commits, eraman Differential Revision: https://reviews.llvm.org/D35096 llvm-svn: 307439
* [LoopUnrollRuntime] Support multiple exit blocks unrolling when prolog ↵Anna Thomas2017-07-071-2/+0
| | | | | | | | | | | | | | | remainder generated With the NFC refactoring in rL307417 (git SHA 987dd01), all the logic is in place to support multiple exit/exiting blocks when prolog remainder is generated. This patch removed the assert that multiple exit blocks unrolling is only supported when epilog remainder is generated. Also, added test runs and checks with PROLOG prefix in runtime-loop-multiple-exits.ll test cases. llvm-svn: 307435
* [Local] Update the comment for removeUnreachableBlocks.Davide Italiano2017-07-071-2/+3
| | | | | | | It referenced a wrong function name, and didn't mention what the second argument did. This should be slightly more accurate now. llvm-svn: 307425
* [cloning] Do not duplicate types when cloning functionsGor Nishanov2017-07-071-3/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This is an addon to the change rl304488 cloning fixes. (Originally rl304226 reverted rl304228 and reapplied rl304488 https://reviews.llvm.org/D33655) rl304488 works great when DILocalVariables that comes from the inlined function has a 'unique-ed' type, but, in the case when the variable type is distinct we will create a second DILocalVariable in the scope of the original function that was inlined. Consider cloning of the following function: ``` define private void @f() !dbg !5 { %1 = alloca i32, !dbg !11 call void @llvm.dbg.declare(metadata i32* %1, metadata !14, metadata !12), !dbg !18 ret void, !dbg !18 } !14 = !DILocalVariable(name: "inlined", scope: !15, file: !6, line: 5, type: !17) ; came from an inlined function !15 = distinct !DISubprogram(name: "inlined", linkageName: "inlined", scope: null, file: !6, line: 8, type: !7, isLocal: true, isDefinition: true, scopeLine: 9, isOptimized: false, unit: !0, variables: !16) !16 = !{!14} !17 = distinct !DICompositeType(tag: DW_TAG_structure_type, name: "some_struct", size: 32, align: 32) ``` Without this fix, when function 'f' is cloned, we will create another DILocalVariable for "inlined", due to its type being distinct. ``` define private void @f.1() !dbg !23 { %1 = alloca i32, !dbg !26 call void @llvm.dbg.declare(metadata i32* %1, metadata !28, metadata !12), !dbg !30 ret void, !dbg !30 } !14 = !DILocalVariable(name: "inlined", scope: !15, file: !6, line: 5, type: !17) !15 = distinct !DISubprogram(name: "inlined", linkageName: "inlined", scope: null, file: !6, line: 8, type: !7, isLocal: true, isDefinition: true, scopeLine: 9, isOptimized: false, unit: !0, variables: !16) !16 = !{!14} !17 = distinct !DICompositeType(tag: DW_TAG_structure_type, name: "some_struct", size: 32, align: 32) ; !28 = !DILocalVariable(name: "inlined", scope: !15, file: !6, line: 5, type: !29) ; OOPS second DILocalVariable !29 = distinct !DICompositeType(tag: DW_TAG_structure_type, name: "some_struct", size: 32, align: 32) ``` Now we have two DILocalVariable for "inlined" within the same scope. This result in assert in AsmPrinter/DwarfDebug.h:131: void llvm::DbgVariable::addMMIEntry(const llvm::DbgVariable &): Assertion `V.Var == Var && "conflicting variable"' failed. (Full example: See: https://bugs.llvm.org/show_bug.cgi?id=33492) In this change we prevent duplication of types so that when a metadata for DILocalVariable is cloned it will get uniqued to the same metadate node as an original variable. Reviewers: loladiro, dblaikie, aprantl, echristo Reviewed By: loladiro Subscribers: EricWF, llvm-commits Differential Revision: https://reviews.llvm.org/D35106 llvm-svn: 307418
OpenPOWER on IntegriCloud