path: root/llvm/lib/CodeGen/CodeGenPrepare.cpp
Commit message | Author | Date | Files | Lines
* [CodeGen] Fix the computation of the alignment of split stores. | Hans Wennborg | 2020-02-12 | 1 | -2/+10
| | | | | | By Clement Courbet! Backported from rG15488ff24b4a
* [PGO][PGSO] Update BFI in CodeGenPrepare::optimizeSelectInst. | Hiroshi Yamauchi | 2020-01-23 | 1 | -0/+1
| | | | | | | | | | | | | | | | | Summary: Without the BFI update, some hot blocks are incorrectly treated as cold code. This fixes a FDO perf regression in the TSVC benchmark from D71288. Reviewers: davidxl Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73146 (cherry picked from commit ddbc728828c70728473b47c9f7427aa9514f3d17)
* [CodegenPrepare] Guard against degenerate branches | Valentin Churavy | 2019-12-16 | 1 | -0/+4
| | | | | | | | | | | | | | | Summary: Guard against a potential crash observed in https://github.com/JuliaLang/julia/issues/32994#issuecomment-524249628 If two branches are collapsed we can encounter a degenerate conditional branch `TBB==FBB`. The subsequent code assumes that they differ, so we exit out early. Reviewers: ributzka, spatel Subscribers: loladiro, dexonsmith, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66657
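
A minimal C++ sketch of the guard described above (the helper name and shape are illustrative, not the upstream change): detect a conditional branch whose two successors coincide so callers can bail out early.

    #include "llvm/IR/BasicBlock.h"
    #include "llvm/IR/Instructions.h"

    using namespace llvm;

    // Return true if BB ends in a conditional branch whose true and false
    // successors are the same block (TBB == FBB); callers should exit early
    // instead of assuming the successors differ.
    static bool hasDegenerateCondBranch(const BasicBlock &BB) {
      const auto *BI = dyn_cast<BranchInst>(BB.getTerminator());
      if (!BI || !BI->isConditional())
        return false;
      return BI->getSuccessor(0) == BI->getSuccessor(1);
    }
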
* [IR] Split out target specific intrinsic enums into separate headers | Reid Kleckner | 2019-12-11 | 1 | -0/+2
| | | | | | | | | | | | | | | | | | | | This has two main effects: - Optimizes debug info size by saving 221.86 MB of obj file size in a Windows optimized+debug build of 'all'. This is 3.03% of 7,332.7MB of object file size. - Incremental step towards decoupling target intrinsics. The enums are still compact, so adding and removing a single target-specific intrinsic will trigger a rebuild of all of LLVM. Assigning distinct target id spaces is potential future work. Part of PR34259 Reviewers: efriedma, echristo, MaskRay Reviewed By: echristo, MaskRay Differential Revision: https://reviews.llvm.org/D71320
* [PGO][PGSO] Instrument the code gen / target passes. | Hiroshi Yamauchi | 2019-12-09 | 1 | -20/+38
| | | | | | | | | | | | | | | | | | | | Summary: Split off of D67120. Add the profile guided size optimization instrumentation / queries in the code gen or target passes. This doesn't enable the size optimizations in those passes yet as they are currently disabled in shouldOptimizeForSize (for non-IR pass queries). A second try after reverted D71072. Reviewers: davidxl Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71149
* [DebugInfo] Nerf placeDbgValues, with prejudice | Jeremy Morse | 2019-12-09 | 1 | -27/+31
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | CodeGenPrepare::placeDebugValues moves variable location intrinsics to be immediately after the Value they refer to. This makes tracking of locations very easy; but it changes the order in which assignments appear to the debugger, from the source programs order to the order in which the optimised program computes values. This then leads to PR43986 and PR38754, where variable locations that were in a conditional block are made unconditional, which is highly misleading. This patch adjusts placeDbgValues to only re-order variable location intrinsics if they use a Value before it is defined, significantly reducing the damage that it does. This is still not 100% safe, but the rest of CodeGenPrepare needs polishing to correctly update debug info when optimisations are performed to fully fix this. This will probably break downstream debuginfo tests -- if the instruction-stream position of variable location changes isn't the focus of the test, an easy fix should be to manually apply placeDbgValues' behaviour to the failing tests, moving dbg.value intrinsics next to SSA variable definitions thus: %foo = inst1 %bar = ... %baz = ... void call @llvm.dbg.value(metadata i32 %foo, ... to %foo = inst1 void call @llvm.dbg.value(metadata i32 %foo, ... %bar = ... %baz = ... This should return your test to exercising whatever it was testing before. Differential Revision: https://reviews.llvm.org/D58453
* Revert "[PGO][PGSO] Instrument the code gen / target passes."Hiroshi Yamauchi2019-12-061-38/+20
| | | | | | This reverts commit 9a0b5e14075a1f42a72eedb66fd4fde7985d37ac. This seems to break buildbots.
* [PGO][PGSO] Instrument the code gen / target passes. | Hiroshi Yamauchi | 2019-12-06 | 1 | -20/+38
| | | | | | | | | | | | | | | | | | Summary: Split off of D67120. Add the profile guided size optimization instrumentation / queries in the code gen or target passes. This doesn't enable the size optimizations in those passes yet as they are currently disabled in shouldOptimizeForSize (for non-IR pass queries). Reviewers: davidxl Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71072
* [DebugInfo][CGP] Update dbg.values when sinking address computations | Jeremy Morse | 2019-12-06 | 1 | -0/+25
| | | | | | | | | | | | | | | | | One of CodeGenPrepare's optimizations is to duplicate address calculations into basic blocks, so that as much information as possible can be folded into memory addressing operands. This is great -- but the dbg.value variable location intrinsics are not updated in the same way. This can lead to dbg.values referring to address computations in other blocks that will never be encoded into the DAG, while duplicate address computations are performed locally that could be used by the dbg.value. Some of these (such as non-constant-offset GEPs) can't be salvaged past. Fix this by, whenever we duplicate an address computation into a block, looking for dbg.value users of the original memory address in the same block, and redirecting those to the local computation. Differential Revision: https://reviews.llvm.org/D58403
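
A hedged sketch of the retargeting idea described above, using long-standing LLVM APIs; the actual patch may differ in detail, and the names OldAddr/NewAddr and the helper are assumptions made for illustration.

    #include "llvm/ADT/SmallVector.h"
    #include "llvm/IR/DebugInfo.h"
    #include "llvm/IR/IntrinsicInst.h"
    #include "llvm/IR/Metadata.h"

    using namespace llvm;

    // Redirect dbg.value intrinsics that live in NewAddr's block from the
    // original address computation to the locally duplicated one, so the
    // variable location survives into the DAG.
    static void retargetLocalDbgValues(Value *OldAddr, Instruction *NewAddr) {
      SmallVector<DbgValueInst *, 4> DbgValues;
      findDbgValues(DbgValues, OldAddr);
      for (DbgValueInst *DVI : DbgValues) {
        if (DVI->getParent() != NewAddr->getParent())
          continue;
        DVI->setOperand(0, MetadataAsValue::get(NewAddr->getContext(),
                                                ValueAsMetadata::get(NewAddr)));
      }
    }
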
* Fix a typo. | Hans Wennborg | 2019-11-30 | 1 | -1/+1
* Sink all InitializePasses.h includes | Reid Kleckner | 2019-11-13 | 1 | -1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This file lists every pass in LLVM, and is included by Pass.h, which is very popular. Every time we add, remove, or rename a pass in LLVM, it caused lots of recompilation. I found this fact by looking at this table, which is sorted by the number of times a file was changed over the last 100,000 git commits multiplied by the number of object files that depend on it in the current checkout: recompiles touches affected_files header 342380 95 3604 llvm/include/llvm/ADT/STLExtras.h 314730 234 1345 llvm/include/llvm/InitializePasses.h 307036 118 2602 llvm/include/llvm/ADT/APInt.h 213049 59 3611 llvm/include/llvm/Support/MathExtras.h 170422 47 3626 llvm/include/llvm/Support/Compiler.h 162225 45 3605 llvm/include/llvm/ADT/Optional.h 158319 63 2513 llvm/include/llvm/ADT/Triple.h 140322 39 3598 llvm/include/llvm/ADT/StringRef.h 137647 59 2333 llvm/include/llvm/Support/Error.h 131619 73 1803 llvm/include/llvm/Support/FileSystem.h Before this change, touching InitializePasses.h would cause 1345 files to recompile. After this change, touching it only causes 550 compiles in an incremental rebuild. Reviewers: bkramer, asbirlea, bollu, jdoerfert Differential Revision: https://reviews.llvm.org/D70211
* [CGP] Make ICMP_EQ use CR result of ICMP_S(L|G)T dominators | Yi-Hong Lyu | 2019-11-11 | 1 | -0/+94
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | For example: long long test(long long a, long long b) { if (a << b > 0) return b; if (a << b < 0) return a; return a*b; } Produces: sld. 5, 3, 4 ble 0, .LBB0_2 mr 3, 4 blr .LBB0_2: # %if.end cmpldi 5, 0 li 5, 1 isel 4, 4, 5, 2 mulld 3, 4, 3 blr But the compare (cmpldi 5, 0) is redundant and can be removed (CR0 already contains the result of that comparison). The root cause of this is that LLVM converts signed comparisons into equality comparison based on dominance. Equality comparisons are unsigned by default, so we get either a record-form or cmp (without the l for logical) feeding a cmpl. That is the situation we want to avoid here. Differential Revision: https://reviews.llvm.org/D60506
* [Alignment][NFC] Remove dependency on GlobalObject::setAlignment(unsigned) | Guillaume Chatelet | 2019-10-15 | 1 | -1/+1
| | | | | | | | | | | | | | | | | Summary: This is patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: arsenm, mehdi_amini, jvesely, nhaehnle, hiraditya, steven_wu, dexonsmith, dang, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68944 llvm-svn: 374880
* Reapply r374743 with a fix for the ocaml binding | Joerg Sonnenberger | 2019-10-14 | 1 | -18/+4
| | | | | | | | | | | | | | | | | | | Add a pass to lower is.constant and objectsize intrinsics This pass lowers is.constant and objectsize intrinsics not simplified by earlier constant folding, i.e. if the object given is not constant or if not using the optimized pass chain. The result is recursively simplified and constant conditionals are pruned, so that dead blocks are removed even for -O0. This allows inline asm blocks with operand constraints to work all the time. The new pass replaces the existing lowering in the codegen-prepare pass and fallbacks in SDAG/GlobalISEL and FastISel. The latter now assert on the intrinsics. Differential Revision: https://reviews.llvm.org/D65280 llvm-svn: 374784
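
For context, a short C++ example (illustrative, not taken from the patch) of the source-level builtins that reach these intrinsics when earlier folding cannot resolve them: Clang lowers __builtin_constant_p through llvm.is.constant and __builtin_object_size through llvm.objectsize.

    #include <stddef.h>

    // At -O0, or whenever the front end cannot fold these, Clang emits
    // llvm.is.constant.* for __builtin_constant_p and llvm.objectsize.* for
    // __builtin_object_size; the new pass lowers whatever survives.
    size_t bytes_left(char *buf, size_t used) {
      if (__builtin_constant_p(used) && used == 0)
        return __builtin_object_size(buf, 0);
      return (size_t)-1;
    }
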
* Revert "Add a pass to lower is.constant and objectsize intrinsics"Dmitri Gribenko2019-10-141-4/+18
| | | | | | | This reverts commit r374743. It broke the build with Ocaml enabled: http://lab.llvm.org:8011/builders/clang-x86_64-debian-fast/builds/19218 llvm-svn: 374768
* Add a pass to lower is.constant and objectsize intrinsics | Joerg Sonnenberger | 2019-10-13 | 1 | -18/+4
| | | | | | | | | | | | | | | | | This pass lowers is.constant and objectsize intrinsics not simplified by earlier constant folding, i.e. if the object given is not constant or if not using the optimized pass chain. The result is recursively simplified and constant conditionals are pruned, so that dead blocks are removed even for -O0. This allows inline asm blocks with operand constraints to work all the time. The new pass replaces the existing lowering in the codegen-prepare pass and fallbacks in SDAG/GlobalISEL and FastISel. The latter now assert on the intrinsics. Differential Revision: https://reviews.llvm.org/D65280 llvm-svn: 374743
* CodeGenPrepare - silence static analyzer dyn_cast<> null dereference warnings. NFCI. | Simon Pilgrim | 2019-10-08 | 1 | -11/+8
The static analyzer is warning about potential null dereferences, but in these cases we should be able to use cast<> directly, and if not, the assert will fire for us. llvm-svn: 374085
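
A generic sketch of the pattern behind this cleanup (function and types are illustrative, not the edited code): where the dynamic type is already guaranteed, cast<> asserts instead of returning null, which removes the analyzer's null-dereference path.

    #include "llvm/IR/Instructions.h"
    #include "llvm/Support/Casting.h"

    using namespace llvm;

    // Precondition (established by the caller): I is known to be a GEP.
    static Value *gepBasePointer(Instruction *I) {
      // dyn_cast<> would yield a pointer the analyzer assumes may be null;
      // cast<> documents and asserts the invariant instead.
      auto *GEP = cast<GetElementPtrInst>(I);
      return GEP->getPointerOperand();
    }
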
* [Alignment][NFC] Remove AllocaInst::setAlignment(unsigned) | Guillaume Chatelet | 2019-09-30 | 1 | -1/+1
| | | | | | | | | | | | | | | | | Summary: This is patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: jholewinski, arsenm, jvesely, nhaehnle, eraman, hiraditya, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D68141 llvm-svn: 373207
* [CodeGenPrepare] Mend "avoid crashing from replacing a phi twice" fix. | Jesper Antonsson | 2019-09-27 | 1 | -1/+1
| | | | | | | | | | | | | | | | | | | Summary: An erroneously negated if-statement by an earlier (March 2019) bugfix left phi replacement/simplification under optimizeMemoryInst() in CodeGenPrepare largely inactivated. The error was found when csmith found that the same assert as in the original bug report could still be triggered in a different way. This patch fixes the bugfix. The original bug was: https://bugs.llvm.org/show_bug.cgi?id=41052 ... and the previous fix was D59358. Reviewers: aprantl, skatkov Reviewed By: skatkov Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67838 llvm-svn: 373084
* Revert [IR] allow fast-math-flags on phi of FP values | Sanjay Patel | 2019-09-25 | 1 | -1/+1
| | | | | | This reverts r372866 (git commit dec03223a97af0e4dfcb23da55c0f7f8c9b62d00) llvm-svn: 372868
* [IR] allow fast-math-flags on phi of FP values | Sanjay Patel | 2019-09-25 | 1 | -1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The changes here are based on the corresponding diffs for allowing FMF on 'select': D61917 As discussed there, we want to have fast-math-flags be a property of an FP value because the alternative (having them on things like fcmp) leads to logical inconsistency such as: https://bugs.llvm.org/show_bug.cgi?id=38086 The earlier patch for select made almost no practical difference because most unoptimized conditional code begins life as a phi (based on what I see in clang). Similarly, I don't expect this patch to do much on its own either because SimplifyCFG promptly drops the flags when converting to select on a minimal example like: https://bugs.llvm.org/show_bug.cgi?id=39535 But once we have this plumbing in place, we should be able to wire up the FMF propagation and start solving cases like that. The change to RecurrenceDescriptor::AddReductionVar() is required to prevent a regression in a LoopVectorize test. We are intersecting the FMF of any FPMathOperator there, so if a phi is not properly annotated, new math instructions may not be either. Once we fix the propagation in SimplifyCFG, it may be safe to remove that hack. Differential Revision: https://reviews.llvm.org/D67564 llvm-svn: 372866
* [CGP] Ensure sinking multiple instructions does not invalidate dominance checks | David Green | 2019-09-12 | 1 | -9/+23
| | | | | | | | | | | | | | | | | | In MVE, as of rL371218, we are attempting to sink chains of instructions such as: %l1 = insertelement <8 x i8> undef, i8 %l0, i32 0 %broadcast.splat26 = shufflevector <8 x i8> %l1, <8 x i8> undef, <8 x i32> zeroinitializer In certain situations though, we can end up breaking the dominance relations of instructions. This happens when we sink the instruction into a loop, but cannot remove the originals. The Use is updated, which might in fact be a Use from the second instruction to the first. This attempts to fix that by reversing the order of instruction that are sunk, and ensuring that we update the uses on new instructions if they have already been sunk, not the old ones. Differential Revision: https://reviews.llvm.org/D67366 llvm-svn: 371743
* CodeGenPrep: add separate hook to say when GEPs should be used for sinking. NFCI. | Tim Northover | 2019-09-12 | 1 | -2/+2
| | | | | | | | | Up to now, we've decided whether to sink address calculations using GEPs or normal arithmetic based on the useAA hook, but there are other reasons GEPs might be preferred. So this patch splits the two questions, with a default implementation falling back to useAA. llvm-svn: 371721
* Change TargetLibraryInfo analysis passes to always require Function | Teresa Johnson | 2019-09-07 | 1 | -1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This is the first change to enable the TLI to be built per-function so that -fno-builtin* handling can be migrated to use function attributes. See discussion on D61634 for background. This is an enabler for fixing handling of these options for LTO, for example. This change should not affect behavior, as the provided function is not yet used to build a specifically per-function TLI, but rather enables that migration. Most of the changes were very mechanical, e.g. passing a Function to the legacy analysis pass's getTLI interface, or in Module level cases, adding a callback. This is similar to the way the per-function TTI analysis works. There was one place where we were looking for builtins but not in the context of a specific function. See FindCXAAtExit in lib/Transforms/IPO/GlobalOpt.cpp. I'm somewhat concerned my workaround could provide the wrong behavior in some corner cases. Suggestions welcome. Reviewers: chandlerc, hfinkel Subscribers: arsenm, dschuff, jvesely, nhaehnle, mehdi_amini, javed.absar, sbc100, jgravelle-google, eraman, aheejin, steven_wu, george.burgess.iv, dexonsmith, jfb, asbirlea, gchatelet, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66428 llvm-svn: 371284
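
A legacy-pass-manager sketch of the call-site shape this change enables (the pass itself is hypothetical): the wrapper pass's getTLI() now takes the Function, which is what will later allow -fno-builtin* to be honored per function.

    #include "llvm/Analysis/TargetLibraryInfo.h"
    #include "llvm/IR/Function.h"
    #include "llvm/Pass.h"

    using namespace llvm;

    namespace {
    struct ExamplePass : public FunctionPass {
      static char ID;
      ExamplePass() : FunctionPass(ID) {}

      void getAnalysisUsage(AnalysisUsage &AU) const override {
        AU.addRequired<TargetLibraryInfoWrapperPass>();
        AU.setPreservesAll();
      }

      bool runOnFunction(Function &F) override {
        // The TLI handed back is now specific to F rather than module-wide.
        const TargetLibraryInfo &TLI =
            getAnalysis<TargetLibraryInfoWrapperPass>().getTLI(F);
        (void)TLI.has(LibFunc_memcpy); // query only; this pass changes nothing
        return false;
      }
    };
    } // end anonymous namespace

    char ExamplePass::ID = 0;
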
* [CGP] Remove ModifiedDT from the makeBitReverse loop | Craig Topper | 2019-08-19 | 1 | -1/+0
| | | | | | | | | | I don't think anything in this loop modifies the control flow and we don't restart any iteration after setting the flag. This code was added in http://reviews.llvm.org/D16893 but looking at the test case added there the code that caused the dominator tree to change was merging blocks with their predecessor not the bitreverse optimization. Differential Revision: https://reviews.llvm.org/D66366 llvm-svn: 369283
* [CodeGenPrepare] Fix use-after-free | Sanjay Patel | 2019-08-16 | 1 | -1/+2
| | | | | | | | | | | | | | | | | | | If OptimizeExtractBits() encountered a shift instruction with no operands at all, it would erase the instruction, but still return false. This previously didn’t matter because its caller would always return after processing the instruction, but https://reviews.llvm.org/D63233 changed the function’s caller to fall through if it returned false, which would then cause a use-after-free detectable by ASAN. This change makes OptimizeExtractBits return true if it removes a shift instruction with no users, terminating processing of the instruction. Patch by: @brentdax (Brent Royal-Gordon) Differential Revision: https://reviews.llvm.org/D66330 llvm-svn: 369168
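
A reduced sketch of the corrected control flow (names are illustrative, not the exact OptimizeExtractBits code): once the dead shift is erased, report a change so the caller stops and never touches the freed instruction.

    #include "llvm/IR/Instruction.h"

    using namespace llvm;

    // If the shift has no users, erase it and report that something changed.
    // Returning false here was the original bug: the caller kept processing
    // the now-deleted instruction.
    static bool eraseDeadShift(Instruction *ShiftI) {
      if (!ShiftI->use_empty())
        return false;
      ShiftI->eraseFromParent();
      return true;
    }
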
* [llvm] Migrate llvm::make_unique to std::make_unique | Jonas Devlieghere | 2019-08-15 | 1 | -6/+6
| | | | | | | | Now that we've moved to C++14, we no longer need the llvm::make_unique implementation from STLExtras.h. This patch is a mechanical replacement of (hopefully) all the llvm::make_unique instances across the monorepo. llvm-svn: 369013
* [NFC][CodeGen] Modify the type element of TailCalls to simplify the dupRetToEnableTailCallOpts() | Kang Zhang | 2019-08-02 | 1 | -12/+9
Summary: The old code can be simplified to define the element type of TailCalls as `BasicBlock` not `CallInst`. Also I use the for-range loop instead of the for loop. Reviewed By: jsji Differential Revision: https://reviews.llvm.org/D64905 llvm-svn: 367644
* [DAGCombiner] [CodeGenPrepare] More comprehensive GEP splitting | Luis Marques | 2019-06-17 | 1 | -9/+4
| | | | | | | | | | | | | | | Some GEPs were not being split, presumably because that split would just be undone by the DAGCombiner. Not performing those splits can prevent important optimizations, such as preventing the element indices / member offsets from being (partially) folded into load/store instruction immediates. This patch: - Makes the splits also occur in the cases where the base address and the GEP are in the same BB. - Ensures that the DAGCombiner doesn't reassociate them back again. Differential Revision: https://reviews.llvm.org/D60294 llvm-svn: 363544
* [CodeGenPrepare][x86] shift both sides of a vector select when profitable | Sanjay Patel | 2019-06-16 | 1 | -3/+42
| | | | | | | | | | | | | | | | | This is based on the example/discussion in PR37428: https://bugs.llvm.org/show_bug.cgi?id=37428 Proper vector shift instructions don't appear until AVX2, so we may generate several extra instructions within a loop trying to compensate for that. It's difficult to recover from that shift expansion later than this, so use the existing TLI hook and splat analysis to enable better codegen. This extends CGP functionality introduced with: rL201655 Differential Revision: https://reviews.llvm.org/D63233 llvm-svn: 363511
* [CodeGenPrepare] propagate debuginfo when copying a shuffle | Sanjay Patel | 2019-06-14 | 1 | -0/+1
| | | | llvm-svn: 363409
* TTI: Improve default costs for addrspacecast | Matt Arsenault | 2019-06-03 | 1 | -2/+2
| | | | | | | | | | For some reason multiple places need to do this, and the variant the loop unroller and inliner use was not handling it. Also, introduce a new wrapper to be slightly more precise, since on AMDGPU some addrspacecasts are free, but not no-ops. llvm-svn: 362436
* Use the DataLayout::typeSizeEqualsStoreSize helper. NFC | Bjorn Pettersson | 2019-05-24 | 1 | -3/+2
| | | | | | | | | Just a minor refactoring to use the new helper method DataLayout::typeSizeEqualsStoreSize(). This is done when checking if getTypeSizeInBits is equal/non-equal to getTypeStoreSizeInBits. llvm-svn: 361613
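
The equivalence the refactoring relies on, as a small sketch (the wrapper function is hypothetical):

    #include "llvm/IR/DataLayout.h"
    #include "llvm/IR/Type.h"

    using namespace llvm;

    // DataLayout::typeSizeEqualsStoreSize(Ty) is shorthand for comparing the
    // bit size of Ty with its in-memory store size in bits.
    static bool sizeMatchesStoreSize(const DataLayout &DL, Type *Ty) {
      // Equivalent to: DL.getTypeSizeInBits(Ty) == DL.getTypeStoreSizeInBits(Ty)
      return DL.typeSizeEqualsStoreSize(Ty);
    }
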
* [CodeGenPrepare] Ensure we get a non-null result from getTrueOrFalseValue. NFCI. | Simon Pilgrim | 2019-05-09 | 1 | -1/+3
| | | | llvm-svn: 360328
* [CodeGenPrepare] Don't split the store if it is volatile | QingShan Zhang | 2019-05-08 | 1 | -0/+4
| | | | | | | | We shouldn't split the store when it is volatile. Differential Revision: https://reviews.llvm.org/D61169 llvm-svn: 360228
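
A minimal sketch of the added guard, assuming a small helper around the check (the surrounding splitting logic is omitted):

    #include "llvm/IR/Instructions.h"

    using namespace llvm;

    // Splitting a store changes the number and width of the memory accesses,
    // which is not allowed for volatile accesses, so bail out up front.
    static bool mayConsiderSplittingStore(const StoreInst &SI) {
      if (SI.isVolatile())
        return false;
      return true; // further legality/profitability checks would follow here
    }
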
* [NFC] BasicBlock: refactor changePhiUses() out of replacePhiUsesWith(), use it | Roman Lebedev | 2019-05-05 | 1 | -2/+1
| | | | | | | | | | | | | | | | | | | | | Summary: It is a common thing to loop over every `PHINode` in some `BasicBlock` and change old `BasicBlock` incoming block to a new `BasicBlock` incoming block. `replaceSuccessorsPhiUsesWith()` already had code to do that, it just wasn't a function. So outline it into a new function, and use it. Reviewers: chandlerc, craig.topper, spatel, danielcdh Reviewed By: craig.topper Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61013 llvm-svn: 359996
* [NFC] PHINode: introduce replaceIncomingBlockWith() function, use it | Roman Lebedev | 2019-05-05 | 1 | -5/+2
| | | | | | | | | | | | | | | | | | | | Summary: There is `PHINode::getBasicBlockIndex()`, `PHINode::setIncomingBlock()` and `PHINode::getNumOperands()`, but no function to replace every specified `BasicBlock*` predecessor with some other specified `BasicBlock*`. Clearly, there are a lot of places that could use that functionality. Reviewers: chandlerc, craig.topper, spatel, danielcdh Reviewed By: craig.topper Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61011 llvm-svn: 359995
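
A usage sketch of the new helper (the loop and names are illustrative): retarget every PHI in a successor block from an old predecessor to a new one.

    #include "llvm/IR/BasicBlock.h"
    #include "llvm/IR/Instructions.h"

    using namespace llvm;

    // For each PHI in Succ, rewrite incoming edges that named OldPred so they
    // name NewPred instead.
    static void retargetIncomingBlocks(BasicBlock *Succ, BasicBlock *OldPred,
                                       BasicBlock *NewPred) {
      for (PHINode &PN : Succ->phis())
        PN.replaceIncomingBlockWith(OldPred, NewPred);
    }
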
* [CodeGenPrepare] limit overflow intrinsic matching to a single basic block (2nd try) | Sanjay Patel | 2019-05-04 | 1 | -28/+21
This is a subset of the original commit from rL359879 which was reverted because it could crash when using the 'RemovedInstructions' structure that enables delayed deletion of dead instructions. The motivating compile-time win does not require that change though. We should get most of that win from this change alone. Using/updating a dominator tree to match math overflow patterns may be very expensive in compile-time (because of the way CGP uses a DT), so just handle the single-block case. See post-commit thread for rL354298 for more details: http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20190422/646276.html Differential Revision: https://reviews.llvm.org/D61075 llvm-svn: 359969
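
The restriction reduces to a cheap locality test, roughly as below (a sketch; the real matcher checks much more than this):

    #include "llvm/IR/Instruction.h"

    using namespace llvm;

    // Only try to form the overflow intrinsic when the arithmetic and the
    // compare already live in the same block; anything else would need a
    // dominator-tree query, which is exactly what this change avoids.
    static bool inSameBlock(const Instruction *MathI, const Instruction *CmpI) {
      return MathI->getParent() == CmpI->getParent();
    }
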
* Revert "[CodeGenPrepare] limit overflow intrinsic matching to a single basic ↵Evgeniy Stepanov2019-05-031-42/+47
| | | | | | | | block" This reverts commit r359879, which introduced a compiler crash. llvm-svn: 359908
* [CodeGenPrepare] limit overflow intrinsic matching to a single basic block | Sanjay Patel | 2019-05-03 | 1 | -47/+42
| | | | | | | | | | | | | | | | | | | Using/updating a dominator tree to match math overflow patterns may be very expensive in compile-time (because of the way CGP uses a DT), so just handle the single-block case. Also, we were restarting the iterator loops when doing the overflow intrinsic transforms by marking the dominator tree for update. That was done to prevent iterating over a removed instruction. But we can postpone the deletion using the existing "RemovedInsts" structure, and that means we don't need to update the DT. See post-commit thread for rL354298 for more details: http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20190422/646276.html Differential Revision: https://reviews.llvm.org/D61075 llvm-svn: 359879
* [CGP] Look through bitcasts when duplicating returns for tail calls | Francis Visoiu Mistrih | 2019-04-23 | 1 | -1/+3
| | | | | | | | | | | | | | | | | | | | | The simple case of: ``` int *callee(); void *caller(void *a) { if (a == NULL) return callee(); return a; } ``` would generate a regular call instead of a tail call because we don't look through the bitcast of the call to `callee` when duplicating the return blocks. Differential Revision: https://reviews.llvm.org/D60837 llvm-svn: 359041
* Include what's used in a few cpp files - these were getting transitive | Eric Christopher | 2019-04-12 | 1 | -0/+1
| | | | | | includes from MCDwarf.h. llvm-svn: 358254
* [IR] Refactor attribute methods in Function class (NFC) | Evandro Menezes | 2019-04-04 | 1 | -2/+2
| | | | | | | | Rename the functions that query the optimization kind attributes. Differential revision: https://reviews.llvm.org/D60287 llvm-svn: 357731
* [CGP] Reset DT when optimizing select instructions | Teresa Johnson | 2019-03-27 | 1 | -4/+8
| | | | | | | | | | | | | | | | | | | | | | | | | Summary: A recent fix (r355751) caused a compile time regression because setting the ModifiedDT flag in optimizeSelectInst means that each time a select instruction is optimized the function walk in runOnFunction stops and restarts again (which was needed to build a new DT before we started building it lazily in r356937). Now that the DT is built lazily, a simple fix is to just reset the DT at this point, rather than restarting the whole function walk. In the future other places that set ModifiedDT may want to switch to just resetting the DT directly. But that will require an evaluation to ensure that they don't otherwise need to restart the function walk. Reviewers: spatel Subscribers: jdoerfert, llvm-commits, xur Tags: #llvm Differential Revision: https://reviews.llvm.org/D59889 llvm-svn: 357111
* [CGP] Build the DominatorTree lazily | Teresa Johnson | 2019-03-25 | 1 | -34/+39
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: In r355512 CGP was changed to build the DominatorTree only once per function traversal, to avoid repeatedly building it each time it was accessed. This solved one compile time issue but introduced another. In the second case, we now were building the DT unnecessarily many times when we performed many function traversals (i.e. more than once per function when running CGP because of changes made each time). Change to saving the DT in the CodeGenPrepare object, and building it lazily when needed. It is reset whenever we need to rebuild it. The case that exposed the issue there are 617 functions, and we walk them (i.e. execute the "while (MadeChange)" loop in runOnFunction) a total of 12083 times (so previously we were building the DT 12083 times). With this patch we only build the DT 844 times (average of 1.37 times per function). We dropped the total time to compile this file from 538.11s without this patch to 339.63s with it. There is still an issue as CGP is taking much longer than all other passes even with this patch, and before a recent compiler release cut at r355392 the total time to this compile was only 97 sec with a huge reduction in CGP time. I suspect that one of the other recent changes to CGP led to iterating each function many more times on average, but I need to do some more investigation. Reviewers: spatel Subscribers: jdoerfert, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D59696 llvm-svn: 356937
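
A minimal sketch of the lazy-construction pattern described (the class and member names are assumptions, not the actual CodeGenPrepare fields): build the tree on first use and reset it whenever the CFG changes so the next query rebuilds it.

    #include "llvm/IR/Dominators.h"
    #include "llvm/IR/Function.h"
    #include <memory>

    using namespace llvm;

    // Build the dominator tree only when something asks for it; reset() is
    // called after CFG-changing transforms so a later query rebuilds it.
    class LazyDomTree {
      std::unique_ptr<DominatorTree> DT;

    public:
      DominatorTree &get(Function &F) {
        if (!DT)
          DT = std::make_unique<DominatorTree>(F);
        return *DT;
      }
      void reset() { DT.reset(); }
    };
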
* [CGP] Make several static functions member functions (NFC) | Teresa Johnson | 2019-03-24 | 1 | -19/+25
| | | | | | | This is extracted from D59696 as suggested in the review. It is preparation for making the DominatorTree a member variable. llvm-svn: 356857
* [CodeGenPrepare] limit formation of overflow intrinsics (PR41129) | Sanjay Patel | 2019-03-21 | 1 | -2/+6
| | | | | | | | | | | | | | | | | | This is probably a bigger limitation than necessary, but since we don't have any evidence yet that this transform led to real-world perf improvements rather than regressions, I'm making a quick, blunt fix. In the motivating x86 example from: https://bugs.llvm.org/show_bug.cgi?id=41129 ...and shown in the regression test, we want to avoid an extra instruction in the dominating block because that could be costly. The x86 LSR test diff is reversing the changes from D57789. There's no evidence that 1 version is any better than the other yet. Differential Revision: https://reviews.llvm.org/D59602 llvm-svn: 356665
* [CGP] fix formatting; NFC | Sanjay Patel | 2019-03-20 | 1 | -3/+4
| | | | llvm-svn: 356572
* [CGP] convert chain of 'if' to 'switch'; NFC | Sanjay Patel | 2019-03-20 | 1 | -14/+13
| | | | | | | | This should be extended, but CGP does some strange things, so I'm intentionally not changing the potential order of any transforms yet. llvm-svn: 356566
* [CodeGenPrepare] avoid crashing from replacing a phi twice | Mikael Holmen | 2019-03-15 | 1 | -2/+6
| | | | | | | | | | | | | | | | | | | | | | Summary: This is a fix to bug 41052: https://bugs.llvm.org/show_bug.cgi?id=41052 While trying to optimize a memory instruction in a dead basic block, we end up registering the same phi for replacement twice. This patch avoids registering more than the first replacement candidate for a phi. Patch by: JesperAntonsson Reviewers: skatkov, aprantl Reviewed By: aprantl Subscribers: jdoerfert, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D59358 llvm-svn: 356260
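
A sketch of the deduplication idea (the set and helper are illustrative, not the actual CodeGenPrepare data structures): remember which PHIs already have a replacement candidate and reject any later candidate for the same PHI.

    #include "llvm/ADT/SmallPtrSet.h"
    #include "llvm/IR/Instructions.h"

    using namespace llvm;

    // Returns true only the first time PN is seen; later attempts to register
    // a replacement for the same PHI are ignored.
    static bool registerReplacementOnce(SmallPtrSetImpl<PHINode *> &Seen,
                                        PHINode *PN) {
      return Seen.insert(PN).second;
    }
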