summaryrefslogtreecommitdiffstats
path: root/llvm/lib/CodeGen
Commit message (Collapse)AuthorAgeFilesLines
...
* [DAGCombiner] Rename variables Demanded -> DemandedBits/DemandedElts. NFCI.Simon Pilgrim2019-04-031-9/+10
| | | | | | Use consistent variable names down the SimplifyDemanded* call stack so debugging isn't such a annoyance. llvm-svn: 357602
* [DAGCombiner] loosen restrictions for moving shuffles after vector binopSanjay Patel2019-04-031-16/+19
| | | | | | | | | | | | There are 3 changes to make this correspond to the same transform in instcombine: 1. Remove the legality check - we can't create anything less legal than we started with. 2. Ease the use restriction, so we only bail out if both operands have >1 use. 3. Ease the use restriction for binops with a repeated operand (eg, mul x, x). As discussed in D60150, there's a scalarization opportunity that will be made easier by allowing this transform more generally. llvm-svn: 357580
* [DAGCombine] Don't use getZExtValue() until we know the constant is in range.Simon Pilgrim2019-04-031-2/+2
| | | | | | Noticed during prep for a patch for PR40758. llvm-svn: 357571
* Revert r357256 "[DAGCombine] Improve Lifetime node chains."Hans Wennborg2019-04-031-31/+0
| | | | | | | | | | | | | | | | | | | | | | | | As it caused a pathological compile-time regressionin V8, see PR41352. > Improve both start and end lifetime nodes chain dependencies. > > Reviewers: courbet > > Reviewed By: courbet > > Subscribers: hiraditya, llvm-commits > > Tags: #llvm > > Differential Revision: https://reviews.llvm.org/D59795 This also reverts the follow-up r357309: > [DAGCombiner] Rewrite ImproveLifetimeNodeChain to avoid DAG loop. > > Avoid EXPENSIVE_CHECK failure. NFCI. llvm-svn: 357563
* [GlobalISel] Add IRTranslator support for llvm.stacksave and llvm.stackrestoreJessica Paquette2019-04-021-0/+28
| | | | | | | | Also update arm64-irtranslator.ll. Differential Revision: https://reviews.llvm.org/D60140 llvm-svn: 357538
* [DAGCombiner] reduce code duplication; NFCSanjay Patel2019-04-021-8/+8
| | | | llvm-svn: 357498
* Enforce StackID definition in PEISander de Smalen2019-04-023-7/+35
| | | | | | | | | | | | | | | There are various places in LLVM where the definition of StackID is not properly honoured, for example in PEI where objects with a StackID > 0 are allocated on the default stack (StackID0). This patch enforces that PEI only considers allocating objects to StackID 0. Reviewers: arsenm, thegameg, MatzeB Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D60062 llvm-svn: 357460
* Add an optional list of blocks to avoid when looking for a path in ↵Nick Lewycky2019-04-021-1/+1
| | | | | | | | | | isPotentiallyReachable. The leads to some ambiguous overloads, so update three callers. Differential Revision: https://reviews.llvm.org/D60085 llvm-svn: 357447
* [RISCV] Generate address sequences suitable for mcmodel=mediumAlex Bradbury2019-04-011-1/+4
| | | | | | | | | | | | | | | | | This patch adds an implementation of a PC-relative addressing sequence to be used when -mcmodel=medium is specified. With absolute addressing, a 'medium' codemodel may cause addresses to be out of range. This is because while 'medium' implies a 2 GiB addressing range, this 2 GiB can be at any offset as opposed to 'small', which implies the first 2 GiB only. Note that LLVM/Clang currently specifies code models differently to GCC, where small and medium imply the same functionality as GCC's medlow and medany respectively. Differential Revision: https://reviews.llvm.org/D54143 Patch by Lewis Revill. llvm-svn: 357393
* [DAGCombiner] Rewrite ImproveLifetimeNodeChain to avoid DAG loop.Nirav Dave2019-03-291-8/+9
| | | | | | Avoid EXPENSIVE_CHECK failure. NFCI. llvm-svn: 357309
* [DAG] Avoid redundancy in StoreMerge TokenFactor generation.Nirav Dave2019-03-291-2/+2
| | | | | | | Avoid generating redundant TokenFactor when all merged stores have the same chain. llvm-svn: 357299
* [DAGCombine] Prune unnused nodes.Nirav Dave2019-03-291-15/+48
| | | | | | | | | | | | | | | | | | | Summary: Nodes that have no uses are eventually pruned when they are selected from the worklist. Record nodes newly added to the worklist or DAG and perform pruning after every combine attempt. Reviewers: efriedma, RKSimon, craig.topper, spatel, jyknight Reviewed By: jyknight Subscribers: jdoerfert, jyknight, nemanjai, jvesely, nhaehnle, javed.absar, hiraditya, jsji, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D58070 llvm-svn: 357283
* [CodeGen] Refactor the option for the maximum jump table sizeEvandro Menezes2019-03-291-2/+2
| | | | | | | Refactor the option `max-jump-table-size` to default to the maximum representable number. Essentially, NFC. llvm-svn: 357280
* [DAG] Set up infrastructure to avoid smart constructor-based dangling nodesNirav Dave2019-03-292-0/+16
| | | | | | | | | | | | | | | | | | | | | | | | | Summary: Various SelectionDAG non-combine operations (e.g. the getNode smart constructor and legalization) may leave dangling nodes by applying optimizations without fully pruning unused result values. This results in nodes that are never added to the worklist and therefore can not be pruned. Add a node inserter for the combiner to make sure such nodes have the chance of being pruned. This allows a number of additional peephole optimizations. Reviewers: efriedma, RKSimon, craig.topper, jyknight Reviewed By: jyknight Subscribers: msearles, jyknight, sdardis, nemanjai, javed.absar, hiraditya, jrtc27, atanasyan, jsji, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D58068 llvm-svn: 357279
* [DAGCombiner] simplify shuffle of shuffleSanjay Patel2019-03-291-0/+33
| | | | | | | | | | | | | | | | | | | | | | | | | After investigating the examples from D59777 targeting an SSE4.1 machine, it looks like a very different problem due to how we map illegal types (256-bit in these cases). We're missing a shuffle simplification that maps elements of a vector back to a shuffled operand. We have a more general version of this transform in DAGCombiner::visitVECTOR_SHUFFLE(), but that generality means it is limited to patterns with a one-use constraint, and the examples here have 2 uses. We don't need any uses or legality limitations for a simplification (no new value is created). It looks like we miss this pattern in IR too. In one of the zext examples here, we have shuffle masks like this: Shuf0 = vector_shuffle<0,u,3,7,0,u,3,7> Shuf = vector_shuffle<4,u,6,7,u,u,u,u> ...so that's moving the high half of the 1st vector into the low half. But the high half of the 1st vector is already identical to the low half. Differential Revision: https://reviews.llvm.org/D59961 llvm-svn: 357258
* [DAGCombine] Improve Lifetime node chains.Nirav Dave2019-03-291-0/+30
| | | | | | | | | | | | | | | | Improve both start and end lifetime nodes chain dependencies. Reviewers: courbet Reviewed By: courbet Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D59795 llvm-svn: 357256
* [DAGCombiner] fold sext into decrementSanjay Patel2019-03-291-0/+9
| | | | | | | | | | | | | | | | | | | | | | | This is a sibling to rL357178 that I noticed we'd hit if we chose an alternate transform in D59818. %z = zext i8 %x to i32 %dec = add i32 %z, -1 %r = sext i32 %dec to i64 => %z2 = zext i8 %x to i64 %r = add i64 %z2, -1 https://rise4fun.com/Alive/kPP The x86 vector diffs show a slight regression, so there's a chance that we should limit this and the previous transform to scalars. But given that we allowed vectors before, I'm matching that behavior here. We should change both transforms together if that's the right thing to do. llvm-svn: 357254
* Switch lowering: exploit unreachable fall-through when lowering case range ↵Hans Wennborg2019-03-292-3/+23
| | | | | | | | | | | | | | | | | | | | cluster In the example below, we would previously emit two range checks, one for cases 1--3 and one for 4--6. This patch makes us exploit the fact that the fall-through is unreachable and only one range check is necessary. switch i32 %i, label %default [ i32 1, label %bb1 i32 2, label %bb1 i32 3, label %bb1 i32 4, label %bb2 i32 5, label %bb2 i32 6, label %bb2 ] default: unreachable llvm-svn: 357252
* [ScheduleDAG] Move `Topo` and `addEdge` to base class.Clement Courbet2019-03-293-34/+28
| | | | | | | | | Some DAG mutations can only be applied to `ScheduleDAGMI`, and have to internally cast a `ScheduleDAGInstrs` to `ScheduleDAGMI`. There is nothing actually specific to `ScheduleDAGMI` in `Topo`. llvm-svn: 357239
* [SelectionDAGBuilder] Fix 80 column violation. NFCCraig Topper2019-03-281-1/+2
| | | | llvm-svn: 357213
* [InterleavedAccessPass] Don't increase the number of bytes loaded.Eli Friedman2019-03-281-3/+9
| | | | | | | | | | | | | | | | | Even if the interleaving transform would otherwise be legal, we shouldn't introduce an interleaved load that is wider than the original load: it might have undefined behavior. It might be possible to perform some sort of mask-narrowing transform in some cases (using a narrower interleaved load, then extending the results using shufflevectors). But I haven't tried to implement that, at least for now. Fixes https://bugs.llvm.org/show_bug.cgi?id=41245 . Differential Revision: https://reviews.llvm.org/D59954 llvm-svn: 357212
* [DAG] Fix Lifetime Node ID hashing.Nirav Dave2019-03-281-0/+7
| | | | llvm-svn: 357179
* [DAGCombiner] fold sext into negationSanjay Patel2019-03-281-0/+10
| | | | | | | | | | | | | | As noted in D59818: %z = zext i8 %x to i32 %neg = sub i32 0, %z %r = sext i32 %neg to i64 => %z2 = zext i8 %x to i64 %r = sub i64 0, %z2 https://rise4fun.com/Alive/KzSR llvm-svn: 357178
* [DAGCombiner] Fold truncate(build_vector(x,y)) -> ↵Simon Pilgrim2019-03-281-1/+15
| | | | | | | | | | | | build_vector(truncate(x),truncate(y)) If scalar truncates are free, attempt to pre-truncate build_vectors source operands. Only attempt to do this before legalization as we often end up with truncations/extensions during build_vector lowering. Differential Revision: https://reviews.llvm.org/D59654 llvm-svn: 357161
* [DAGCombiner] Teach TokenFactor pruning to peek through lifetime nodesNirav Dave2019-03-271-0/+2
| | | | | | | | | | | | | | Summary: Lifetime nodes were inhibiting TokenFactor simplification inhibiting chain-based optimizations. Reviewers: courbet, jyknight Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D59897 llvm-svn: 357121
* [LegalizeVectorTypes] Allow single loads and stores for more short vectorsJustin Bogner2019-03-271-1/+6
| | | | | | | | | | | | | | | | | | | When lowering a load or store for TypeWidenVector, the type legalizer would use a single load or store if the associated integer type was legal or promoted. E.g. it loads a v4i8 as an i32 if i32 is legal/promotable. (See https://reviews.llvm.org/rL236528 for reference.) This applies that behaviour to vector types. If the vector type is TypePromoteInteger, the element type is going to be TypePromoteInteger as well, which will lead to have a single promoting load rather than N individual promoting loads. For instance, if we have a v3i1, we would now have a load of v4i1 instead of 3 loads of i1. Patch by Guillaume Marques. Thanks! Differential Revision: https://reviews.llvm.org/D56201 llvm-svn: 357120
* Revert r356996 "[DAG] Avoid smart constructor-based dangling nodes."Nirav Dave2019-03-272-15/+0
| | | | | | | This patch appears to trigger very large compile time increases in halide builds. llvm-svn: 357116
* [CGP] Reset DT when optimizing select instructionsTeresa Johnson2019-03-271-4/+8
| | | | | | | | | | | | | | | | | | | | | | | | | Summary: A recent fix (r355751) caused a compile time regression because setting the ModifiedDT flag in optimizeSelectInst means that each time a select instruction is optimized the function walk in runOnFunction stops and restarts again (which was needed to build a new DT before we started building it lazily in r356937). Now that the DT is built lazily, a simple fix is to just reset the DT at this point, rather than restarting the whole function walk. In the future other places that set ModifiedDT may want to switch to just resetting the DT directly. But that will require an evaluation to ensure that they don't otherwise need to restart the function walk. Reviewers: spatel Subscribers: jdoerfert, llvm-commits, xur Tags: #llvm Differential Revision: https://reviews.llvm.org/D59889 llvm-svn: 357111
* [ConstantRange] Rename isWrappedSet() to isUpperWrapped()Nikita Popov2019-03-271-1/+1
| | | | | | | | | | | | | | Split out from D59749. The current implementation of isWrappedSet() doesn't do what it says on the tin, and treats ranges like [X, Max] as wrapping, because they are represented as [X, 0) when using half-inclusive ranges. This also makes it inconsistent with the semantics of isSignWrappedSet(). This patch renames isWrappedSet() to isUpperWrapped(), in preparation for the introduction of a new isWrappedSet() method with corrected behavior. llvm-svn: 357107
* RegPressure: Fix crash on blocks with only dbg_valueMatt Arsenault2019-03-271-1/+7
| | | | | | | | If there were only dbg_values in the block, recede would hit the beginning of the block and try to use thet dbg_value as a real instruction. llvm-svn: 357105
* [GlobalISel] Fix legalizer artifact combiner from crashing with invalid dead ↵Amara Emerson2019-03-271-1/+2
| | | | | | | | | | | | | | | | | | | | instructions. The artifact combiners push instructions which have been marked for deletion onto an list for the legalizer to deal with on return. However, for trunc(ext) combines the combiner routine recursively calls itself. When it does this the dead instructions list may not be empty, and the other combiners don't expect to be dealing with essentially invalid MIR (multiple vreg defs etc). This change fixes it by ensuring that the dead instructions are processed on entry into tryCombineInstruction. As a result, this fix exposed a few places in tests where G_TRUNC instructions were not being deleted even though they were dead. Differential Revision: https://reviews.llvm.org/D59892 llvm-svn: 357101
* [PeepholeOpt] Don't stop simplifying copies on sequence of subregsQuentin Colombet2019-03-271-6/+1
| | | | | | | | | | | | | | | This patch removes an overly conservative check that would prevent simplifying copies when the value we were tracking would go through several subregister indices. Indeed, the intend of this check was to not track values whenever we have to compose subregister, but actually what the check was doing was bailing anytime we see a second subreg, even if that second subreg would actually be the new source of truth (as opposed to a part of that subreg). Differential Revision: https://reviews.llvm.org/D59891 llvm-svn: 357095
* PEI: Delay checking requiresFrameIndexReplacementScavengingMatt Arsenault2019-03-271-4/+10
| | | | | | | | | | Currently this is called before the frame size is set on the function. For AMDGPU, the scavenger is used for large frames where part of the offset needs to be materialized in a register, so estimating the frame size is useful for knowing whether the scavenger is useful. llvm-svn: 357087
* MIR: Freeze reserved regs after parsing everythingMatt Arsenault2019-03-271-3/+8
| | | | | | | | | | | | The AMDGPU implementation of getReservedRegs depends on MachineFunctionInfo fields that are parsed from the YAML section. This was reserving the wrong register since it was setting the reserved regs before parsing the correct one. Some tests were relying on the default reserved set for the assumed default calling convention. llvm-svn: 357083
* [DAGCombiner] Unify Lifetime and memory Op aliasing.Nirav Dave2019-03-272-79/+120
| | | | | | | | | | | | | | | | | | | Rework BaseIndexOffset and isAlias to fully work with lifetime nodes and fold in lifetime alias analysis. This is mostly NFC. Reviewers: courbet Reviewed By: courbet Subscribers: hiraditya, jdoerfert, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D59794 llvm-svn: 357070
* [DAGCombine] Refactor GatherAllAliases. NFCI.Nirav Dave2019-03-271-65/+66
| | | | llvm-svn: 357069
* Re-commit r355490 "[CodeGen] Omit range checks from jump tables when ↵Hans Wennborg2019-03-272-55/+39
| | | | | | | | | | | | | | | | | | | | | | | | | | | lowering switches with unreachable default" Original commit by Ayonam Ray. This commit adds a regression test for the issue discovered in the previous commit: that the range check for the jump table can only be omitted if the fall-through destination of the jump table is unreachable, which isn't necessarily true just because the default of the switch is unreachable. This addresses the missing optimization in PR41242. > During the lowering of a switch that would result in the generation of a > jump table, a range check is performed before indexing into the jump > table, for the switch value being outside the jump table range and a > conditional branch is inserted to jump to the default block. In case the > default block is unreachable, this conditional jump can be omitted. This > patch implements omitting this conditional branch for unreachable > defaults. > > Differential Revision: https://reviews.llvm.org/D52002 > Reviewers: Hans Wennborg, Eli Freidman, Roman Lebedev llvm-svn: 357067
* [DAGCombiner] Don't allow addcarry if the carry producer is illegal.Jonas Paulsson2019-03-271-0/+4
| | | | | | | | | | | | | | | getAsCarry() checks that the input argument is a carry-producing node before allowing a transformation to addcarry. This patch adds a check to make sure that the carry-producing node is legal. If it is not, it may not remain in a form that is manageable by the target backend. The test case caused a compilation failure during instruction selection for this reason on SystemZ. Patch by Ulrich Weigand. Review: Sanjay Patel https://reviews.llvm.org/D59822 llvm-svn: 357052
* [Remarks] Emit a section containing remark diagnostics metadataFrancis Visoiu Mistrih2019-03-271-0/+45
| | | | | | | | | | | | | | | | A section containing metadata on remark diagnostics will be emitted if the flag (-mllvm) -remarks-section is present. For now, the metadata is: * a magic number for remarks: "REMARKS\0" * the version number: a little-endian uint64_t * the absolute file path to the serialized remark diagnostics: a null-terminated string. Differential Revision: https://reviews.llvm.org/D59571 llvm-svn: 357043
* [LiveRange] Reset the VNIs when splitting subrangesQuentin Colombet2019-03-264-32/+87
| | | | | | | | | | | | | | | When splitting a subrange we end up with two different subranges covering two different, non overlapping, lanes. As part of this splitting the VNIs of the original live-range need to be dispatched to the subranges according to which lanes they are actually defining. Prior to this patch we were assuming that all values were defining all lanes. This was wrong as demonstrated by llvm.org/PR40835. Differential Revision: https://reviews.llvm.org/D59731 llvm-svn: 357032
* [SDAG] add simplifications for FP at node creation timeSanjay Patel2019-03-261-0/+27
| | | | | | | | We have the folds for fadd/fsub/fmul already in DAGCombiner, so it may be possible to remove that code if we can guarantee that these ops are zapped before they can exist. llvm-svn: 357029
* Revert "[llvm] Reapply "Prevent duplicate files in debug line header in ↵Ali Tamur2019-03-263-2/+1
| | | | | | | | | | | | | dwarf 5."" This reverts commit rL357020. The commit broke the test llvm/test/tools/llvm-objdump/embedded-source.test on some builds including clang-ppc64be-linux-multistage, clang-s390x-linux, clang-with-lto-ubuntu, clang-x64-windows-msvc, llvm-clang-lld-x86_64-scei-ps4-windows10pro-fast (and others). llvm-svn: 357026
* [llvm] Reapply "Prevent duplicate files in debug line header in dwarf 5."Ali Tamur2019-03-263-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Reapply rL356941 after regenerating the object file in the failing test llvm/test/tools/llvm-objdump/embedded-source.test from source. Original commit message: [llvm] Prevent duplicate files in debug line header in dwarf 5. Motivation: In previous dwarf versions, file name indexes started from 1, and the primary source file was not explicit. Dwarf 5 standard (6.2.4) prescribes the primary source file to be explicitly given an entry with an index number 0. The current implementation honors the specification by just duplicating the main source file, once with index number 0, and later maybe with another index number. While this is compliant with the letter of the standard, the duplication causes problems for consumers of this information such as lldb. (Some files are duplicated, where only some of them have a line table although all refer to the same file) With this change, dwarf 5 debug line section files always start from 0, and the zeroth entry is not duplicated whenever possible. This requires different handling of dwarf 4 and dwarf 5 during generation (e.g. when a function returns an index zero for a file name, it signals an error in dwarf 4, but not in dwarf 5) However, I think the minor complication is worth it, because it enables all consumers (lldb, gdb, dwarfdump, objdump, and so on) to treat all files in the file name list homogenously. Tags: #llvm, #debug-info Differential Revision: https://reviews.llvm.org/D59515 llvm-svn: 357018
* [DAG] Avoid smart constructor-based dangling nodes.Nirav Dave2019-03-262-0/+15
| | | | | | | | | | | | | | | Various SelectionDAG non-combine operations (e.g. the getNode smart constructor and legalization) may leave dangling nodes by applying optimizations or not fully pruning unused result values. This can result in nodes that are never added to the worklist and therefore can not be pruned. Add a node inserter as the current node deleter to make sure such nodes have the chance of being pruned. Many minor changes, mostly positive. llvm-svn: 356996
* [TargetLowering] Add SimplifyDemandedBits support for ISD::INSERT_VECTOR_ELTSimon Pilgrim2019-03-262-3/+45
| | | | | | | | | | | | This helps us relax the extension of a lot of scalar elements before they are inserted into a vector. Its exposes an issue in DAGCombiner::convertBuildVecZextToZext as some/all the zero-extensions may be relaxed to ANY_EXTEND, so we need to handle that case to avoid a couple of AVX2 VPMOVZX test regressions. Once this is in it should be easier to fix a number of remaining failures to fold loads into VBROADCAST nodes. Differential Revision: https://reviews.llvm.org/D59484 llvm-svn: 356989
* Fix nondeterminism introduced in r353954Yi Kong2019-03-262-2/+3
| | | | | | | | | | DenseMap iteration order is not guaranteed, use MapVector instead. Fix provided by srhines. Differential Revision: https://reviews.llvm.org/D59807 llvm-svn: 356988
* Revert "[llvm] Prevent duplicate files in debug line header in dwarf 5."Ali Tamur2019-03-253-2/+1
| | | | | | | | This reverts commit 312ab05887d0e2caa29aaf843cefe39379a98d36. My commit broke the build; I will revert and find out what happened. llvm-svn: 356951
* [llvm] Prevent duplicate files in debug line header in dwarf 5.Ali Tamur2019-03-253-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Motivation: In previous dwarf versions, file name indexes started from 1, and the primary source file was not explicit. Dwarf 5 standard (6.2.4) prescribes the primary source file to be explicitly given an entry with an index number 0. The current implementation honors the specification by just duplicating the main source file, once with index number 0, and later maybe with another index number. While this is compliant with the letter of the standard, the duplication causes problems for consumers of this information such as lldb. (Some files are duplicated, where only some of them have a line table although all refer to the same file) With this change, dwarf 5 debug line section files always start from 0, and the zeroth entry is not duplicated whenever possible. This requires different handling of dwarf 4 and dwarf 5 during generation (e.g. when a function returns an index zero for a file name, it signals an error in dwarf 4, but not in dwarf 5) However, I think the minor complication is worth it, because it enables all consumers (lldb, gdb, dwarfdump, objdump, and so on) to treat all files in the file name list homogenously. Reviewers: dblaikie, probinson, aprantl, espindola Reviewed By: probinson Subscribers: emaste, jvesely, nhaehnle, aprantl, javed.absar, arichardson, hiraditya, MaskRay, rupprecht, jdoerfert, llvm-commits Tags: #llvm, #debug-info Differential Revision: https://reviews.llvm.org/D59515 llvm-svn: 356941
* [SelectionDAG] Add icmp UNDEF handling to SelectionDAG::FoldSetCCSimon Pilgrim2019-03-251-3/+19
| | | | | | | | | | First half of PR40800, this patch adds DAG undef handling to icmp instructions to match the behaviour in llvm::ConstantFoldCompareInstruction and SimplifyICmpInst, this permits constant folding of vector comparisons where some elements had been reduced to UNDEF (by SimplifyDemandedVectorElts etc.). This involved a lot of tweaking to reduced tests as bugpoint loves to reduce icmp arguments to undef........ Differential Revision: https://reviews.llvm.org/D59363 llvm-svn: 356938
* [CGP] Build the DominatorTree lazilyTeresa Johnson2019-03-251-34/+39
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: In r355512 CGP was changed to build the DominatorTree only once per function traversal, to avoid repeatedly building it each time it was accessed. This solved one compile time issue but introduced another. In the second case, we now were building the DT unnecessarily many times when we performed many function traversals (i.e. more than once per function when running CGP because of changes made each time). Change to saving the DT in the CodeGenPrepare object, and building it lazily when needed. It is reset whenever we need to rebuild it. The case that exposed the issue there are 617 functions, and we walk them (i.e. execute the "while (MadeChange)" loop in runOnFunction) a total of 12083 times (so previously we were building the DT 12083 times). With this patch we only build the DT 844 times (average of 1.37 times per function). We dropped the total time to compile this file from 538.11s without this patch to 339.63s with it. There is still an issue as CGP is taking much longer than all other passes even with this patch, and before a recent compiler release cut at r355392 the total time to this compile was only 97 sec with a huge reduction in CGP time. I suspect that one of the other recent changes to CGP led to iterating each function many more times on average, but I need to do some more investigation. Reviewers: spatel Subscribers: jdoerfert, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D59696 llvm-svn: 356937
OpenPOWER on IntegriCloud