path: root/llvm/lib/Transforms
* LowerTypeTests: Split the pass in two: a resolution phase and a lowering phase. — Peter Collingbourne, 2017-01-06 (1 file, -110/+155)
  This change separates how type identifiers are resolved from how intrinsic calls are lowered. All information required to lower an intrinsic call is stored in a new TypeIdLowering data structure. The idea is that this data structure can either be initialized using the module itself during regular LTO, or using the module summary in ThinLTO backends.
  Differential Revision: https://reviews.llvm.org/D28341
  llvm-svn: 291205
* Fix typo. NFC — Xin Tong, 2017-01-05 (1 file, -1/+1)
  llvm-svn: 291178
* ThinLTO: add early "dead-stripping" on the Index — Teresa Johnson, 2017-01-05 (1 file, -5/+98)
  Summary: Using the linker-supplied list of "preserved" symbols, we can compute the list of "dead" symbols, i.e. the ones that are not reachable from a "preserved" symbol transitively on the reference graph. Right now we are using this information to mark these functions as non-eligible for import. The impact is twofold:
  - Reduction of compile time: we don't import these functions anywhere, nor the functions these symbols are calling.
  - The limited number of imports/exports leads to better internalization.
  Patch originally by Mehdi Amini.
  Reviewers: mehdi_amini, pcc
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D23488
  llvm-svn: 291177
* [LICM] Allow promotion of some stores that are not guaranteed to execute. — Michael Kuperstein, 2017-01-05 (1 file, -4/+24)
  Promotion is always legal when a store within the loop is guaranteed to execute. However, this is not a necessary condition - for promotion to be memory model semantics-preserving, it is enough to have a store that dominates every exit block. This is because if the store dominates every exit block, the fact that an exit block was executed implies the original store was executed as well.
  Differential Revision: https://reviews.llvm.org/D28147
  llvm-svn: 291171
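  For illustration (example constructed for this note, not from the commit): a call that may throw before the store means the store is not guaranteed to execute, yet it dominates the loop's only exit, so promotion remains sound:

    @glob = global i32 0
    declare void @may_throw()

    define void @f(i32 %n, i32 %v) {
    entry:
      br label %for.body
    for.body:
      %i = phi i32 [ 0, %entry ], [ %i.next, %for.body ]
      call void @may_throw()      ; store below is not guaranteed to execute...
      store i32 %v, i32* @glob    ; ...but it dominates the only exit block
      %i.next = add i32 %i, 1
      %cmp = icmp slt i32 %i.next, %n
      br i1 %cmp, label %for.body, label %exit
    exit:                         ; reachable only after the store has executed
      ret void
    }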
* [LICM] Small update to note changes made in hoistRegion — Andrew Kaylor, 2017-01-05 (1 file, -0/+1)
  Differential Revision: https://reviews.llvm.org/D28363
  llvm-svn: 291157
* Remove an unnecessary hasLoopInvariantOperands check in loop sink. — Xin Tong, 2017-01-05 (1 file, -2/+1)
  Summary: A preheader instruction's operands will always be invariant w.r.t. the loop for which it is the preheader. Memory aliases are handled in canSinkOrHoistInst.
  Reviewers: danielcdh, davidxl
  Subscribers: mzolotukhin, llvm-commits
  Differential Revision: https://reviews.llvm.org/D28270
  llvm-svn: 291132
* [ThinLTO] Add parenthesis as per build warning — Teresa Johnson, 2017-01-05 (1 file, -3/+2)
  Fixes a warning about "||" and "&&" due to r291108.
  llvm-svn: 291119
* [ThinLTO] Subsume all importing checks into a single flag — Teresa Johnson, 2017-01-05 (2 files, -79/+20)
  Summary: This adds a new summary flag NotEligibleToImport that subsumes several existing flags (NoRename, HasInlineAsmMaybeReferencingInternal and IsNotViableToInline). It also subsumes the checking of references on the summary that was being done during the thin link by eligibleForImport() for each candidate. It is much more efficient to do that checking once during the per-module summary build and record it in the summary.
  Reviewers: mehdi_amini
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D28169
  llvm-svn: 291108
* Currently isLikelyComplexAddressComputation tries to figure out if the given stride seems to be 'complex' and needs some extra cost for address computation handling. — Mohammed Agabaria, 2017-01-05 (1 file, -42/+17)
  This code seems to be target dependent and may not be the same for all targets. Passed the decision whether the given stride is complex or not to the target by sending stride information via SCEV to getAddressComputationCost instead of 'IsComplex'. Specifically, on X86 targets we don't see any significant address computation cost for strided accesses in general.
  Differential Revision: https://reviews.llvm.org/D27518
  llvm-svn: 291106
* IR: Module summary representation for type identifiers; summary test scaffolding for lowertypetests. — Peter Collingbourne, 2017-01-05 (1 file, -0/+51)
  Set up basic YAML I/O support for module summaries, plumb the summary into the pass and add a few command line flags to test YAML I/O support. Bitcode support to come separately, as will the code in LowerTypeTests that actually uses the summary.
  Also add a couple of tests that pass by virtue of the pass doing nothing with the summary (which happens to be the correct thing to do for those tests).
  Differential Revision: https://reviews.llvm.org/D28041
  llvm-svn: 291069
* [DWARF] Null out the debug locs of load instructions that have been moved by GVN performing partial redundancy elimination (PRE). — Wolfgang Pieb, 2017-01-04 (1 file, -2/+12)
  Not doing so can cause jumpy line tables and confusing (though correct) source attributions.
  Differential Revision: https://reviews.llvm.org/D27857
  llvm-svn: 291037
* Use lazy-loading of Metadata in MetadataLoader when importing is enabled (NFC) — Mehdi Amini, 2017-01-04 (1 file, -1/+4)
  Summary: This is a relatively simple scheme: we use the index emitted in the bitcode to avoid loading all the global metadata. Instead we load the index with their position in the bitcode so that we can load each of them individually. Materializing the global metadata block in this condition only triggers loading the named metadata, and the ones referenced from there (transitively). When materializing a function, metadata from the global block are loaded lazily as they are referenced.
  Two main current limitations are:
  1) Global values other than functions are not materialized on demand, so we need to eagerly load METADATA_GLOBAL_DECL_ATTACHMENT records (and their transitive dependencies).
  2) When we load a single metadata, we don't recurse on the operands; instead we use a placeholder or a temporary metadata. Unfortunately temporary nodes are very expensive. This is why we don't have it always enabled and only for importing.
  These two limitations can be lifted in a subsequent improvement if needed. With this change, the total link time of opt with ThinLTO and Debug Info enabled goes down from 282s to 224s (~20%).
  Reviewers: pcc, tejohnson, dexonsmith
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D28113
  llvm-svn: 291027
* InstCombine: Fold cos(-x) -> cos(x) — Matt Arsenault, 2017-01-04 (1 file, -0/+14)
  Also cos(fabs(x)) -> cos(x)
  llvm-svn: 291022
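  In IR terms (illustrative example, not taken from the commit), the fold drops the sign flip feeding the intrinsic, since cosine is an even function:

    %neg = fsub float -0.000000e+00, %x          ; fneg
    %c   = call float @llvm.cos.f32(float %neg)
    ; =>
    %c2  = call float @llvm.cos.f32(float %x)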
* NewGVN: Track the maximum number of iterations GVN takes on any function, so we can pinpoint performance issues. — Daniel Berlin, 2017-01-04 (1 file, -1/+4)
  llvm-svn: 291002
* Reapply "[SimplifyCFG] In sinkLastInstruction correctly set debugloc of common inst" — Robert Lougher, 2017-01-04 (1 file, -1/+9)
  This reapplies r289828 (reverted in r289833 as it broke the address sanitizer). The debugloc is now only set when the instruction is not a call, as this causes the verifier to assert (the inliner requires an inlinable callsite to have a debug loc if the caller and callee have debug info).
  Original commit message: Simplify CFG will try to sink the last instruction in a series of basic blocks, creating a "common" instruction in the successor block (sinkLastInstruction). When it does this, the debug location of the single instruction should be the merged debug locations of the commoned instructions.
  Original review: https://reviews.llvm.org/D27590
  llvm-svn: 290973
* [InstCombine] Move casts around shift operations — David Majnemer, 2017-01-04 (1 file, -0/+19)
  It is possible to perform a left shift before zero extending if the shift would only shift out zeros.
  llvm-svn: 290928
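  A sketch of the idea (illustrative; the exact patterns handled may differ): when the high bits of the narrow value are known zero, the shift can be done in the narrow type before the zero extension:

    %m = and i8 %x, 63             ; top two bits of %m are known zero
    %z = zext i8 %m to i32
    %s = shl i32 %z, 2
    ; => the narrow shift only shifts out zeros:
    %t  = shl i8 %m, 2
    %s2 = zext i8 %t to i32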
* [InstCombine] Combine adds across a zext — David Majnemer, 2017-01-04 (1 file, -0/+12)
  We can perform the following:
    (add (zext (add nuw X, C1)), C2) -> (zext (add nuw X, C1+C2))
  This is only possible if C2 is negative and C2 is greater than or equal to negative C1.
  llvm-svn: 290927
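  For instance (values chosen for illustration), with C1 = 5 and C2 = -3 the conditions hold (-3 is negative and -3 >= -5), so the outer add can be folded into the narrow one:

    %a = add nuw i8 %x, 5
    %z = zext i8 %a to i32
    %r = add i32 %z, -3
    ; =>
    %a2 = add nuw i8 %x, 2         ; C1+C2 = 2; nuw still holds
    %r2 = zext i8 %a2 to i32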
* InstCombine: Fold fabs on select of constants — Matt Arsenault, 2017-01-03 (1 file, -0/+12)
  llvm-svn: 290913
* [InstCombine] use 'match' to reduce code bloat; NFCI — Sanjay Patel, 2017-01-03 (1 file, -15/+11)
  I wrote this patch before seeing the comment in https://reviews.llvm.org/D27114 that suggests we should actually be canonicalizing the other way. So just in case we decide this is the right way, we might as well have a cleaner implementation.
  llvm-svn: 290912
* InstCombine: Add fma with constant transforms — Matt Arsenault, 2017-01-03 (1 file, -3/+17)
  DAGCombine already does these.
  llvm-svn: 290860
* InstCombine: Add fma + fabs/fneg transforms — Matt Arsenault, 2017-01-03 (1 file, -0/+30)
  fma (fneg x), (fneg y), z -> fma x, y, z
  fma (fabs x), (fabs x), z -> fma x, x, z
  llvm-svn: 290859
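  The fneg case in IR form (illustrative): the two sign flips cancel in the product, so both negations can be dropped:

    %nx = fsub float -0.000000e+00, %x
    %ny = fsub float -0.000000e+00, %y
    %r  = call float @llvm.fma.f32(float %nx, float %ny, float %z)
    ; => (-x) * (-y) + z == x * y + z
    %r2 = call float @llvm.fma.f32(float %x, float %y, float %z)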
* [EarlyCSE] less else, more auto; NFC — Sanjay Patel, 2017-01-03 (1 file, -2/+2)
  llvm-svn: 290848
* [InstCombine] use combineMetadataForCSE instead of copying it; NFCI — Sanjay Patel, 2017-01-02 (1 file, -14/+4)
  llvm-svn: 290844
* Make sure total loop body weight is preserved in loop peeling — Xin Tong, 2017-01-02 (1 file, -8/+17)
  Summary: Regardless of how the loop body weight is distributed, we should preserve the total loop body weight, i.e. we should have the same weight reaching the body of the loop or its duplicates in the peeled and unpeeled cases.
  Reviewers: mkuper, davidxl, anemet
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D28179
  llvm-svn: 290833
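  Formally (a restatement of the summary, not a formula from the patch): if the original loop body has profile weight W and peeling produces k peeled copies with weights w_i plus a residual loop with weight w_loop, the distribution must satisfy

    \sum_{i=1}^{k} w_i + w_{\text{loop}} = W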
* NewGVN: Clean up after removing possibility of null expressions. — Daniel Berlin, 2017-01-02 (1 file, -17/+14)
  llvm-svn: 290828
* fix typo; NFC — Sanjay Patel, 2017-01-02 (1 file, -1/+1)
  llvm-svn: 290827
* [NewGVN] Fold single-use variable inside the assertion. — Davide Italiano, 2017-01-02 (1 file, -5/+3)
  It placates some bots which complain because they compile the assertion out and think the variable is unused.
  llvm-svn: 290825
* [NewGVN] Restore old code to placate buildbots. — Davide Italiano, 2017-01-02 (1 file, -2/+6)
  Apparently my suggestion of using a ternary doesn't really work, as clang complains about incompatible types on the LHS and RHS. Some GCC versions happen to accept the code, but clang's behaviour is correct here.
  llvm-svn: 290822
* NewGVN: Fix some formatting and comment issues — Daniel Berlin, 2017-01-02 (1 file, -18/+8)
  llvm-svn: 290820
* NewGVN: Add UnknownExpression and create them for things we can't symbolize. Kill fragile machinery for handling null expressions. — Daniel Berlin, 2017-01-02 (1 file, -19/+19)
  Summary: This avoids the very fragile code for null expressions. We could also use a denseset that tracks which things have null expressions instead, but that seems pretty fragile and like premature optimization. This resolves a number of infinite loop cases; test reductions coming.
  Reviewers: davide
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D28193
  llvm-svn: 290816
* NewGVN: Fix PR31480, PR31483, PR31499, by rewriting how memory congruence handling works. — Daniel Berlin, 2017-01-02 (1 file, -20/+144)
  Summary: Previously, we tried to fix up the equivalences during symbolic evaluation. This does not work. Now, we change the equivalences during congruence finding, where it belongs. We also initialize the equivalence table to give a maximal answer.
  Reviewers: davide
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D28192
  llvm-svn: 290815
* [PMBuilder] Remove RunFloat2Int cl::opt. — Davide Italiano, 2017-01-02 (1 file, -6/+1)
  The pass has been on by default for a long time without problems.
  llvm-svn: 290814
* [Inliner] remove unnecessary null checks from AddAlignmentAssumptions(); NFCI — Sanjay Patel, 2016-12-31 (1 file, -5/+3)
  We bail out on the first line if the assumption cache is not set, so there's no need to check it after that.
  llvm-svn: 290787
* [InstCombine][AVX-512] Teach InstCombine that llvm.x86.avx512.vcomi.sd and llvm.x86.avx512.vcomi.ss don't use the upper elements of their input. — Craig Topper, 2016-12-31 (1 file, -0/+2)
  This was already done for the SSE/SSE2 version of the intrinsics.
  llvm-svn: 290776
* [InstCombine][AVX-512] When turning intrinsics with masking into native IR, don't emit a select if the mask is known to be all ones. — Craig Topper, 2016-12-30 (1 file, -9/+20)
  This saves InstCombine the burden of having to optimize the select later.
  llvm-svn: 290774
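  Schematically (illustrative types and operation; the real patterns are the AVX-512 masked intrinsics), lowering a masked operation to native IR yields a select over the passthru value, which is dead weight when the mask is all ones:

    %op  = fadd <16 x float> %a, %b
    %res = select <16 x i1> %mask, <16 x float> %op, <16 x float> %passthru
    ; => when %mask is known to be all ones:
    %res2 = fadd <16 x float> %a, %b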
* Add a comment for a todo in LoopUnroll post cleanup — Philip Reames, 2016-12-30 (1 file, -0/+5)
  llvm-svn: 290769
* [CVP] Adjust iteration order to reduce the amount of work required — Philip Reames, 2016-12-30 (1 file, -3/+8)
  CVP doesn't care about the order of blocks visited, but by using a pre-order traversal over the graph we can a) not visit unreachable blocks and b) optimize as we go, so that analysis of later blocks produces slightly more precise results. I noticed this via inspection and don't have a concrete example which points to the issue.
  llvm-svn: 290760
* [NewGVN] Remove unneeded newline from assertion message. — Davide Italiano, 2016-12-30 (1 file, -1/+1)
  llvm-svn: 290755
* [InstCombine] Address post-commit feedback — David Majnemer, 2016-12-30 (2 files, -2/+4)
  llvm-svn: 290741
* [LICM] When promoting scalars, allow inserting stores to thread-local allocas. — Michael Kuperstein, 2016-12-30 (1 file, -1/+2)
  This is similar to the allocfn case - if an alloca is not captured, then it's necessarily thread-local.
  Differential Revision: https://reviews.llvm.org/D28170
  llvm-svn: 290738
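  A sketch of why this is safe (constructed for illustration, not from the patch): the address of a non-captured alloca never escapes the function, so no other thread can race with a store that promotion inserts on the exit path:

    %local = alloca i32             ; address never escapes => thread-local
    ...
    exit:
      ; promotion may materialize the promoted value here; no data race is
      ; possible because no other thread can hold a pointer to %local
      store i32 %val.promoted, i32* %local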
* Use continuous boosting factor for complete unroll. — Dehao Chen, 2016-12-30 (1 file, -75/+32)
  Summary: The current loop complete unroll algorithm checks if unrolling completely will reduce the runtime by a certain percentage. If yes, it will apply a fixed boosting factor to the threshold (by discounting cost). The problem with this approach is that the threshold changes abruptly. This patch makes the boosting factor a function of the runtime reduction percentage, capped by a fixed threshold. In this way, the threshold changes continuously. The patch also simplifies the code by reducing one parameter in UP.
  The patch only affects code-gen of two speccpu2006 benchmarks:
  - 445.gobmk binary size decreases 0.08%, no performance change.
  - 464.h264ref binary size increases 0.24%, no performance change.
  Reviewers: mzolotukhin, chandlerc
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D26989
  llvm-svn: 290737
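  One plausible shape for such a capped, continuous boost (names and exact form are illustrative; see D26989 for the real formula) is

    \text{boost} = \min\left( \frac{100 \cdot \text{RolledDynamicCost}}{\text{UnrolledCost}},\ \text{MaxPercentThresholdBoost} \right)

  i.e. the larger the estimated runtime reduction from complete unrolling, the larger the threshold discount, up to a fixed cap.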
* [LICM] Remove unneeded tracking of whether changes were made. NFC. — Michael Kuperstein, 2016-12-30 (1 file, -9/+7)
  "Changed" doesn't actually change within the loop, so there's no reason to keep track of it - we always return false during analysis and true after the transformation is made.
  llvm-svn: 290735
* [LICM] Make logic in promoteLoopAccessesToScalars easier to follow. NFC. — Michael Kuperstein, 2016-12-30 (1 file, -40/+47)
  llvm-svn: 290734
* [InstCombine] More thoroughly canonicalize the position of zexts — David Majnemer, 2016-12-30 (2 files, -9/+120)
  We correctly canonicalized (add (sext x), (sext y)) to (sext (add x, y)) where possible. However, we didn't perform the same canonicalization for zexts or for muls.
  llvm-svn: 290733
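  The zext/add case in IR form (illustrative): narrowing the add beneath a single zext is only valid when the narrow add provably cannot wrap:

    %zx = zext i8 %x to i32
    %zy = zext i8 %y to i32
    %r  = add i32 %zx, %zy
    ; => valid only when the i8 add is known not to overflow:
    %a  = add nuw i8 %x, %y
    %r2 = zext i8 %a to i32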
* [LICM] Compute exit blocks for promotion eagerly. NFC. — Michael Kuperstein, 2016-12-29 (1 file, -35/+36)
  This moves the exit block and insertion point computation to be eager, instead of after seeing the first scalar we can promote. The cost is relatively small (the computation happens anyway, see discussion on D28147), and the code is easier to follow, and can bail out earlier if there's a catchswitch present.
  llvm-svn: 290729
* [LICM] Don't try to promote in loops where we have no chance to promote. NFC. — Michael Kuperstein, 2016-12-29 (1 file, -10/+6)
  We would check whether we have a preheader *or* dedicated exit blocks, and go into the promotion loop. Then, for each alias set we'd check if we have a preheader *and* dedicated exit blocks, and bail if not. Instead, bail immediately if we don't have both.
  llvm-svn: 290728
* [LICM] Only recompute LCSSA when we actually promoted something. — Michael Kuperstein, 2016-12-29 (1 file, -3/+6)
  We want to recompute LCSSA only when we actually promoted a value. This means we only need to look at changes made by promotion when deciding whether to recompute it or not, not at regular sinking/hoisting. (This was what the code was documented as doing, just not what it did.)
  Hopefully NFC.
  llvm-svn: 290726
* NewGVN: Fix PR 31491 by ensuring that we touch the right instructions. Change to one-based numbering so we can assert we don't cause the same bug again. — Daniel Berlin, 2016-12-29 (1 file, -11/+21)
  llvm-svn: 290724
* [InstCombine] Use getVectorNumElements instead of explicitly casting to VectorType and calling getNumElements. NFC — Craig Topper, 2016-12-29 (1 file, -8/+7)
  llvm-svn: 290707
* [InstCombine] Fix typo in comment. NFC — Craig Topper, 2016-12-29 (1 file, -1/+1)
  llvm-svn: 290706