path: root/llvm/lib/Transforms
Commit message | Author | Age | Files | Lines
* Fix a bunch of typos. NFC | Fangrui Song | 2018-03-30 | 2 | -3/+3
| | | | llvm-svn: 328907
* DataFlowSanitizer: wrappers of functions with local linkage should have the same linkage as the function being wrapped | Peter Collingbourne | 2018-03-30 | 1 | -1/+9
| This patch resolves link errors when the address of a static function is taken, and that function is uninstrumented by DFSan.
| This change resolves bug 36314.
| Patch by Sam Kerner!
| Differential Revision: https://reviews.llvm.org/D44784
| llvm-svn: 328890
* Revert "peel loops with runtime small trip counts" | Krzysztof Parzyszek | 2018-03-30 | 1 | -6/+1
| | | | | | This reverts commit r328854, it breaks some Hexagon tests. llvm-svn: 328875
* peel loops with runtime small trip counts | Ikhlas Ajbar | 2018-03-30 | 1 | -1/+6
| | | | | | | | | | For Hexagon, peeling loops with small runtime trip count is beneficial for our benchmarks. We set PeelCount in HexagonTargetInfo.cpp and we use PeelCount set by the target for computing the desired peel count. Differential Revision: https://reviews.llvm.org/D44880 llvm-svn: 328854
* Fix some layering in StripNonLineTableDebugInfo, moving its declaration from IPO.h to Utils.h to match its implementation | David Blaikie | 2018-03-29 | 1 | -1/+1
| llvm-svn: 328844
* Remove unused header to fix layering. | David Blaikie | 2018-03-29 | 1 | -1/+0
| | | | llvm-svn: 328842
* Remove unused headers to fix layering | David Blaikie | 2018-03-29 | 1 | -2/+0
| | | | llvm-svn: 328840
* llvm-c: Split Utils out of Scalar.h | David Blaikie | 2018-03-29 | 1 | -1/+1
| | | | | | | To fix layering (so that Scalar.h, a libScalarOpts header, isn't included from Utils - which libScalarOpts depends on). llvm-svn: 328839
* Add msan custom mapping options. | Evgeniy Stepanov | 2018-03-29 | 1 | -49/+82
| | | | | | | | | | | Similarly to https://reviews.llvm.org/D18865 this adds options to provide custom mapping for msan. As discussed in http://lists.llvm.org/pipermail/llvm-dev/2018-February/121339.html Patch by vit9696(at)avp.su. Differential Revision: https://reviews.llvm.org/D44926 llvm-svn: 328830
* [NFC][LICM] Rearrange checks to have the cheap bail out first | Philip Reames | 2018-03-29 | 1 | -6/+6
| | | | llvm-svn: 328822
* [JumpThreading] Don't select an edge that we know we can't thread | Haicheng Wu | 2018-03-29 | 1 | -1/+16
| In r312664 (D36404), JumpThreading stopped threading edges into loop headers. Unfortunately, I observed a significant performance regression as a result of this change. Upon further investigation, the problematic pattern looked something like this (after many high level optimizations):
|
|     while (true) {
|       bool cond = ...;
|       if (!cond) {
|         <body>
|       }
|       if (cond)
|         break;
|     }
|
| Now, naturally we want jump threading to essentially eliminate the second if check and hook up the edges appropriately. However, the above-mentioned change prevented it from doing this because it would have to thread an edge into the loop header.
| Upon further investigation, what is happening is that since both branches are threadable, JumpThreading picks one of them arbitrarily. In my case, because of the way that the IR ended up, it tended to pick the one to the loop header, bailing out immediately after. However, if it had picked the one to the exit block, everything would have worked out fine (because the only remaining branch would then be folded, not threaded, which is acceptable).
| Thus, to fix this problem, we can simply eliminate loop headers from consideration as possible threading targets earlier, to make sure that if there are multiple eligible branches, we can still thread one of the ones that don't target a loop header.
| Patch by Keno Fischer!
| Differential Revision: https://reviews.llvm.org/D42260
| llvm-svn: 328798
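The effect described in this commit message can be illustrated at the source level. The sketch below is a hand-written model, not output of the JumpThreading pass: `condAt`, `bodyRunsOriginal`, and `bodyRunsThreaded` are hypothetical names, and the input vector stands in for whatever computes `cond`. It shows that threading the `cond == true` edge straight to the exit block leaves a single test per iteration while preserving behavior.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical input model: conds[i] is the value `cond` takes on iteration i.
// Reading past the end of the input yields true so that the loop terminates.
static bool condAt(const std::vector<bool> &conds, std::size_t i) {
  return i < conds.size() ? conds[i] : true;
}

// Shape from the commit message: `cond` is tested twice per iteration.
int bodyRunsOriginal(const std::vector<bool> &conds) {
  int bodyRuns = 0;
  for (std::size_t i = 0;; ++i) {
    bool cond = condAt(conds, i);
    if (!cond)
      ++bodyRuns; // stands in for <body>
    if (cond)
      break;
  }
  return bodyRuns;
}

// Hand-threaded equivalent: the edge for cond == true goes directly to the
// exit, so the second test disappears and the back edge stays unthreaded.
int bodyRunsThreaded(const std::vector<bool> &conds) {
  int bodyRuns = 0;
  for (std::size_t i = 0;; ++i) {
    if (condAt(conds, i))
      break;    // cond == true: jump straight to the exit block
    ++bodyRuns; // cond == false: run <body>, then take the back edge
  }
  return bodyRuns;
}
```

Both functions count the same number of body executions for any input, which is the property the threaded form has to preserve.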
* [LoopRotate] Restructuring LoopRotation.cpp to create Loop Rotation Pass with Loop Rotation Utility Interface | David Green | 2018-03-29 | 3 | -581/+619
| The existing LoopRotation.cpp is implemented as one of the loop passes instead of being a utility. The user cannot easily perform the loop rotation selectively (or on demand) under different optimization levels. For example, the loop rotation is needed as part of the logic to convert a loop into a loop with a bottom test for a transformation. If the loop rotation is simply added as a loop pass before the transformation, the pass is skipped if it is compiled at -O0 or if it is explicitly disabled by the user, causing the compiler to generate incorrect code. Furthermore, as a loop pass it will rotate all loops instead of just the relevant loops.
| We provide a utility interface for the loop rotation so that the loop rotation can be called on demand. The changeset is as follows:
|   - Create a new file lib/Transforms/Utils/LoopRotationUtils.cpp and move the main implementation of class LoopRotate into this file.
|   - Create a new file llvm/include/Transform/Utils/LoopRotationUtils.h with the interface LoopRotation(...).
|   - Original LoopRotation.cpp is changed to use the utility function LoopRotation in LoopRotationUtils.cpp.
| This is done in the same way the community did for the mem-to-reg implementation.
| Patch by Jin Lin!
| Differential Revision: https://reviews.llvm.org/D44595
| llvm-svn: 328766
* [Transforms] Make sure to include the c binding header when defining c binding functions | Benjamin Kramer | 2018-03-29 | 1 | -0/+1
| Otherwise the definitions can't see the extern C declarations and get name mangled, making it impossible for users to call them. This breaks the Go bindings.
| llvm-svn: 328765
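The linkage rule behind this fix can be shown with a minimal sketch. The function `demo_add_pass` is hypothetical and only stands in for a C binding entry point. In C++, a definition that follows a visible `extern "C"` declaration of the same function inherits C language linkage, so it is emitted under its unmangled name. Without that declaration in scope, the same definition would get a mangled C++ name that C or Go callers cannot find.

```cpp
// Stand-in for the contents of a C binding header (hypothetical API).
extern "C" int demo_add_pass(int a, int b);

// The declaration above is visible here, so this definition inherits C
// language linkage even though it does not repeat the linkage specification.
// If the declaration were missing, this would define a separate, mangled
// C++ function instead.
int demo_add_pass(int a, int b) { return a + b; }
```

Including the header in the implementation file also lets the compiler diagnose a signature mismatch between the declaration and the definition.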
* Plumb useAA through TargetTransformInfo to remove Transforms->CodeGen header dependency | David Blaikie | 2018-03-28 | 1 | -13/+9
| Thanks to echristo for the pointers on direction.
| llvm-svn: 328737
* Oops - moved slightly too many things from Scalar to Utils. Move LoopSimplifyCFG things back | David Blaikie | 2018-03-28 | 2 | -4/+4
| llvm-svn: 328720
* Transforms: Introduce Transforms/Utils.h rather than spreading the declarations amongst Scalar.h and IPO.h | David Blaikie | 2018-03-28 | 30 | -29/+43
| Fixes layering - Transforms/Utils shouldn't depend on including a Scalar or IPO header, because Scalar and IPO depend on Utils.
| llvm-svn: 328717
* [MSan] Introduce ActualFnStart. NFC | Alexander Potapenko | 2018-03-28 | 1 | -8/+10
| | | | | | | | | | | | | This is a step towards the upcoming KMSAN implementation patch. KMSAN is going to prepend a special basic block containing tool-specific calls to each function. Because we still want to instrument the original entry block, we'll need to store it in ActualFnStart. For MSan this will still be F.getEntryBlock(), whereas for KMSAN it'll contain the second BB. llvm-svn: 328697
* [MSan] Add an isStore argument to getShadowOriginPtr(). NFC | Alexander Potapenko | 2018-03-28 | 1 | -38/+47
| | | | | | | | | | | | | | | | This is a step towards the upcoming KMSAN implementation patch. The isStore argument is to be used by getShadowOriginPtrKernel(), it is ignored by getShadowOriginPtrUserspace(). Depending on whether a memory access is a load or a store, KMSAN instruments it with different functions, __msan_metadata_ptr_for_load_X() and __msan_metadata_ptr_for_store_X(). Those functions may return different values for a single address, which is necessary in the case the runtime library decides to ignore particular accesses. llvm-svn: 328692
* 80-line wrap. NFC | Xin Tong | 2018-03-27 | 1 | -1/+2
| | | | llvm-svn: 328660
* [PGO] Fix branch probability remarks assert | Rong Xu | 2018-03-27 | 1 | -7/+9
| | | | | | | | | Fixed counter/weight overflow that leads to an assertion. Also fixed the help string for pgo-emit-branch-prob option. Differential Revision: https://reviews.llvm.org/D44809 llvm-svn: 328653
* [LV] Add TTI::shouldMaximizeVectorBandwidth to allow enabling it per target | Krzysztof Parzyszek | 2018-03-27 | 1 | -1/+2
| | | | | | | | The default implementation returns false and keeps the current behavior. Differential Revision: https://reviews.llvm.org/D44735 llvm-svn: 328632
* [LoopUnroll][NFC] Remove redundant canPeel check | Max Kazantsev | 2018-03-27 | 1 | -2/+2
| | | | | | | | | | | | We check `canPeel` twice: when evaluating the number of iterations to be peeled and within the method `peelLoop` that performs peeling. This method is only executed if the calculated peel count is positive. Thus, the check in `peelLoop` can never fail. This patch replaces this check with an assert. Differential Revision: https://reviews.llvm.org/D44919 Reviewed By: fhahn llvm-svn: 328615
* [IRCE] Enable decreasing loops of non-const bound | Sam Parker | 2018-03-27 | 1 | -52/+74
| As a follow-up to r328480, this updates the logic for the decreasing safety checks in a similar manner:
|   - CanBeMax is replaced by CannotBeMaxInLoop which queries isLoopEntryGuardedByCond on the maximum value.
|   - SumCanReachMin is replaced by isSafeDecreasingBound which includes some logic from parseLoopStructure and, again, has been updated to use isLoopEntryGuardedByCond on the given bounds.
| Differential Revision: https://reviews.llvm.org/D44776
| llvm-svn: 328613
* [InstCombine] improve code comment; NFC | Sanjay Patel | 2018-03-26 | 1 | -2/+2
| | | | llvm-svn: 328560
* [InstCombine] reassociate loop invariant GEP chains to enable LICM | Sebastian Pop | 2018-03-26 | 1 | -0/+17
| This change brings performance of zlib up by 10%. The example below is from a hot loop in longest_match() from zlib.
|
|     do.body:
|       %cur_match.addr.0 = phi i32 [ %cur_match, %entry ], [ %2, %do.cond ]
|       %idx.ext = zext i32 %cur_match.addr.0 to i64
|       %add.ptr = getelementptr inbounds i8, i8* %win, i64 %idx.ext
|       %add.ptr2 = getelementptr inbounds i8, i8* %add.ptr, i64 %idx.ext1
|       %add.ptr3 = getelementptr inbounds i8, i8* %add.ptr2, i64 -1
|
| In this example %idx.ext1 is a loop invariant. It will be moved above the use of the loop induction variable %idx.ext such that it can be hoisted out of the loop by LICM. The operands that have dependences carried by the loop will be sunk down in the GEP chain. This patch will produce the following output:
|
|     do.body:
|       %cur_match.addr.0 = phi i32 [ %cur_match, %entry ], [ %2, %do.cond ]
|       %idx.ext = zext i32 %cur_match.addr.0 to i64
|       %add.ptr = getelementptr inbounds i8, i8* %win, i64 %idx.ext1
|       %add.ptr2 = getelementptr inbounds i8, i8* %add.ptr, i64 -1
|       %add.ptr3 = getelementptr inbounds i8, i8* %add.ptr2, i64 %idx.ext
|
| llvm-svn: 328539
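The reordering in the IR above can be mirrored in plain pointer arithmetic. This is only an illustrative sketch with hypothetical function names: `idx` plays the role of the loop-varying `%idx.ext`, and `inv` plays the role of the loop-invariant `%idx.ext1`. Both orderings compute the same address, but in the second one the first two steps depend only on loop-invariant values and could be computed once outside the loop.

```cpp
#include <cstdint>

// Original chain: the loop-varying offset is applied first, so every
// intermediate pointer changes on each iteration.
const std::uint8_t *addrOriginal(const std::uint8_t *win, std::int64_t idx,
                                 std::int64_t inv) {
  const std::uint8_t *p1 = win + idx; // loop-varying
  const std::uint8_t *p2 = p1 + inv;  // loop-varying
  return p2 - 1;                      // loop-varying
}

// Reassociated chain: invariant offsets come first, the loop-varying offset
// is sunk to the end of the chain.
const std::uint8_t *addrReassociated(const std::uint8_t *win, std::int64_t idx,
                                     std::int64_t inv) {
  const std::uint8_t *p1 = win + inv; // loop-invariant, hoistable
  const std::uint8_t *p2 = p1 - 1;    // loop-invariant, hoistable
  return p2 + idx;                    // only this step stays in the loop
}
```

The sketch assumes every intermediate pointer stays inside the same buffer, which is what the `inbounds` keyword asserts in the IR.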
* [InstCombine] distribute fmul over fadd/fsub | Sanjay Patel | 2018-03-26 | 2 | -100/+15
| | | | | | | | | | This replaces a large chunk of code that was looking for compound patterns that include these sub-patterns. Existing tests ensure that all of the previous examples are still folded as expected. We still need to loosen the FMF check. llvm-svn: 328502
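As a rough illustration of what distributing a multiply over an add means, the sketch below assumes the fold has the shape `(x + c1) * c2` to `x * c2 + c1 * c2`, where `c1 * c2` can be folded to a single constant at compile time. The exact patterns matched by the patch are not spelled out in the commit message, so treat the shape and the function names as assumptions. In floating point the two forms are not equal for all inputs, which is why such a rewrite is gated on fast-math flags.

```cpp
// Form before the fold: one add, one multiply.
float mulOfSum(float x, float c1, float c2) { return (x + c1) * c2; }

// Form after distributing: when c1 and c2 are constants, c1 * c2 folds to a
// single constant, leaving one multiply and one add on x.
float distributed(float x, float c1, float c2) { return x * c2 + c1 * c2; }
```

For exactly representable inputs such as 1.5, 2.0, and 4.0 both forms give 14.0, but rounding can make them differ for other values.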
* [InstCombine] check uses before creating instructions for fmul distribution | Sanjay Patel | 2018-03-26 | 1 | -1/+1
| | | | | | As the tests show, we could create extra instructions without any obvious benefit. llvm-svn: 328498
* [LSR] Allow giving priority to post-incrementing addressing modes | Krzysztof Parzyszek | 2018-03-26 | 1 | -9/+66
| | | | | | | | | | | Implement TTI interface for targets to indicate that the LSR should give priority to post-incrementing addressing modes. Combination of patches by Sebastian Pop and Brendon Cahoon. Differential Revision: https://reviews.llvm.org/D44758 llvm-svn: 328490
* [LoopUnroll] Fix dangling pointers in SCEV | Max Kazantsev | 2018-03-26 | 1 | -28/+18
| The current logic of loop SCEV invalidation in the Loop Unroller implicitly relies on the fact that the exit count of outer loops cannot rely on exiting blocks of inner loops, which is true in the current implementation of backedge taken count calculation but is wrong in general. As a result, when we only forget the loop that we have just unrolled, we may still have cached data for its outer loops (in particular, exit counts) which keeps references to blocks of the inner loop that could have been changed or even deleted.
| The attached test demonstrates a situation in which, after unrolling the innermost loop, the outermost loop contains a dangling pointer to a non-existent block. The problem shows up when we apply patch https://reviews.llvm.org/D44677 that makes SCEV smarter about exit count calculation. I am not sure if the bug exists without this patch; it appears that now it is accidentally correct just because in practice the exact backedge taken count for outer loops with complex control flow inside is never calculated. But when SCEV learns to do so, this problem shows up.
| This patch replaces the existing logic of SCEV loop invalidation with a correct one, which happens to be invalidation of the outermost loop (which also leads to invalidation of all loops inside of it). It is the only way to ensure that no outer loop keeps dangling pointers to removed blocks, or just outdated information that has changed after unrolling.
| Differential Revision: https://reviews.llvm.org/D44818
| Reviewed By: samparker
| llvm-svn: 328483
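The invalidation strategy described above can be sketched with a toy loop tree. Everything here is hypothetical: `DemoLoop` and the helper functions are illustration-only types and do not reflect the real ScalarEvolution interface. The point is that after unrolling an inner loop, the code walks up to the outermost enclosing loop and drops cached information for the whole nest, so no ancestor keeps stale references.

```cpp
#include <vector>

// Toy loop-nest node; hasCachedInfo models cached exit counts and similar data.
struct DemoLoop {
  DemoLoop *parent = nullptr;
  std::vector<DemoLoop *> subLoops;
  bool hasCachedInfo = true;
};

// Walk parent links until the top-level loop of the nest is reached.
DemoLoop *outermostLoop(DemoLoop *loop) {
  while (loop->parent)
    loop = loop->parent;
  return loop;
}

// Drop cached data for a loop and, recursively, for every loop nested in it.
void forgetLoopNest(DemoLoop *loop) {
  loop->hasCachedInfo = false;
  for (DemoLoop *sub : loop->subLoops)
    forgetLoopNest(sub);
}

// After unrolling `unrolled`, invalidate the entire nest that contains it
// rather than only the unrolled loop.
void forgetAfterUnrolling(DemoLoop *unrolled) {
  forgetLoopNest(outermostLoop(unrolled));
}
```

Invalidating the whole nest is more conservative than forgetting a single loop, trading some recomputation for the guarantee that no cached entry refers to a deleted block.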
* [DeadArgElim] Strip allocsize attributes when deleting an argument. | Benjamin Kramer | 2018-03-26 | 1 | -2/+6
| | | | | | | Since allocsize refers to the argument number it gets invalidated when an argument is removed and the numbers shift. llvm-svn: 328481
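A small model makes the invalidation problem concrete. The struct and function below are hypothetical and not LLVM's attribute API: a function records, by position, which parameter carries the allocation size. Deleting any parameter can shift the positions, so the stored index may point at the wrong parameter or past the end; the sketch follows the commit and simply strips the attribute.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Toy function signature with a position-based "allocsize"-style attribute.
struct DemoSignature {
  std::vector<std::string> params;
  bool hasAllocSize;          // whether the attribute is present
  std::size_t allocSizeParam; // index into params, valid only if hasAllocSize
};

// Remove one parameter. Positions after `index` shift down by one, so a
// stored parameter number can no longer be trusted; drop the attribute
// instead of leaving a stale index behind.
void removeParam(DemoSignature &sig, std::size_t index) {
  sig.params.erase(sig.params.begin() + static_cast<std::ptrdiff_t>(index));
  sig.hasAllocSize = false;
}
```

Remapping the index would preserve more information; stripping is the simpler choice that the commit message describes.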
* [IRCE] Enable increasing loops of variable bounds | Sam Parker | 2018-03-26 | 1 | -58/+78
| CanBeMin is currently used, which will report true for any unknown values, but often a check is performed outside the loop which covers this situation:
|
|     for (int i = 0; i < N; ++i)
|       ...
|
|     if (N > 0)
|       for (int i = 0; i < N; ++i)
|         ...
|
| So I've added 'LoopGuardedAgainstMin', which reports whether N is greater than the minimum value, which then allows loops with a variable loop count to be optimised.
| I've also moved the increasing bound checking into its own function and replaced SumCanReachMax with another isLoopEntryGuardedByCond function.
| llvm-svn: 328480
* [PatternMatch] allow undef elements when matching vector FP +0.0 | Sanjay Patel | 2018-03-25 | 5 | -11/+11
| | | | | | | | | | | | | This continues the FP constant pattern matching improvements from: https://reviews.llvm.org/rL327627 https://reviews.llvm.org/rL327339 https://reviews.llvm.org/rL327307 Several integer constant matchers also have this ability. I'm separating matching of integer/pointer null from FP positive zero and renaming/commenting to make the functionality clearer. llvm-svn: 328461
* [InstCombine] peek through more icmp of FP cast + bitcast | Sanjay Patel | 2018-03-25 | 1 | -4/+14
| | | | | | This is an extension of rL328426 as noted in D44367. llvm-svn: 328448
* [InstCombine] peek through FP casts for sign-bit compares (PR36682) | Sanjay Patel | 2018-03-24 | 1 | -0/+9
| | | | | | | | | | | | This pattern came up in PR36682: https://bugs.llvm.org/show_bug.cgi?id=36682 https://godbolt.org/g/LhuD9A Equality checks are planned as a follow-up enhancement. Differential Revision: https://reviews.llvm.org/D44367 llvm-svn: 328426
* [InstCombine] fix formatting; NFC | Sanjay Patel | 2018-03-24 | 1 | -37/+30
| | | | llvm-svn: 328425
* Remove unused header from EntryExitInstrumenter | David Blaikie | 2018-03-24 | 1 | -1/+0
| | | | | | | Fixes layering, since Transforms/Utils doesn't depend on CodeGen, so shouldn't include headers from it. llvm-svn: 328399
* [GuardWidening] Group code by class [NFC] | Philip Reames | 2018-03-23 | 1 | -39/+40
| | | | llvm-svn: 328387
* Fix Layering, move instrumentation transform headers into Instrumentation subdirectory | David Blaikie | 2018-03-23 | 5 | -5/+5
| llvm-svn: 328379
* [PM][FunctionAttrs] add NoUnwind attribute inference to PostOrderFunctionAttrs pass | Fedor Sergeev | 2018-03-23 | 1 | -32/+201
| Summary:
| This was motivated by the absence of PruneEH functionality in the new PM. It was decided that a proper way to do PruneEH is to add NoUnwind inference into PostOrderFunctionAttrs and then perform normal SimplifyCFG on top.
| This change generalizes the attribute handling implemented for (a removal of) the Convergent attribute by introducing a generic builder-like class, AttributeInferer.
| It registers all the attribute inference requests, storing per-attribute predicates into a vector, and then goes through an SCC Node, scanning all the instructions for not breaking attribute assumptions.
| The main idea is that as soon as all the instructions from all the functions of the SCC Node conform to the attribute assumptions, we are free to infer the attribute as set for all the functions of the SCC Node.
| It handles two distinct cases of attributes:
|   - those that might break due to derefinement of the function code: for these attributes we are allowed to apply inference only if all the functions are "exact definitions". Example - NoUnwind.
|   - those that do not care about derefinement: for these attributes we are allowed to apply inference as soon as we see any function definition. Example - removal of Convergent attribute.
| Also in this commit:
|   * Converted all the FunctionAttrs tests to use FileCheck and added new-PM invocations to them
|   * FunctionAttrs/convergent.ll test demonstrates a difference in behavior between new and old PM implementations. Marked with FIXME.
|   * PruneEH tests were converted to new-PM as well, using function-attrs+simplify-cfg combo as intended
|   * some of "other" tests were updated since function-attrs now infers 'nounwind' even for old PM pipeline
|   * -disable-nounwind-inference hidden option added as a possible workaround for a supposedly rare case when nounwind being inferred by default presents a problem
| Reviewers: chandlerc, jlebar
| Reviewed By: jlebar
| Subscribers: eraman, llvm-commits
| Differential Revision: https://reviews.llvm.org/D44415
| llvm-svn: 328377
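The scanning scheme described in the summary can be sketched as follows. This is a simplified model under stated assumptions, not the actual AttributeInferer class: `DemoInstr`, `DemoBody`, `DemoRequest`, and `inferForSCC` are hypothetical names, and the distinction between exact definitions and other definitions is left out. Each request carries a predicate saying whether an instruction breaks the attribute's assumption; a request that survives the scan of every instruction in every function of the SCC can be inferred for all of them.

```cpp
#include <algorithm>
#include <functional>
#include <string>
#include <vector>

// Toy instruction: only the two properties this sketch cares about.
struct DemoInstr {
  bool mayThrow;
  bool isConvergent;
};
using DemoBody = std::vector<DemoInstr>; // one function's instructions

// One registered inference request with its per-instruction predicate.
struct DemoRequest {
  std::string attrName;
  std::function<bool(const DemoInstr &)> breaksAssumption;
};

// Scan every instruction of every function in the SCC, discarding requests
// as soon as an instruction violates their assumption. Whatever remains can
// be inferred for all functions in the SCC.
std::vector<std::string> inferForSCC(const std::vector<DemoBody> &scc,
                                     std::vector<DemoRequest> requests) {
  for (const DemoBody &body : scc)
    for (const DemoInstr &instr : body)
      requests.erase(
          std::remove_if(requests.begin(), requests.end(),
                         [&](const DemoRequest &r) {
                           return r.breaksAssumption(instr);
                         }),
          requests.end());

  std::vector<std::string> inferred;
  for (const DemoRequest &r : requests)
    inferred.push_back(r.attrName);
  return inferred;
}
```

Keeping the predicates in one vector lets a single pass over the instructions serve all attributes, which appears to be the motivation for the builder-like design described in the summary.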
* [InstCombine] simplify code for FP intrinsic shrinking; NFCI | Sanjay Patel | 2018-03-23 | 1 | -10/+5
| | | | llvm-svn: 328372
* [HWASan] Port HWASan to Linux x86-64 (LLVM) | Alex Shlyapnikov | 2018-03-23 | 1 | -13/+58
| | | | | | | | | | | | | | | | | | | | | Summary: Porting HWASan to Linux x86-64, first of the three patches, LLVM part. The approach is similar to ARM case, trap signal is used to communicate memory tag check failure. int3 instruction is used to generate a signal, access parameters are stored in nop [eax + offset] instruction immediately following the int3 one. One notable difference is that x86-64 has to untag the pointer before use due to the lack of feature comparable to ARM's TBI (Top Byte Ignore). Reviewers: eugenis Subscribers: kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D44699 llvm-svn: 328342
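The untagging step mentioned above can be modeled on plain 64-bit integers. This sketch assumes the tag lives in the top byte of the pointer, as with ARM's TBI; the constants and function names are hypothetical, and the masking the real instrumentation emits on x86-64 may differ. With TBI the hardware ignores the top byte on access, whereas without it the tag has to be cleared in software before the address is dereferenced.

```cpp
#include <cstdint>

// Assumed layout: tag in bits 56..63, address in the low 56 bits.
constexpr unsigned kTagShift = 56;
constexpr std::uint64_t kAddressMask = (std::uint64_t{1} << kTagShift) - 1;

// Place a tag in the top byte of an address.
std::uint64_t tagPointer(std::uint64_t addr, std::uint8_t tag) {
  return (addr & kAddressMask) | (std::uint64_t{tag} << kTagShift);
}

// Read the tag back out of a tagged pointer.
std::uint8_t pointerTag(std::uint64_t tagged) {
  return static_cast<std::uint8_t>(tagged >> kTagShift);
}

// Clear the tag so the value is usable as a plain address again.
std::uint64_t untagPointer(std::uint64_t tagged) {
  return tagged & kAddressMask;
}
```

A check would compare `pointerTag` against the tag stored for the target memory and only then access `untagPointer` of the value.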
* Fix a block copying problem in LICM | Andrew Kaylor | 2018-03-23 | 1 | -2/+8
| | | | | | Differential Revision: https://reviews.llvm.org/D44817 llvm-svn: 328336
* [InstCombine] reduce code duplication; NFC | Sanjay Patel | 2018-03-23 | 1 | -56/+49
| | | | llvm-svn: 328323
* [InstCombine] improve variable name; NFC | Sanjay Patel | 2018-03-23 | 1 | -12/+10
| | | | llvm-svn: 328322
* [SLP] Stop counting cost of gather sequences with multiple uses | Matthew Simpson | 2018-03-23 | 1 | -1/+22
| | | | | | | | | | | | | | | When building the SLP tree, we look for reuse among the vectorized tree entries. However, each gather sequence is represented by a unique tree entry, even though the sequence may be identical to another one. This means, for example, that a gather sequence with two uses will be counted twice when computing the cost of the tree. We should only count the cost of the definition of a gather sequence rather than its uses. During code generation, the redundant gather sequences are emitted, but we optimize them away with CSE. So it looks like this problem just affects the cost model. Differential Revision: https://reviews.llvm.org/D44742 llvm-svn: 328316
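The double counting described above can be shown with a toy cost model. The types and functions are hypothetical and unrelated to the SLP vectorizer's real data structures: each tree entry that needs a gather is represented by a string naming its scalar sequence plus a cost. Summing per entry charges an identical sequence once per use, while tracking the sequences already seen charges it once per definition, matching what CSE leaves behind after code generation.

```cpp
#include <set>
#include <string>
#include <vector>

// One gather tree entry: which scalars it gathers and what that costs.
struct DemoGatherUse {
  std::string sequence;
  int cost;
};

// Old behavior in this model: every entry is charged, duplicates included.
int costPerUse(const std::vector<DemoGatherUse> &uses) {
  int total = 0;
  for (const DemoGatherUse &use : uses)
    total += use.cost;
  return total;
}

// New behavior in this model: an identical sequence is charged only the
// first time it is seen.
int costPerDefinition(const std::vector<DemoGatherUse> &uses) {
  std::set<std::string> seen;
  int total = 0;
  for (const DemoGatherUse &use : uses)
    if (seen.insert(use.sequence).second)
      total += use.cost;
  return total;
}
```

With the sequence "a,b" used twice at cost 4 and "c,d" used once at cost 3, the per-use total is 11 and the per-definition total is 7.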
* Revert r328307: [IPSCCP] Use constant range information for comparisons of parameters. | Florian Hahn | 2018-03-23 | 1 | -17/+50
| Reverted for now, due to it causing verifier failures.
| llvm-svn: 328312
* [IPSCCP] Use constant range information for comparisons of parameters. | Florian Hahn | 2018-03-23 | 1 | -50/+17
| | | | | | | | | | | | | | | | | | For comparisons with parameters, we can use the ParamState lattice elements which also provide constant range information. This improves the code for PR33253 further and gets us closer to use ValueLatticeElement for all values. Also, as we are using the range information in the solver directly, we do not need tryToReplaceWithConstantRange afterwards anymore. Reviewers: dberlin, mssimpso, davide, efriedma Reviewed By: mssimpso Differential Revision: https://reviews.llvm.org/D43762 llvm-svn: 328307
* [LoopUnroll] Simplify induction variables after peeling too. | Florian Hahn | 2018-03-23 | 1 | -2/+3
| | | | | | | | | | | | | Loop peeling also has an impact on the induction variables, so we should benefit from induction variable simplification after peeling too. Reviewers: sanjoy, bogner, mzolotukhin, efriedma Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D43878 llvm-svn: 328301
* Move SampleProfile.h into IPO along with the rest of the IPO pass headers | David Blaikie | 2018-03-22 | 1 | -1/+1
| | | | llvm-svn: 328262
* Finish moving the IPSCCP pass from Scalar to IPO - moving the registration | David Blaikie | 2018-03-22 | 2 | -1/+1
| | | | llvm-svn: 328259