path: root/llvm/lib/Transforms
Commit message (Author, Date; Files, Lines)
* Add the ShadowCallStack attribute (Vlad Tsyrklevich, 2018-04-03; 2 files, -0/+2)
  Summary: Introduce the ShadowCallStack function attribute. It's added to
  functions compiled with -fsanitize=shadow-call-stack in order to mark functions
  to be instrumented by a ShadowCallStack pass, to be submitted in a separate
  change.
  Reviewers: pcc, kcc, kubamracek
  Reviewed By: pcc, kcc
  Subscribers: cryptoad, mehdi_amini, javed.absar, llvm-commits, kcc
  Differential Revision: https://reviews.llvm.org/D44800
  llvm-svn: 329108
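  A minimal IR sketch of what the marking looks like (assuming the IR attribute
  spelling shadowcallstack; the function itself is hypothetical):
    define void @f() shadowcallstack {
    entry:
      ret void
    }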
* [SLP] Fixed formatting, NFC. (Alexey Bataev, 2018-04-03; 1 file, -1/+2)
  llvm-svn: 329091
* [InstCombine] Fold compare of int constant against a splatted vector of ints (Daniel Neilson, 2018-04-03; 2 files, -0/+46)
  Summary: Folding patterns like:
    %vec = shufflevector <4 x i8> %insvec, <4 x i8> undef, <4 x i32> zeroinitializer
    %cast = bitcast <4 x i8> %vec to i32
    %cond = icmp eq i32 %cast, 0
  into:
    %ext = extractelement <4 x i8> %insvec, i32 0
    %cond = icmp eq i8 %ext, 0
  Combined with existing rules, this allows us to fold patterns like:
    %insvec = insertelement <4 x i8> undef, i8 %val, i32 0
    %vec = shufflevector <4 x i8> %insvec, <4 x i8> undef, <4 x i32> zeroinitializer
    %cast = bitcast <4 x i8> %vec to i32
    %cond = icmp eq i32 %cast, 0
  into:
    %cond = icmp eq i8 %val, 0
  This applies when we construct a splat vector via a shuffle and bitcast the
  vector into an integer type for comparison against an integer constant; we can
  then simplify the comparison to compare the splatted value against the constant.
  Reviewers: spatel, anna, mkazantsev
  Reviewed By: spatel
  Subscribers: efriedma, rengolin, llvm-commits
  Differential Revision: https://reviews.llvm.org/D44997
  llvm-svn: 329087
* [SLP] Fix PR36481: vectorize reassociated instructions. (Alexey Bataev, 2018-04-03; 1 file, -92/+228)
  Summary: If the load/extractelement/extractvalue instructions are not originally
  consecutive, the SLP vectorizer is unable to vectorize them. This patch allows
  reordering of such instructions. It does not support reordering of repeated
  instructions; that must be handled in a separate patch.
  Reviewers: RKSimon, spatel, hfinkel, mkuper, Ayal, ashahid
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D43776
  llvm-svn: 329085
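  A small IR sketch of the pattern this enables (names are illustrative): the
  loads are consecutive in memory, but their scalar uses appear out of order, so
  the bundle must be reordered before it can become a single vector load.
    %g1 = getelementptr inbounds i32, i32* %p, i64 1
    %g2 = getelementptr inbounds i32, i32* %p, i64 2
    %g3 = getelementptr inbounds i32, i32* %p, i64 3
    ; uses appear in the order 2, 0, 3, 1 rather than 0, 1, 2, 3:
    %a = load i32, i32* %g2
    %b = load i32, i32* %p
    %c = load i32, i32* %g3
    %d = load i32, i32* %g1
    ; after reordering, SLP can emit one <4 x i32> load from %p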
* Recommit "[SLP] Fix issues with debug output in the SLP vectorizer."Alexey Bataev2018-04-031-3/+4
| | | | | | | | | | | | | The primary issue here is that using NDEBUG alone isn't enough to guard debug printing -- instead the DEBUG() macro needs to be used so that the specific pass debug logging check is employed. Without this, every asserts-enabled build was printing out information when it hit this. I also fixed another place where we had multiple statements in a DEBUG macro to use {}s to be a bit cleaner. And I fixed a place that used errs() rather than dbgs(). llvm-svn: 329082
* Revert "[SLP] Fix PR36481: vectorize reassociated instructions."Benjamin Kramer2018-04-031-230/+95
| | | | | | This reverts commit r328980 and r329046. Makes the vectorizer crash. llvm-svn: 329071
* MSan: introduce the conservative assembly handling mode. (Alexander Potapenko, 2018-04-03; 1 file, -1/+49)
  The default assembly handling mode may introduce false positives in cases where
  MSan doesn't understand that an assembly call initializes the memory pointed to
  by one of its arguments. We introduce the conservative mode, which initializes
  the first |sizeof(type)| bytes for every |type*| pointer passed into the
  assembly statement.
  llvm-svn: 329054
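  A hedged sketch of what the conservative mode implies (the asm string is
  illustrative, and the __msan_unpoison call only shows the effect conceptually;
  the pass emits its own instrumentation):
    ; %p has type i32*, so the conservative mode treats the asm statement
    ; as initializing the first sizeof(i32) == 4 bytes at %p:
    call void asm sideeffect "movl $$42, ($0)", "r,~{memory}"(i32* %p)
    ; conceptually, as if followed by:
    ;   call void @__msan_unpoison(i8* %p.i8, i64 4)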
* [SLP] Fix issues with debug output in the SLP vectorizer. (Chandler Carruth, 2018-04-03; 1 file, -10/+10)
  The primary issue here is that using NDEBUG alone isn't enough to guard debug
  printing; instead the DEBUG() macro needs to be used so that the specific pass
  debug logging check is employed. Without this, every asserts-enabled build was
  printing out information when it hit this.
  I also fixed another place where we had multiple statements in a DEBUG macro to
  use {}s to be a bit cleaner. And I fixed a place that used `errs()` rather than
  `dbgs()`.
  llvm-svn: 329046
* peel loops with runtime small trip counts (Ikhlas Ajbar, 2018-04-03; 2 files, -2/+10)
  For Hexagon, peeling loops with a small runtime trip count is beneficial for our
  benchmarks. We set PeelCount in HexagonTargetInfo.cpp and use the PeelCount set
  by the target for computing the desired peel count.
  Differential Revision: https://reviews.llvm.org/D44880
  llvm-svn: 329042
* [SLP] Distinguish "demanded and shrinkable" from "demanded and not shrinkable" values when determining the minimum bitwidth (Haicheng Wu, 2018-04-03; 1 file, -17/+6)
  We use two approaches for determining the minimum bitwidth:
    * Demanded bits
    * Value tracking
  If demanded bits doesn't result in a narrower type, we then try value tracking.
  We need this if we want to root SLP trees with the indices of getelementptr
  instructions, since all the bits of the indices are demanded. But there is a
  missing piece: we need to be able to distinguish "demanded and shrinkable" from
  "demanded and not shrinkable". For example, the bits of %i in
    %i = sext i32 %e1 to i64
    %gep = getelementptr inbounds i64, i64* %p, i64 %i
  are demanded, but we can shrink %i's type to i32 because it won't change the
  result of the getelementptr. On the other hand, in
    %tmp15 = sext i32 %tmp14 to i64
    %tmp16 = insertvalue { i64, i64 } undef, i64 %tmp15, 0
  it doesn't make sense to shrink %tmp15, and we can skip the value tracking.
  Ideas are from Matthew Simpson!
  Differential Revision: https://reviews.llvm.org/D44868
  llvm-svn: 329035
* [Coroutines] Avoid assert splitting hidden coros (Brian Gesiak, 2018-04-02; 1 file, -2/+2)
  Summary: When attempting to split a coroutine with 'hidden' visibility (for
  example, a C++ coroutine that is inlined when compiled with the option
  '-fvisibility-inlines-hidden'), LLVM would hit an assertion in
  include/llvm/IR/GlobalValue.h:240: "local linkage requires default visibility".
  The issue is that the visibility is copied from the source of the function split
  in the `CloneFunctionInto` function, but the linkage is not. To fix, create the
  new function first with external linkage, then copy the linkage from the
  original function *after* `CloneFunctionInto` is called. Since
  `GlobalValue::setLinkage` in turn calls `maybeSetDsoLocal`, the explicit call to
  `setDSOLocal` can be removed in CoroSplit.cpp.
  Test Plan: check-llvm
  Reviewers: GorNishanov, lewissbaker, EricWF, majnemer, rnk
  Reviewed By: rnk
  Subscribers: llvm-commits, eric_niebler
  Differential Revision: https://reviews.llvm.org/D44185
  llvm-svn: 329033
* [InstCombine] Don't strip function type casts from musttail calls (Reid Kleckner, 2018-04-02; 1 file, -1/+10)
  Summary: The cast simplifications that instcombine does here do not make any
  attempt to obey the verifier rules for musttail calls. Therefore we have to
  disable them.
  Reviewers: efriedma, majnemer, pcc
  Subscribers: hiraditya, llvm-commits
  Differential Revision: https://reviews.llvm.org/D45186
  llvm-svn: 329027
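  A sketch of the pattern that must now be left alone (callee and types are
  hypothetical):
    declare i32 @callee(i64)

    define i32 @caller(i32 %x) {
      ; instcombine would normally strip this function-type cast and rewrite
      ; the arguments, but that would leave a musttail call whose signature
      ; no longer matches the caller, which the verifier rejects:
      %r = musttail call i32 bitcast (i32 (i64)* @callee to i32 (i32)*)(i32 %x)
      ret i32 %r
    }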
* Treat inlining a notail call as a regular, non-tail call (Reid Kleckner, 2018-04-02; 1 file, -0/+6)
  Otherwise, we end up inlining a musttail call into a non-tail position, which
  breaks verifier invariants.
  Fixes PR31014
  llvm-svn: 329015
* [InstCombine] add folds for icmp + sub (PR36969) (Sanjay Patel, 2018-04-02; 1 file, -2/+7)
  (A - B) >u A --> A <u B
  C <u (C - D) --> C <u D
  https://rise4fun.com/Alive/e7j
  Name: ugt
    %sub = sub i8 %x, %y
    %cmp = icmp ugt i8 %sub, %x
  =>
    %cmp = icmp ult i8 %x, %y
  Name: ult
    %sub = sub i8 %x, %y
    %cmp = icmp ult i8 %x, %sub
  =>
    %cmp = icmp ult i8 %x, %y
  This should fix: https://bugs.llvm.org/show_bug.cgi?id=36969
  llvm-svn: 329011
* [DeadArgumentElim] Clone function-level metadata (Rong Xu, 2018-04-02; 1 file, -5/+11)
  Some function-level metadata, such as the function entry count, is not cloned in
  DeadArgumentElim. This happens a lot in LTO/ThinLTO because DeadArgumentElim
  runs after internalization. This patch clones the metadata from the original
  function to the new function.
  Differential Revision: https://reviews.llvm.org/D44127
  llvm-svn: 328991
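  A small IR sketch of the kind of metadata that must survive (the function and
  count are hypothetical):
    define internal i32 @f(i32 %live, i32 %dead) !prof !0 {
      ret i32 %live
    }
    !0 = !{!"function_entry_count", i64 2048}
    ; after DeadArgumentElim drops %dead, the replacement function should
    ; still carry !prof !0 so profile-driven passes see the entry count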
* [coroutines] Add support for llvm.coro.noop intrinsics (Gor Nishanov, 2018-04-02; 2 files, -7/+48)
  Summary: A recent addition to the Coroutines TS (https://wg21.link/p0913) adds a
  pre-defined coroutine noop_coroutine that does nothing. To implement this
  feature, we implemented an llvm.coro.noop intrinsic that returns a coroutine
  handle to a coroutine that does nothing when resumed or destroyed.
  Reviewers: EricWF, modocache, rnk, lewissbaker
  Reviewed By: modocache
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D45114
  llvm-svn: 328986
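  A minimal usage sketch, assuming the intrinsic returns an i8* coroutine handle
  as described above:
    declare i8* @llvm.coro.noop()

    define i8* @noop_handle() {
    entry:
      ; the returned handle resumes and destroys to a no-op
      %hdl = call i8* @llvm.coro.noop()
      ret i8* %hdl
    }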
* [SLP] Fix PR36481: vectorize reassociated instructions. (Alexey Bataev, 2018-04-02; 1 file, -92/+227)
  Summary: If the load/extractelement/extractvalue instructions are not originally
  consecutive, the SLP vectorizer is unable to vectorize them. This patch allows
  reordering of such instructions.
  Reviewers: RKSimon, spatel, hfinkel, mkuper, Ayal, ashahid
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D43776
  llvm-svn: 328980
* [ThinLTO] Add an import cutoff for debugging/triaging (Teresa Johnson, 2018-04-01; 1 file, -0/+13)
  Summary: Adds -import-cutoff=N, which will stop importing during the thin link
  after N imports. The default is -1 (no limit).
  Reviewers: wmi
  Subscribers: inglorion, llvm-commits
  Differential Revision: https://reviews.llvm.org/D45127
  llvm-svn: 328934
* [LoopRotate] Rotate loops with loop exiting latches (David Green, 2018-04-01; 1 file, -2/+24)
  If a loop has a loop-exiting latch, it can be profitable to rotate the loop if
  doing so leads to the simplification of a phi node. Perform rotation in these
  cases even if loop rotate itself didn't simplify the loop to get there.
  Differential Revision: https://reviews.llvm.org/D44199
  llvm-svn: 328933
* Fix a bunch of typos. NFC (Fangrui Song, 2018-03-30; 2 files, -3/+3)
  llvm-svn: 328907
* DataFlowSanitizer: wrappers of functions with local linkage should have the same linkage as the function being wrapped (Peter Collingbourne, 2018-03-30; 1 file, -1/+9)
  This patch resolves link errors when the address of a static function is taken,
  and that function is uninstrumented by DFSan.
  This change resolves bug 36314.
  Patch by Sam Kerner!
  Differential Revision: https://reviews.llvm.org/D44784
  llvm-svn: 328890
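  A sketch of the intent (the wrapper name and shape below are illustrative, not
  exact DFSan output):
    ; static, uninstrumented function whose address is taken:
    define internal i8 @f(i8 %x) {
      ret i8 %x
    }
    ; the generated wrapper previously received external linkage and could
    ; fail to link; it now mirrors the wrappee's internal linkage:
    define internal i8 @"dfsw$f"(i8 %x) {
      %r = call i8 @f(i8 %x)
      ret i8 %r
    }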
* Revert "peel loops with runtime small trip counts"Krzysztof Parzyszek2018-03-301-6/+1
| | | | | | This reverts commit r328854, it breaks some Hexagon tests. llvm-svn: 328875
* peel loops with runtime small trip counts (Ikhlas Ajbar, 2018-03-30; 1 file, -1/+6)
  For Hexagon, peeling loops with a small runtime trip count is beneficial for our
  benchmarks. We set PeelCount in HexagonTargetInfo.cpp and use the PeelCount set
  by the target for computing the desired peel count.
  Differential Revision: https://reviews.llvm.org/D44880
  llvm-svn: 328854
* Fix some layering in StripNonLineTableDebugInfo, moving its declaration from IPO.h to Utils.h to match its implementation (David Blaikie, 2018-03-29; 1 file, -1/+1)
  llvm-svn: 328844
* Remove unused header to fix layering. (David Blaikie, 2018-03-29; 1 file, -1/+0)
  llvm-svn: 328842
* Remove unused headers to fix layering (David Blaikie, 2018-03-29; 1 file, -2/+0)
  llvm-svn: 328840
* llvm-c: Split Utils out of Scalar.h (David Blaikie, 2018-03-29; 1 file, -1/+1)
  To fix layering (so that Scalar.h, a libScalarOpts header, isn't included from
  Utils, which libScalarOpts depends on).
  llvm-svn: 328839
* Add msan custom mapping options. (Evgeniy Stepanov, 2018-03-29; 1 file, -49/+82)
  Similarly to https://reviews.llvm.org/D18865, this adds options to provide a
  custom mapping for msan, as discussed in
  http://lists.llvm.org/pipermail/llvm-dev/2018-February/121339.html
  Patch by vit9696(at)avp.su.
  Differential Revision: https://reviews.llvm.org/D44926
  llvm-svn: 328830
* [NFC][LICM] Rearrange checks to have the cheap bail out first (Philip Reames, 2018-03-29; 1 file, -6/+6)
  llvm-svn: 328822
* [JumpThreading] Don't select an edge that we know we can't thread (Haicheng Wu, 2018-03-29; 1 file, -1/+16)
  In r312664 (D36404), JumpThreading stopped threading edges into loop headers.
  Unfortunately, I observed a significant performance regression as a result of
  this change. Upon further investigation, the problematic pattern looked
  something like this (after many high-level optimizations):
    while (true) {
      bool cond = ...;
      if (!cond) {
        <body>
      }
      if (cond)
        break;
    }
  Now, naturally we want jump threading to essentially eliminate the second if
  check and hook up the edges appropriately. However, the above-mentioned change
  prevented it from doing this because it would have to thread an edge into the
  loop header.
  What is happening is that since both branches are threadable, JumpThreading
  picks one of them arbitrarily. In my case, because of the way that the IR ended
  up, it tended to pick the one to the loop header, bailing out immediately
  after. However, if it had picked the one to the exit block, everything would
  have worked out fine (because the only remaining branch would then be folded,
  not threaded, which is acceptable).
  Thus, to fix this problem, we can simply eliminate loop headers from
  consideration as possible threading targets earlier, to make sure that if there
  are multiple eligible branches, we can still thread one of the ones that don't
  target a loop header.
  Patch by Keno Fischer!
  Differential Revision: https://reviews.llvm.org/D42260
  llvm-svn: 328798
* [LoopRotate] Restructuring LoopRotation.cpp to create Loop Rotation Pass with Loop Rotation Utility Interface (David Green, 2018-03-29; 3 files, -581/+619)
  The existing LoopRotation.cpp is implemented as one of the loop passes instead
  of being a utility. The user cannot easily perform loop rotation selectively
  (or on demand) under different optimization levels. For example, loop rotation
  is needed as part of the logic to convert a loop into a loop with a bottom test
  for a transformation. If the loop rotation is simply added as a loop pass before
  the transformation, the pass is skipped if the code is compiled at -O0 or if it
  is explicitly disabled by the user, causing the compiler to generate incorrect
  code. Furthermore, as a loop pass it will rotate all loops instead of just the
  relevant ones.
  We provide a utility interface for loop rotation so that it can be called on
  demand. The changeset is as follows:
  - Create a new file lib/Transforms/Utils/LoopRotationUtils.cpp and move the
    main implementation of class LoopRotate into this file.
  - Create a new file llvm/include/Transform/Utils/LoopRotationUtils.h with the
    interface LoopRotation(...).
  - Change the original LoopRotation.cpp to use the utility function LoopRotation
    in LoopRotationUtils.cpp.
  This is done in the same way the community did for the mem-to-reg
  implementation.
  Patch by Jin Lin!
  Differential Revision: https://reviews.llvm.org/D44595
  llvm-svn: 328766
* [Transforms] Make sure to include the C binding header when defining C binding functions (Benjamin Kramer, 2018-03-29; 1 file, -0/+1)
  Otherwise the definitions can't see the extern C declarations and get
  name-mangled, making it impossible for users to call them. This breaks the Go
  bindings.
  llvm-svn: 328765
* Plumb useAA through TargetTransformInfo to remove Transforms->CodeGen header dependency (David Blaikie, 2018-03-28; 1 file, -13/+9)
  Thanks to echristo for the pointers on direction.
  llvm-svn: 328737
* Oops - moved slightly too many things from Scalar to Utils. Move LoopSimplifyCFG things back (David Blaikie, 2018-03-28; 2 files, -4/+4)
  llvm-svn: 328720
* Transforms: Introduce Transforms/Utils.h rather than spreading the declarations amongst Scalar.h and IPO.h (David Blaikie, 2018-03-28; 30 files, -29/+43)
  Fixes layering: Transforms/Utils shouldn't depend on including a Scalar or IPO
  header, because Scalar and IPO depend on Utils.
  llvm-svn: 328717
* [MSan] Introduce ActualFnStart. NFC (Alexander Potapenko, 2018-03-28; 1 file, -8/+10)
  This is a step towards the upcoming KMSAN implementation patch. KMSAN is going
  to prepend a special basic block containing tool-specific calls to each
  function. Because we still want to instrument the original entry block, we'll
  need to store it in ActualFnStart. For MSan this will still be
  F.getEntryBlock(), whereas for KMSAN it'll contain the second BB.
  llvm-svn: 328697
* [MSan] Add an isStore argument to getShadowOriginPtr(). NFC (Alexander Potapenko, 2018-03-28; 1 file, -38/+47)
  This is a step towards the upcoming KMSAN implementation patch. The isStore
  argument is to be used by getShadowOriginPtrKernel(); it is ignored by
  getShadowOriginPtrUserspace().
  Depending on whether a memory access is a load or a store, KMSAN instruments it
  with different functions, __msan_metadata_ptr_for_load_X() and
  __msan_metadata_ptr_for_store_X(). Those functions may return different values
  for a single address, which is necessary when the runtime library decides to
  ignore particular accesses.
  llvm-svn: 328692
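  An illustrative sketch of the load/store distinction (the return type and
  exact callback signatures are assumptions based on the names above):
    ; a 4-byte load and a 4-byte store of the same address go through
    ; different callbacks, which may return different metadata pointers:
    %ld = call { i8*, i32* } @__msan_metadata_ptr_for_load_4(i8* %addr)
    %st = call { i8*, i32* } @__msan_metadata_ptr_for_store_4(i8* %addr)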
* 80-line wrap. NFC (Xin Tong, 2018-03-27; 1 file, -1/+2)
  llvm-svn: 328660
* [PGO] Fix branch probability remarks assert (Rong Xu, 2018-03-27; 1 file, -7/+9)
  Fixed a counter/weight overflow that leads to an assertion. Also fixed the help
  string for the pgo-emit-branch-prob option.
  Differential Revision: https://reviews.llvm.org/D44809
  llvm-svn: 328653
* [LV] Add TTI::shouldMaximizeVectorBandwidth to allow enabling it per target (Krzysztof Parzyszek, 2018-03-27; 1 file, -1/+2)
  The default implementation returns false and keeps the current behavior.
  Differential Revision: https://reviews.llvm.org/D44735
  llvm-svn: 328632
* [LoopUnroll][NFC] Remove redundant canPeel check (Max Kazantsev, 2018-03-27; 1 file, -2/+2)
  We check `canPeel` twice: when evaluating the number of iterations to be peeled
  and within the method `peelLoop` that performs peeling. `peelLoop` is only
  executed if the calculated peel count is positive, so the check in `peelLoop`
  can never fail. This patch replaces that check with an assert.
  Differential Revision: https://reviews.llvm.org/D44919
  Reviewed By: fhahn
  llvm-svn: 328615
* [IRCE] Enable decreasing loops of non-const bound (Sam Parker, 2018-03-27; 1 file, -52/+74)
  As a follow-up to r328480, this updates the logic for the decreasing safety
  checks in a similar manner:
  - CanBeMax is replaced by CannotBeMaxInLoop, which queries
    isLoopEntryGuardedByCond on the maximum value.
  - SumCanReachMin is replaced by isSafeDecreasingBound, which includes some
    logic from parseLoopStructure and, again, has been updated to use
    isLoopEntryGuardedByCond on the given bounds.
  Differential Revision: https://reviews.llvm.org/D44776
  llvm-svn: 328613
* [InstCombine] improve code comment; NFC (Sanjay Patel, 2018-03-26; 1 file, -2/+2)
  llvm-svn: 328560
* [InstCombine] reassociate loop invariant GEP chains to enable LICM (Sebastian Pop, 2018-03-26; 1 file, -0/+17)
  This change brings the performance of zlib up by 10%. The example below is from
  a hot loop in longest_match() from zlib.
    do.body:
      %cur_match.addr.0 = phi i32 [ %cur_match, %entry ], [ %2, %do.cond ]
      %idx.ext = zext i32 %cur_match.addr.0 to i64
      %add.ptr = getelementptr inbounds i8, i8* %win, i64 %idx.ext
      %add.ptr2 = getelementptr inbounds i8, i8* %add.ptr, i64 %idx.ext1
      %add.ptr3 = getelementptr inbounds i8, i8* %add.ptr2, i64 -1
  In this example %idx.ext1 is loop invariant. It will be moved above the use of
  the loop induction variable %idx.ext such that it can be hoisted out of the
  loop by LICM. The operands that have dependences carried by the loop will be
  sunk down the GEP chain. This patch will produce the following output:
    do.body:
      %cur_match.addr.0 = phi i32 [ %cur_match, %entry ], [ %2, %do.cond ]
      %idx.ext = zext i32 %cur_match.addr.0 to i64
      %add.ptr = getelementptr inbounds i8, i8* %win, i64 %idx.ext1
      %add.ptr2 = getelementptr inbounds i8, i8* %add.ptr, i64 -1
      %add.ptr3 = getelementptr inbounds i8, i8* %add.ptr2, i64 %idx.ext
  llvm-svn: 328539
* [InstCombine] distribute fmul over fadd/fsub (Sanjay Patel, 2018-03-26; 2 files, -100/+15)
  This replaces a large chunk of code that was looking for compound patterns that
  include these sub-patterns. Existing tests ensure that all of the previous
  examples are still folded as expected.
  We still need to loosen the FMF check.
  llvm-svn: 328502
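  One plausible shape of such a fold, sketched with constants (the exact FMF
  requirements are those checked by the commit, not shown here):
    ; (%x + 3.0) * 5.0 with reassociation-permitting fast-math flags:
    %add = fadd fast float %x, 3.0
    %mul = fmul fast float %add, 5.0
    ; distributes to %x * 5.0 + (3.0 * 5.0), and the constant product folds:
    ;   %m = fmul fast float %x, 5.0
    ;   %r = fadd fast float %m, 1.5e+01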
* [InstCombine] check uses before creating instructions for fmul distribution (Sanjay Patel, 2018-03-26; 1 file, -1/+1)
  As the tests show, we could create extra instructions without any obvious
  benefit.
  llvm-svn: 328498
* [LSR] Allow giving priority to post-incrementing addressing modes (Krzysztof Parzyszek, 2018-03-26; 1 file, -9/+66)
  Implement a TTI interface for targets to indicate that LSR should give priority
  to post-incrementing addressing modes.
  This is a combination of patches by Sebastian Pop and Brendon Cahoon.
  Differential Revision: https://reviews.llvm.org/D44758
  llvm-svn: 328490
* [LoopUnroll] Fix dangling pointers in SCEV (Max Kazantsev, 2018-03-26; 1 file, -28/+18)
  The current logic of loop SCEV invalidation in the Loop Unroller implicitly
  relies on the fact that the exit count of outer loops cannot depend on exiting
  blocks of inner loops. That is true in the current implementation of backedge
  taken count calculation but is wrong in general. As a result, when we only
  forget the loop that we have just unrolled, we may still have cached data for
  its outer loops (in particular, exit counts) which keeps references to blocks
  of the inner loop that could have been changed or even deleted.
  The attached test demonstrates a situation where, after unrolling of the
  innermost loop, the outermost loop contains a dangling pointer to a
  non-existent block. The problem shows up when we apply the patch
  https://reviews.llvm.org/D44677, which makes SCEV smarter about exit count
  calculation. I am not sure if the bug exists without that patch; it appears to
  be accidentally correct today just because, in practice, the exact backedge
  taken count for outer loops with complex control flow inside is never
  calculated. But when SCEV learns to do so, this problem shows up.
  This patch replaces the existing logic of SCEV loop invalidation with a correct
  one: invalidation of the outermost loop (which also leads to invalidation of
  all loops inside of it). It is the only way to ensure that no outer loop keeps
  dangling pointers to removed blocks, or outdated information that has changed
  after unrolling.
  Differential Revision: https://reviews.llvm.org/D44818
  Reviewed By: samparker
  llvm-svn: 328483
* [DeadArgElim] Strip allocsize attributes when deleting an argument. (Benjamin Kramer, 2018-03-26; 1 file, -2/+6)
  Since allocsize refers to an argument number, it gets invalidated when an
  argument is removed and the numbers shift.
  llvm-svn: 328481
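  A short IR illustration (the declaration is hypothetical): allocsize(1) marks
  the second parameter as the allocation size, indexed by argument number.
    declare i8* @my_alloc(i32 %unused, i64 %size) allocsize(1)
    ; if DeadArgElim deletes %unused, index 1 would point past the new
    ; position of %size, so the attribute is stripped rather than kept stale:
    ;   declare i8* @my_alloc(i64 %size)    ; no allocsize attribute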
* [IRCE] Enable increasing loops of variable bounds (Sam Parker, 2018-03-26; 1 file, -58/+78)
  CanBeMin is currently used, which will report true for any unknown value, but
  often a check is performed outside the loop which covers this situation:
    for (int i = 0; i < N; ++i)
      ...
  versus:
    if (N > 0)
      for (int i = 0; i < N; ++i)
        ...
  So I've added 'LoopGuardedAgainstMin', which reports whether N is greater than
  the minimum value and thus allows loops with a variable trip count to be
  optimised. I've also moved the increasing-bound checking into its own function
  and replaced SumCanReachMax with another isLoopEntryGuardedByCond check.
  llvm-svn: 328480