path: root/llvm/lib/Transforms
Commit message | Author | Age | Files | Lines
* [LV] fold-tail predication should be respected even with assume_safety | Dorit Nuzman | 2019-08-15 | 2 | -5/+5
  assume_safety implies that loads under "if's" can be safely executed speculatively (unguarded, unmasked). However, this assumption holds only for the original user "if's", not those introduced by the compiler, such as the fold-tail "if" that guards us from loading beyond the original loop trip-count. Currently the combination of fold-tail and assume-safety pragmas results in ignoring the fold-tail predicate that guards the loads, generating unmasked loads. This patch fixes this behavior.

  Differential Revision: https://reviews.llvm.org/D66106
  Reviewers: Ayal, hsaito, fhahn
  llvm-svn: 368973
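To illustrate why the fold-tail predicate has to survive, here is a minimal scalar sketch in C++ of a tail-folded loop. It is not the vectorizer's code: the vectorization factor `VF` and the helpers `maskedLoad` and `sumTailFolded` are hypothetical names chosen for this example. Each "vector iteration" computes a lane mask `i + lane < n`, and the load only touches lanes whose mask bit is set, so nothing past the trip count is read.

```cpp
#include <array>
#include <cstddef>

// Hypothetical vectorization factor for this sketch.
constexpr std::size_t VF = 4;

// Emulates a masked vector load: inactive lanes are never dereferenced
// and are filled with zero instead.
std::array<int, VF> maskedLoad(const int *base,
                               const std::array<bool, VF> &mask) {
  std::array<int, VF> lanes{};
  for (std::size_t lane = 0; lane < VF; ++lane)
    if (mask[lane])
      lanes[lane] = base[lane];
  return lanes;
}

// Sums n elements with the remainder iterations folded into the main loop.
int sumTailFolded(const int *data, std::size_t n) {
  int sum = 0;
  for (std::size_t i = 0; i < n; i += VF) {
    std::array<bool, VF> mask{};
    for (std::size_t lane = 0; lane < VF; ++lane)
      mask[lane] = i + lane < n; // the fold-tail predicate
    const std::array<int, VF> lanes = maskedLoad(data + i, mask);
    for (int v : lanes)
      sum += v;
  }
  return sum;
}
```

Dropping the `if (mask[lane])` check in `maskedLoad` corresponds to the unmasked load described in the commit message: for `n = 6` the second iteration would read two elements past the end of the array.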
* [coroutine] Fixes "cannot move instruction since its users are not dominated by CoroBegin" problem. | Gor Nishanov | 2019-08-15 | 2 | -148/+105
  Summary:
  Fixes https://bugs.llvm.org/show_bug.cgi?id=36578 and https://bugs.llvm.org/show_bug.cgi?id=36296.
  Supersedes: https://reviews.llvm.org/D55966

  One of the fundamental transformations that the CoroSplit pass performs before splitting the coroutine is to find which values need to survive between suspend and resume, and to provide a slot for them in the coroutine frame so that the value can be spilled and restored as needed. The coroutine frame becomes available once its storage has been allocated; that point is marked in the pre-split coroutine with an llvm.coro.begin intrinsic. The FE normally puts all of the user-authored code that accesses those values after llvm.coro.begin. However, instructions accessing those values sometimes end up prior to coro.begin: for example, a write of a parameter's value into the alloca done by the FE, or instructions added by optimization passes such as SROA when it rewrites allocas.

  Prior to this change, the CoroSplit pass would try to move instructions that may end up accessing the values in the coroutine frame after CoroBegin. However, it would run into problems (report_fatal_error) if some of the values were used both in the allocation function (for example, when an allocator is passed as a parameter to a coroutine) and in the user-authored body of the coroutine.

  To handle this case and to simplify the instruction moving logic, this change removes all of the instruction moving. Instead, we only change the uses of the spilled values that are dominated by coro.begin and leave other instructions intact.

  Before:
  ```
  %var = alloca i32
  %1 = getelementptr .. %var ; will move this one after coro.begin
  %f = call i8* @llvm.coro.begin(
  ```

  After:
  ```
  %var = alloca i32
  %1 = getelementptr .. %var ; stays put
  %f = call i8* @llvm.coro.begin(
  ```

  If we discover that there is a potential write into an alloca prior to coro.begin, we copy its value from the alloca into the spill slot in the coroutine frame.

  Before:
  ```
  %var = alloca i32
  store .. %var ; will move this one after coro.begin
  %f = call i8* @llvm.coro.begin(
  ```

  After:
  ```
  %var = alloca i32
  store .. %var ; stays put
  %f = call i8* @llvm.coro.begin(
  %tmp = load %var
  store %tmp, %spill.slot.for.var
  ```

  Note: This change does not handle array allocas, as that is something the C++ FE does not produce, but support can be added in the future if the need arises.

  Reviewers: llvm-commits, modocache, ben-clayton, tks2103, rjmccall
  Reviewed By: modocache
  Subscribers: bartdesmet
  Differential Revision: https://reviews.llvm.org/D66230
  llvm-svn: 368949
* [Attributor] Try to fix "missing field 'RetInsts' initializer" warning | Johannes Doerfert | 2019-08-14 | 1 | -1/+1
  http://lab.llvm.org:8011/builders/lld-x86_64-darwin13/builds/35674/steps/build_Lld/logs/stdio
  llvm-svn: 368938
* [Attributor][NFC] Make debug output consistent | Johannes Doerfert | 2019-08-14 | 1 | -4/+4
  llvm-svn: 368931
* [SCEV] Rename getMaxBackedgeTakenCount to getConstantMaxBackedgeTakenCount [NFC] | Philip Reames | 2019-08-14 | 3 | -5/+5
  llvm-svn: 368930
* [Attributor][NFC] Try to eliminate warnings (debug build + fall through) | Johannes Doerfert | 2019-08-14 | 1 | -1/+3
  llvm-svn: 368928
* [Attributor][NFC] Introduce statistics macros for new positions | Johannes Doerfert | 2019-08-14 | 1 | -54/+42
  llvm-svn: 368927
* [Attributor][NFC] Add merge/join/clamp operators to the IntegerState | Johannes Doerfert | 2019-08-14 | 1 | -0/+24
  Differential Revision: https://reviews.llvm.org/D66146
  llvm-svn: 368925
* [Attributor] Use the AANoNull attribute directly in AADereferenceable | Johannes Doerfert | 2019-08-14 | 1 | -78/+34
  Summary:
  Instead of constantly keeping track of the nonnull status with the dereferenceable information, we can simply query the nonnull attribute whenever we need the information (debug + manifest).

  Reviewers: sstefan1, uenoku
  Subscribers: hiraditya, bollu, jfb, llvm-commits
  Tags: #llvm
  Differential Revision: https://reviews.llvm.org/D66113
  llvm-svn: 368924
* [Attributor] Use liveness during the creation of AAReturnedValues | Johannes Doerfert | 2019-08-14 | 1 | -227/+182
  Summary:
  As one of the first attributes, and one of the more complex ones, AAReturnedValues did not use liveness; instead, we filtered the result after the fact. This change adds liveness usage during the creation. The algorithm is also improved and shorter.

  The new algorithm collects returned values over time using the generic facilities that already work with liveness, e.g., genericValueTraversal, which does not look at dead PHI node predecessors. A test that shows how this leads to better results is included.

  Note: Unresolved calls and resolved calls are now tracked explicitly.

  Reviewers: uenoku, sstefan1
  Subscribers: hiraditya, bollu, llvm-commits
  Tags: #llvm
  Differential Revision: https://reviews.llvm.org/D66120
  llvm-svn: 368922
* [Attributor] Do not update or manifest dead attributes | Johannes Doerfert | 2019-08-14 | 1 | -3/+23
  Summary:
  If the associated context instruction is assumed dead, we do not need to update or manifest the state.

  Reviewers: sstefan1, uenoku
  Subscribers: hiraditya, bollu, llvm-commits
  Tags: #llvm
  Differential Revision: https://reviews.llvm.org/D66116
  llvm-svn: 368921
* [Attributor] Use IRPosition consistently | Johannes Doerfert | 2019-08-14 | 1 | -234/+367
  Summary:
  The next attempt to clean up the Attributor interface before we grow it further.

  Before, we used a combination of two values (associated + anchor) and an argument number (or -1) to determine a location. This was very fragile. The new system uses IR positions exclusively, and we restrict the generation of IR positions to special constructor methods that verify the internal constraints we have. This will catch misuse early.

  The auto-conversion, e.g., in getAAFor, is now performed through the SubsumingPositionIterator. This iterator takes an IR position and allows visiting all IR positions that "subsume" the given one, e.g., function attributes "subsume" argument attributes of that function. For a detailed breakdown see the class comment of SubsumingPositionIterator.

  This patch also introduces IRPosition::getAttrs() to extract IR attributes at a certain position. The method knows how to look up attributes in different positions that are equivalent, e.g., the argument position for call site arguments. We also introduce three new position kinds, so that we cover all IR positions where attributes can be placed, plus one for "floating" values.

  Reviewers: sstefan1, uenoku
  Subscribers: hiraditya, bollu, llvm-commits
  Tags: #llvm
  Differential Revision: https://reviews.llvm.org/D65977
  llvm-svn: 368919
* [SLP][NFC] Use pointers to address to ScalarToTreeEntry elements, instead of indexes. | Dinar Temirbulatov | 2019-08-14 | 1 | -4/+4
  llvm-svn: 368906
* [RLEV] Rewrite loop exit values for multiple exit loops w/o overall loop exit count | Philip Reames | 2019-08-14 | 1 | -4/+20
  We already supported rewriting loop exit values for multiple exit loops, but if any of the loop exits were not computable, we gave up on all loop exit values. This patch generalizes the existing code to handle individual computable loop exits where possible.

  As discussed in the review, this is a starting point for figuring out a better API. The code is a bit ugly, but getting it in lets us test as we go.

  Differential Revision: https://reviews.llvm.org/D65544
  llvm-svn: 368898
* InferAddressSpaces: Move target intrinsic handling to TTI | Matt Arsenault | 2019-08-14 | 1 | -23/+15
  I'm planning on handling intrinsics that will benefit from checking the address space enums. Don't bother moving the address collection for now, since those won't need the enums.
  llvm-svn: 368895
* InferAddressSpaces: Remove unnecessary check for ConstantInt | Matt Arsenault | 2019-08-14 | 1 | -2/+2
  The IR is invalid if this isn't a constant since immarg was added.
  llvm-svn: 368893
* [SLC] Dereferenceable annonation - handle valid null pointers | David Bolvansky | 2019-08-14 | 1 | -4/+11
  Reviewers: jdoerfert, reames
  Reviewed By: jdoerfert
  Subscribers: llvm-commits
  Tags: #llvm
  Differential Revision: https://reviews.llvm.org/D66161
  llvm-svn: 368884
* [BuildLibCalls] Noalias annotation | David Bolvansky | 2019-08-14 | 2 | -9/+16
  Summary:
  I think this is a better solution than annotating callsites in IC/SLC.

  Reviewers: jdoerfert
  Reviewed By: jdoerfert
  Subscribers: MaskRay, llvm-commits
  Tags: #llvm
  Differential Revision: https://reviews.llvm.org/D66217
  llvm-svn: 368875
* Ignore indirect branches from callbr. | Bill Wendling | 2019-08-14 | 1 | -2/+4
  Summary:
  We can't speculate around indirect branches: indirectbr and invoke. The callbr instruction needs to be included here.

  Reviewers: nickdesaulniers, manojgupta, chandlerc
  Reviewed By: chandlerc
  Subscribers: llvm-commits
  Tags: #llvm
  Differential Revision: https://reviews.llvm.org/D66200
  llvm-svn: 368873
* Fix "not all control paths return a value" MSVC warnings. NFCI. | Simon Pilgrim | 2019-08-14 | 1 | -1/+4
  llvm-svn: 368831
* Fix "not all control paths return a value" MSVC warning. NFCI. | Simon Pilgrim | 2019-08-14 | 1 | -0/+1
  llvm-svn: 368830
* Fix "not all control paths return a value" MSVC warnings. NFCI. | Simon Pilgrim | 2019-08-14 | 1 | -0/+2
  llvm-svn: 368829
* [InstCombine] Refactor getFlippedStrictnessPredicateAndConstant() out of canonicalizeCmpWithConstant(), NFCI | Roman Lebedev | 2019-08-14 | 2 | -32/+50
  I'd like to use it elsewhere, hopefully without reinventing the wheel. No functional change intended so far.
  llvm-svn: 368820
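The commit message does not describe what the extracted helper does. Going by its name, it presumably turns a strict comparison against a constant into the equivalent non-strict one (and vice versa) by adjusting the constant. The following C++ sketch shows that idea for signed 32-bit comparisons only. The `Pred` enum and `flipStrictness` are hypothetical names and do not mirror LLVM's API.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical stand-in for the signed integer comparison predicates.
enum class Pred { SLT, SLE, SGT, SGE };

// Rewrites "x pred c" into an equivalent comparison of the opposite
// strictness, updating pred and c in place. Returns false (leaving both
// untouched) when the adjusted constant would overflow.
bool flipStrictness(Pred &pred, std::int32_t &c) {
  const std::int32_t Min = std::numeric_limits<std::int32_t>::min();
  const std::int32_t Max = std::numeric_limits<std::int32_t>::max();
  switch (pred) {
  case Pred::SLT: // x <  c  <=>  x <= c - 1
    if (c == Min) return false;
    pred = Pred::SLE; c -= 1; return true;
  case Pred::SLE: // x <= c  <=>  x <  c + 1
    if (c == Max) return false;
    pred = Pred::SLT; c += 1; return true;
  case Pred::SGT: // x >  c  <=>  x >= c + 1
    if (c == Max) return false;
    pred = Pred::SGE; c += 1; return true;
  case Pred::SGE: // x >= c  <=>  x >  c - 1
    if (c == Min) return false;
    pred = Pred::SGT; c -= 1; return true;
  }
  return false;
}
```

The overflow checks matter: `x <= INT32_MAX` is always true, and there is no 32-bit constant `c` for which `x < c` is always true, so the sketch refuses to flip in that case.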
* [LV] Fold-tail flag | Dorit Nuzman | 2019-08-14 | 1 | -5/+13
  This is the compiler-flag equivalent of the Predicate pragma (https://reviews.llvm.org/D65197), to direct the vectorizer to fold the remainder-loop into the main-loop using predication.

  Differential Revision: https://reviews.llvm.org/D66108
  Reviewers: Ayal, hsaito, fhahn, SjoerdMeije
  llvm-svn: 368801
* Revert '[LICM] Make Loop ICM profile aware' and 'Fix pass dependency for LICM' | David L. Jones | 2019-08-14 | 1 | -75/+18
  This reverts r368526 (git commit 7e71aa24bc0788690fea7f0d7eab400c6a784deb)
  This reverts r368542 (git commit cb5a90fd314a7914cf293797bb4fd7a6841052cf)
  llvm-svn: 368800
* Coroutines: adjust for SVN r358739 | John McCall | 2019-08-14 | 1 | -4/+6
  CallSite has been removed in favour of CallBase. Adjust the coroutine split to account for that.
  llvm-svn: 368798
* Don't run a full verifier pass in coro-splitting's private pipeline. | John McCall | 2019-08-14 | 1 | -1/+6
  Potentially addresses rdar://49022293.
  llvm-svn: 368797
* Remove unreachable blocks before splitting a coroutine. | John McCall | 2019-08-14 | 1 | -1/+19
  The suspend-crossing algorithm is not correct in the presence of uses that cannot be reached on some successor path from their defs.
  llvm-svn: 368796
* Support swifterror in coroutine lowering. | John McCall | 2019-08-14 | 3 | -0/+238
  The support for swifterror allocas should work in all lowerings.

  The support for swifterror arguments only really works in a lowering with prototypes where you can ensure that the prototype also has a swifterror argument; I'm not really sure how it could possibly be made to work in the switch lowering.
  llvm-svn: 368795
* In coro.retcon lowering, don't explode if the optimizer messes around with the linkage of the prototype or the exact types of the yielded values. | John McCall | 2019-08-14 | 2 | -1/+28
  llvm-svn: 368793
* Fix a use-after-free in the coro.alloca treatment. | John McCall | 2019-08-14 | 1 | -4/+10
  llvm-svn: 368792
* Add intrinsics for doing frame-bound dynamic allocations within a coroutine. | John McCall | 2019-08-14 | 3 | -4/+241
  These rely on having an allocator provided to the coroutine and thus, for now, only work in retcon lowerings.
  llvm-svn: 368791
* Guard dumps in the coro intrinsic validation logic behind NDEBUG checks. dump() is not guaranteed to be defined in all builds. | John McCall | 2019-08-14 | 1 | -0/+14
  llvm-svn: 368790
* Generalize llvm.coro.suspend.retcon to allow an arbitrary number of arguments to be passed back to the continuation function. | John McCall | 2019-08-14 | 3 | -24/+93
  llvm-svn: 368789
* Extend coroutines to support a "returned continuation" lowering. | John McCall | 2019-08-14 | 7 | -271/+1432
  A quick contrast of this ABI with the currently-implemented ABI:

  - Allocation is implicitly managed by the lowering passes, which is fine for frontends that are fine with assuming that allocation cannot fail. This assumption is necessary to implement dynamic allocas anyway.
  - The lowering attempts to fit the coroutine frame into an opaque, statically-sized buffer before falling back on allocation; the same buffer must be provided to every resume point. A buffer must be at least pointer-sized.
  - The resume and destroy functions have been combined; the continuation function takes a parameter indicating whether it has succeeded.
  - Conversely, every suspend point begins its own continuation function.
  - The continuation function pointer is directly returned to the caller instead of being stored in the frame. The continuation can therefore directly destroy the frame when exiting the coroutine instead of having to leave it in a defunct state.
  - Other values can be returned directly to the caller instead of going through a promise allocation. The frontend provides a "prototype" function declaration from which the type, calling convention, and attributes of the continuation functions are taken.
  - On the caller side, the frontend can generate natural IR that directly uses the continuation functions as long as it prevents IPO with the coroutine until lowering has happened. In combination with the point above, the frontend is almost totally in charge of the ABI of the coroutine.
  - Unique-yield coroutines are given some special treatment.

  llvm-svn: 368788
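The shape of this ABI can be modelled in plain C++, as in the hand-written sketch below. It is only an analogy, not output of the lowering: `Step`, `Continuation`, `Frame`, `startCounter` and the other names are hypothetical, and the meaning given to the boolean flag (true requests an abnormal exit) is an assumption of this example. The sketch shows a caller-provided, pointer-sized buffer holding the frame, one continuation function per suspend point, and the next continuation being returned directly together with a yielded value.

```cpp
#include <new>

struct Step;
// Hypothetical "prototype": every continuation has this signature.
using Continuation = Step (*)(void *buffer, bool abnormal);

// What each call hands back: the next continuation (null once the
// coroutine has finished) and a value returned directly to the caller.
struct Step {
  Continuation next;
  int value;
};

// Coroutine state; small enough to live in the caller's buffer.
struct Frame {
  int base;
};
static_assert(sizeof(Frame) <= sizeof(void *), "frame must fit the buffer");

// Continuation for the second suspend point: the coroutine is done.
Step afterSecondYield(void *, bool) { return {nullptr, 0}; }

// Continuation for the first suspend point: yields base + 1, or tears
// down immediately when asked to exit abnormally.
Step afterFirstYield(void *buffer, bool abnormal) {
  if (abnormal)
    return {nullptr, 0};
  Frame *frame = static_cast<Frame *>(buffer);
  return {afterSecondYield, frame->base + 1};
}

// Ramp function: sets up the frame in the buffer and yields base.
Step startCounter(void *buffer, int base) {
  Frame *frame = new (buffer) Frame{base};
  return {afterFirstYield, frame->base};
}

// Caller side: keep invoking whatever continuation was returned last.
int drainCounter(int base) {
  alignas(void *) unsigned char buffer[sizeof(void *)];
  int total = 0;
  Step step = startCounter(buffer, base);
  while (step.next) {
    total += step.value;
    step = step.next(buffer, /*abnormal=*/false);
  }
  return total;
}
```

Because the continuation pointer travels in the return value rather than in the frame, the caller never has to inspect the buffer, which is what lets it stay opaque.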
* [SimplifyLibCalls] Add noalias from known callsites | David Bolvansky | 2019-08-13 | 1 | -0/+9
  Summary:
  Should be fine for memcpy, strcpy, strncpy.

  Reviewers: jdoerfert, efriedma
  Reviewed By: jdoerfert
  Subscribers: uenoku, llvm-commits
  Tags: #llvm
  Differential Revision: https://reviews.llvm.org/D66135
  llvm-svn: 368724
* [SLC] Improve dereferenceable bytes annotation | David Bolvansky | 2019-08-13 | 1 | -1/+5
  llvm-svn: 368715
* [InstCombine] Non-canonical clamp-like pattern handling | Roman Lebedev | 2019-08-13 | 1 | -0/+146
  Summary:
  Given a pattern like:
  ```
  %old_cmp1 = icmp slt i32 %x, C2
  %old_replacement = select i1 %old_cmp1, i32 %target_low, i32 %target_high
  %old_x_offseted = add i32 %x, C1
  %old_cmp0 = icmp ult i32 %old_x_offseted, C0
  %r = select i1 %old_cmp0, i32 %x, i32 %old_replacement
  ```
  it can be rewritten as a more canonical pattern:
  ```
  %new_cmp1 = icmp slt i32 %x, -C1
  %new_cmp2 = icmp sge i32 %x, C0-C1
  %new_clamped_low = select i1 %new_cmp1, i32 %target_low, i32 %x
  %r = select i1 %new_cmp2, i32 %target_high, i32 %new_clamped_low
  ```
  Iff `-C1 s<= C2 s<= C0-C1`
  Also, `ULT` predicate can also be `UGE`; or `UGT` iff `C0 != -1` (+invert result)
  Also, `SLT` predicate can also be `SGE`; or `SGT` iff `C2 != INT_MAX` (+invert result)

  If `C1 == 0`, then all 3 instructions must be one-use; else at most either `%old_cmp1` or `%old_x_offseted` can have extra uses.
  NOTE: if we could reuse `%old_cmp1` as one of the comparisons we'll have to build, this could be less limiting.

  So there are two icmp's, each one with 3 predicate variants, so there are 9 fold variants:

  |     | ULT | UGE | UGT |
  | SLT | https://rise4fun.com/Alive/yIJ | https://rise4fun.com/Alive/5BfN | https://rise4fun.com/Alive/INH |
  | SGE | https://rise4fun.com/Alive/hd8 | https://rise4fun.com/Alive/Abk | https://rise4fun.com/Alive/PlzS |
  | SGT | https://rise4fun.com/Alive/VYG | https://rise4fun.com/Alive/oMY | https://rise4fun.com/Alive/KrzC |

  This fold was brought up in https://reviews.llvm.org/D65148#1603922 by @dmgreen, and is needed to unblock that patch.
  This patch requires D65530.

  Reviewers: spatel, nikic, xbolva00, dmgreen
  Reviewed By: spatel
  Subscribers: hiraditya, llvm-commits, dmgreen
  Tags: #llvm
  Differential Revision: https://reviews.llvm.org/D65765
  llvm-svn: 368687
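The equivalence of the `SLT`/`ULT` variant can be checked empirically with a small C++ transcription of the two IR snippets. This is a sketch for experimentation, not a proof (the Alive links in the commit message serve that purpose); `clampOld`, `clampNew` and `patternsAgree` are hypothetical helper names, and the constants in the usage note are arbitrary choices.

```cpp
#include <cstdint>

// Transcription of the original pattern. The i32 add wraps, so it is done
// in unsigned arithmetic here to avoid signed overflow in C++.
std::int32_t clampOld(std::int32_t x, std::int32_t c0, std::int32_t c1,
                      std::int32_t c2, std::int32_t low, std::int32_t high) {
  const std::int32_t replacement = (x < c2) ? low : high;
  const std::uint32_t offset =
      static_cast<std::uint32_t>(x) + static_cast<std::uint32_t>(c1);
  return (offset < static_cast<std::uint32_t>(c0)) ? x : replacement;
}

// Transcription of the canonical pattern. Assumes that -c1 and c0 - c1 do
// not overflow for the chosen constants.
std::int32_t clampNew(std::int32_t x, std::int32_t c0, std::int32_t c1,
                      std::int32_t low, std::int32_t high) {
  const std::int32_t clampedLow = (x < -c1) ? low : x;
  return (x >= c0 - c1) ? high : clampedLow;
}

// Compares both forms for every x in [from, to].
bool patternsAgree(std::int32_t c0, std::int32_t c1, std::int32_t c2,
                   std::int32_t low, std::int32_t high,
                   std::int32_t from, std::int32_t to) {
  for (std::int32_t x = from; x <= to; ++x)
    if (clampOld(x, c0, c1, c2, low, high) != clampNew(x, c0, c1, low, high))
      return false;
  return true;
}
```

With `C0 = 144` and `C1 = 16`, the precondition requires `-16 s<= C2 s<= 128`. Calling `patternsAgree(144, 16, 0, -16, 127, -1000, 1000)` exercises a `C2` inside that range, while a `C2` of 200 violates the precondition and makes the two forms disagree for `x` in `[128, 200)`.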
* [InstCombine][NFC] Rename IsFreeToInvert() -> isFreeToInvert() for consistency | Roman Lebedev | 2019-08-13 | 4 | -18/+18
  As per https://reviews.llvm.org/D65530#inline-592325
  llvm-svn: 368686
* [InstCombine] foldXorOfICmps(): don't give up on non-single-use ICmp's if all users are freely invertible | Roman Lebedev | 2019-08-13 | 2 | -10/+63
  Summary:
  This is rather unconventional.

  As the comment there says, we don't have many folds for xor-of-icmps, so we try to turn them into an and-of-icmps, for which we have plenty of folds. But if the ICmp we need to invert is not single-use, we give up.

  As discussed in https://reviews.llvm.org/D65148#1603922, we may have a non-canonical CLAMP pattern, with bit math and a select-of-threshold that we'll potentially clamp. As can be seen in `canonicalize-clamp-with-select-of-constant-threshold-pattern.ll`, out of all 8 variations of the pattern, only two are **not** canonicalized into the variant with and+icmp instead of bit math. The reason is that the ICmp we need to invert is not single-use, so we give up.

  We indeed can't perform this fold at will; the general rule is that we should not increase instruction count in InstCombine. But we wouldn't end up increasing instruction count if we can adapt every other user to the inverted value. This way the `not` we create **will** get folded, and in the end the instruction count does not increase. For that, of course, we need to look at the users of a Value, which is again rather unconventional for InstCombine.

  Thus I'm proposing to be a little bit more insistent in `foldXorOfICmps()`. The alternatives would be to not create that `not`, but add duplicate code to manually invert all users; or to add some even less general combine to handle some more specific pattern[s].

  Reviewers: spatel, nikic, RKSimon, craig.topper
  Reviewed By: spatel
  Subscribers: hiraditya, jdoerfert, dmgreen, llvm-commits
  Tags: #llvm
  Differential Revision: https://reviews.llvm.org/D65530
  llvm-svn: 368685
* [SimplifyLibCalls] Add dereferenceable bytes from known callsites | David Bolvansky | 2019-08-13 | 1 | -13/+58
  Summary:

      int mm(char *a, char *b) {
        return memcmp(a,b,16);
      }

  Currently:

      define dso_local i32 @mm(i8* nocapture readonly %a, i8* nocapture readonly %b) local_unnamed_addr #1 {
      entry:
        %call = tail call i32 @memcmp(i8* %a, i8* %b, i64 16)
        ret i32 %call
      }

  After patch:

      define dso_local i32 @mm(i8* nocapture readonly %a, i8* nocapture readonly %b) local_unnamed_addr #1 {
      entry:
        %call = tail call i32 @memcmp(i8* dereferenceable(16) %a, i8* dereferenceable(16) %b, i64 16)
        ret i32 %call
      }

  Reviewers: jdoerfert, efriedma
  Reviewed By: jdoerfert
  Subscribers: javed.absar, spatel, llvm-commits
  Tags: #llvm
  Differential Revision: https://reviews.llvm.org/D66079
  llvm-svn: 368657
* [Attributor] Use the cached data layout directly | Johannes Doerfert | 2019-08-12 | 1 | -11/+8
  This removes the warning by using the new DL member. It also simplifies the code.
  llvm-svn: 368625
* [Attributor][NFC] Add IntegerState raw_ostream << operator | Johannes Doerfert | 2019-08-12 | 1 | -0/+5
  llvm-svn: 368622
* [Attributor] Make the InformationCache an Attributor member | Johannes Doerfert | 2019-08-12 | 1 | -96/+66
  The functionality is not changed but the interfaces are simplified and repetition is removed.
  llvm-svn: 368621
* [ThinLTO][AutoFDO] Fix memory corruption due to race condition from thin backends | Wenlei He | 2019-08-12 | 1 | -1/+74
  Summary:
  This commit fixes a race condition in multi-threaded ThinLTO backends that causes non-deterministic memory corruption of a data structure used only by AutoFDO with the compact binary profile format.

  GUIDToFuncNameMap, a static data member of type DenseMap in FunctionSamples, is used as a per-module mapping from function name MD5 to name string when the input AutoFDO profile is in compact binary format. However, with ThinLTO we can have parallel backends modifying and accessing the class static map concurrently. The fix is to make GUIDToFuncNameMap a member of SampleProfileLoader instead of static data.

  Reviewers: wmi, davidxl, danielcdh
  Subscribers: mehdi_amini, inglorion, hiraditya, dexonsmith, llvm-commits
  Tags: #llvm
  Differential Revision: https://reviews.llvm.org/D65848
  llvm-svn: 368596
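The structural change described here, moving a map from class-static storage to a per-instance member, can be sketched in a few lines of C++. The class below is a hypothetical stand-in, not `SampleProfileLoader` itself, and it uses `std::unordered_map` in place of LLVM's `DenseMap`. Only the member name `GUIDToFuncNameMap` is taken from the commit message.

```cpp
#include <cstdint>
#include <string>
#include <unordered_map>

// Hypothetical loader: each instance (one per module / backend thread)
// owns its own map, so no state is shared between backends.
class ProfileLoaderSketch {
  // Before the fix this would have been a static data member, i.e. one
  // map shared by every loader in the process.
  std::unordered_map<std::uint64_t, std::string> GUIDToFuncNameMap;

public:
  void recordName(std::uint64_t guid, const std::string &name) {
    GUIDToFuncNameMap[guid] = name;
  }

  // Returns the recorded name, or nullptr if this loader has none.
  const std::string *lookupName(std::uint64_t guid) const {
    auto it = GUIDToFuncNameMap.find(guid);
    return it == GUIDToFuncNameMap.end() ? nullptr : &it->second;
  }
};
```

With a member map, two loaders running on different threads never touch the same container, so no locking is needed for this data.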
* [InstCombine] x /c fabs(x) -> copysign(1.0, x) | David Bolvansky | 2019-08-12 | 1 | -0/+11
  Summary:
  x / fabs(x) -> copysign(1.0, x)
  fabs(x) / x -> copysign(1.0, x)

  Reviewers: spatel, foad, RKSimon, efriedma
  Reviewed By: spatel
  Subscribers: lebedev.ri, llvm-commits
  Tags: #llvm
  Differential Revision: https://reviews.llvm.org/D65898
  llvm-svn: 368570
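The identity behind this fold is easy to check in C++ with the standard `<cmath>` functions. The helper names `signByDivision` and `signByCopysign` are made up for this sketch. For finite, nonzero inputs the two forms give the same result. For zero, infinity and NaN they do not under IEEE arithmetic, since `0.0 / 0.0` is NaN while `copysign(1.0, 0.0)` is `1.0`. The fold is therefore presumably restricted to cases where such inputs are excluded. The summary does not state the exact condition.

```cpp
#include <cmath>

// The original expression: x / fabs(x).
double signByDivision(double x) { return x / std::fabs(x); }

// The replacement: copysign(1.0, x), i.e. 1.0 carrying the sign bit of x.
double signByCopysign(double x) { return std::copysign(1.0, x); }
```

`fabs(x) / x` behaves the same way, because the magnitudes cancel and only the sign of `x` is left.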
* [InstCombine] foldShiftIntoShiftInAnotherHandOfAndInICmp(): avoid constantexpr pitfail (PR42962) | Roman Lebedev | 2019-08-12 | 1 | -6/+5
  Instead of matching value and then blindly casting to BinaryOperator just to get the opcode, just match instruction and do no cast.

  Fixes https://bugs.llvm.org/show_bug.cgi?id=42962
  llvm-svn: 368554
* Fix pass dependency for LICM | Wenlei He | 2019-08-11 | 1 | -6/+7
  Expected to address buildbot failure http://lab.llvm.org:8011/builders/clang-x86_64-debian-fast/builds/16285 caused by D65060.
  llvm-svn: 368542
* [LICM] Make Loop ICM profile aware | Wenlei He | 2019-08-11 | 1 | -17/+73
  Summary:
  Hoisting/sinking an instruction out of a loop isn't always beneficial. Hoisting an instruction from a cold block inside a loop body out of the loop could hurt performance. This change makes Loop ICM profile aware: it now checks block frequency to make sure hoisting/sinking only moves an instruction to a colder block.

  Test Plan:
  ninja check

  Reviewers: asbirlea, sanjoy, reames, nikic, hfinkel, vsk
  Reviewed By: asbirlea
  Subscribers: fhahn, vsk, davidxl, xbolva00, hiraditya, llvm-commits
  Tags: #llvm
  Differential Revision: https://reviews.llvm.org/D65060
  llvm-svn: 368526
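The profitability rule described in this summary amounts to comparing two block frequencies. The sketch below is a hypothetical C++ model of that comparison, not the code from the patch: `BlockProfile` and `isProfitableToMove` are made-up names, and treating equal frequencies as acceptable is an assumption of this example.

```cpp
#include <cstdint>

// Hypothetical per-block profile data: how often the block executes.
struct BlockProfile {
  std::uint64_t frequency;
};

// Moving an instruction is considered worthwhile only if the destination
// block does not execute more often than the block it currently lives in.
bool isProfitableToMove(const BlockProfile &from, const BlockProfile &to) {
  return to.frequency <= from.frequency;
}
```

For example, with a preheader frequency of 10, hoisting out of a hot loop body (frequency 1000) passes the check, while hoisting out of a rarely taken branch inside the loop (frequency 1) does not, because the instruction would then execute more often than before.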
* [InstCombine][NFC] Use SimplifyAddInst() instead of SimplifyBinOp(Instruction::BinaryOps::Add, ) | Roman Lebedev | 2019-08-10 | 1 | -2/+2
  llvm-svn: 368521