summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Transforms
Commit message (Collapse)AuthorAgeFilesLines
...
* Revert BCmp Loop Idiom recognition transform (PR43870)Roman Lebedev2019-11-021-877/+8
| | | | | | | | | | | | | | | | | As discussed in https://bugs.llvm.org/show_bug.cgi?id=43870, this transform is missing a crucial legality check: the old (non-countable) loop would early-return upon first mismatch, but there is no such guarantee for bcmp/memcmp. We'd need to ensure that [PtrA, PtrA+NBytes) and [PtrB, PtrB+NBytes) are fully dereferenceable memory regions. But that would limit the transform to constant loop trip counts and would further cripple it because dereferenceability analysis is *very* partial. Furthermore, even if all that is done, every single test would need to be rewritten from scratch. So let's just give up.
* [Attributor] Ignore BlockAddress users in call site traversalJohannes Doerfert2019-11-021-0/+4
| | | | | | | BlockAddress users will not "call" the function so they do not qualify as call sites in the first place. When we delete a function with BlockAddress users we need to first remove the body so they are properly discarded.
* [Attributor][FIX] Do not try to cast if a cast is not requiredJohannes Doerfert2019-11-021-2/+6
| | | | | | | | | When we replace constant returns at the call site we did issue a cast in the hopes it would be a no-op if the types are equal. Turns out that is not the case and we have to check it ourselves first. Reused an IPConstantProp test for coverage. No functional change to the test wrt. IPConstantProp.
* [Attributor][FIX] Transform invoke of nounwind to call + br %normal_destJohannes Doerfert2019-11-021-5/+22
| | | | | | | | | Even if the invoked function may-return, we can replace it with a call and branch if it is nounwind. We had almost everything in place to do this but did not which actually caused a crash when we removed the landingpad from the actually dead unwind block. Exposed by the IPConstantProp tests.
* [Attributor][FIX] Make "known" and "assumed" liveness explicitJohannes Doerfert2019-11-021-21/+43
| | | | | | | | We did merge "known" and "assumed" liveness information into a single set which caused various kinds of problems, especially because we did not properly record when something was actually known. With this patch we properly track the "known" bit and distinguish dead ends we know from the ones we still need to explore in future updates.
* [Attributor] `willreturn` + `noreturn` = UBJohannes Doerfert2019-11-021-4/+1
| | | | | | | | | | | We gave up on `noreturn` if `willreturn` was known for a while but we now again try to always derive `noreturn`. This is useful because a function that is `noreturn` + `willreturn` is basically dead as executing it would lead to undefined behavior (UB). This came up in the IPConstantProp cases where a function only contained a unreachable but was not marked `noreturn` which caused missed opportunities down the line.
* [Attributor][FIX] Make AAValueSimplifyArgument aware of thread-dependent ↵Johannes Doerfert2019-11-021-3/+13
| | | | | | | constants As in IPConstantProp, thread-dependent constants need not be propagated over callbacks. Took the comment and test from there, see also D56447.
* [Attributor][FIX] Handle the default case of a switchJohannes Doerfert2019-11-021-5/+5
| | | | | | In D69605 only the "cases" of a switch were handled but if none matched we did not make the default case life. This is fixed now and properly tested (with code from IPConstantProp/user-with-multiple-uses.ll).
* [Attributor][FIX] Make value simplification aware of "complicated" attributesJohannes Doerfert2019-11-021-0/+18
| | | | | | We cannot simply replace arguments that carry attributes like `nest`, `inalloca`, `sret`, and `byval`. Except for the last one, which we can replace if it is not written, we bail for now.
* [Attributor][NFCI] Avoid unnecessary work except for testingJohannes Doerfert2019-11-021-1/+12
| | | | | | | | Trying to deduce information for declarations and calls sites of declarations is not useful in practice but only for testing. Add a flag that disables this by default but also enable it in the tests. The misc.ll test will verify the flag "works" as expected.
* [Attributor][FIX] NoCapture is not a subsuming propertyJohannes Doerfert2019-11-021-5/+12
| | | | | | We cannot look at the subsuming positions and take their nocapture bit as shown with the two tests for which we derived nocapture on the call site argument and readonly on the argument of the second before.
* [Attributor][NFCI] Remove obsolete codeJohannes Doerfert2019-11-021-24/+0
| | | | | The code in question does not add anything as the class is a subclass of AACallSiteReturnedFromReturnedAndMustBeExecutedContext already.
* Recommit "[ThinLTO] Handle GUID collision in import global processing""Teresa Johnson2019-11-011-5/+16
| | | | | | | | This recommits cc0b9647b76178bc3869bbfff80535ad86366472 which was reverted in d39d1a2f87aca3cfabe58ecfa5146879baa70096. I added a fix for an issue found when testing via distributed ThinLTO, and added a test case for that failure.
* Revert "[LLD][ThinLTO] Handle GUID collision in import global processing"Teresa Johnson2019-11-011-11/+5
| | | | | | | This reverts commit cc0b9647b76178bc3869bbfff80535ad86366472. The commit is causing a failure in internal testing. Will recommit with a fix later.
* [Attributor] Really use the executed-contextJohannes Doerfert2019-10-311-14/+29
| | | | | | | | | | | | | | | | | | Before we did not follow casts and geps when we looked at the users of a pointer in the pointers must-be-executed-context. This caused us to fail to determine if it was accessed for sure. With this change we follow such users now. The above extension exposed problems in getKnownNonNullAndDerefBytesForUse which did not always check what the base pointer was. We also did not handle negative offsets as conservative as we have to without explicit loop handling. Finally, we should not derive a huge number if we access a pointer that was traversed backwards first. The problems exposed by this functional change are already tested in the existing test cases as is the functional change. Differential Revision: https://reviews.llvm.org/D69647
* [SLP] Vectorize jumbled stores.Alexey Bataev2019-10-311-20/+96
| | | | | | | | | | | | | Summary: Patch adds support for vectorization of the jumbled stores. The value operands are vectorized and then shuffled in the right order before store. Reviewers: RKSimon, spatel, hfinkel, mkuper Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D43339
* [Attributor] Make AANonNull perform context sensitive queriesJohannes Doerfert2019-10-311-21/+6
| | | | | | | | | | | | | | | | | Summary: In order to get context sensitivity from isKnownNonZero we need to provide a context instruction *and* a dominator tree. The latter is passed now to which actually allows to remove some initialization code. Tests taken from PR43833. Reviewers: uenoku, sstefan1 Subscribers: hiraditya, bollu, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69595
* [IPCP] Bail on extractvalue's with more than 1 index.Craig Topper2019-10-311-1/+1
| | | | | | | | | | | | | | The replacement code only looks at the first index of the extractvalue. If there are additional indices we'll end up doing a bad replacement. This only happens if the function returns a nested struct. Not sure if clang ever generates such code. The original report came from ispc. Fixes PR43857 Differential Revision: https://reviews.llvm.org/D69656
* [InstCombine] simplify fcmp+select canonicalization; NFCISanjay Patel2019-10-311-20/+4
| | | | | We had 2 blocks of code that are nearly identical. Existing regression tests should cover both of the patterns.
* [InstCombine] Canonicalize uadd.with.overflow to uadd.satDavid Green2019-10-311-0/+32
| | | | | | | | | | | This adds some patterns to transform uadd.with.overflow to uadd.sat (with usub.with.overflow to usub.sat too). The patterns selects from UINTMAX (or 0 for subs) depending on whether the operation overflowed. Signed patterns are a little more involved (they can wrap in two directions), but can be added here in a followup patch too. Differential Revision: https://reviews.llvm.org/D69245
* [LICM] Invalidate SCEV upon instruction hoistingSerguei Katkov2019-10-311-14/+20
| | | | | | | | | | | Since SCEV can cache information about location of an instruction, it should be invalidated when the instruction is moved. There should be similar bug in code sinking part of LICM, it will be fixed in a follow-up change. Patch Author: Daniil Suchkov Reviewers: asbirlea, mkazantsev, reames Reviewed By: asbirlea Subscribers: hiraditya, javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D69370
* Revert "[SLP] Vectorize jumbled stores."Haojian Wu2019-10-311-91/+16
| | | | | | This reverts commit 21d498c9c0f32dcab5bc89ac593aa813b533b43a. This commit causes some crashes on some targets.
* [Attributor][NFCI] Improve the usage of IntegerStatesJohannes Doerfert2019-10-311-12/+8
| | | | | Setting the upper bound directly in the state can be beneficial and simplifies the logic. This also exposed more copy&paste type errors.
* [Attributor] Make liveness "edge-based"Johannes Doerfert2019-10-311-126/+186
| | | | | | | | | | | | | | | | Summary: If control is transferred to a successor is the key question when it comes to liveness. The new implementation puts that question in the focus and thereby providing a clean way to assume certain CFG edges are dead or instructions will not transfer control. Reviewers: sstefan1, uenoku Subscribers: hiraditya, bollu, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69605
* [Attributor] Liveness for valuesJohannes Doerfert2019-10-311-31/+321
| | | | | | | | | | | | | | | | | | | | | | | Summary: This patch introduces liveness (AAIsDead) for all positions, thus for all kinds of values. For now, we say an instruction is dead if it would be removed assuming all users are dead. A call site return is different as we just look at the users. If all call site returns have been eliminated, the return values can return undef instead of their original value, eliminating uses. We try to recursively delete dead instructions now and we introduce a simple check interface for use-traversal. This is the idea tried out in D68626 but implemented in the right way. Reviewers: uenoku, sstefan1 Subscribers: hiraditya, bollu, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68925
* [Attributor][NFC] Do not delete dead blocks but "clear" themJohannes Doerfert2019-10-311-3/+6
| | | | | | | Deleting blocks will require us to deal with dead edges, e.g., `br i1 false, label %live, label %dead` explicitly. For now we just clear the blocks and move on. This will be revisited once we actually fold branches.
* [Attributor] Add "free"-based heap2stack deductionJohannes Doerfert2019-10-301-18/+43
| | | | | | | | | | | | | | | Summary: If there is a unique free of the allocated that has to be reached from the malloc, we can apply the heap-2-stack transformation even if the pointer escapes. Reviewers: hfinkel, sstefan1, uenoku Subscribers: hiraditya, bollu, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68958
* [Attributor][NFC] Eagerly mark attributes as fixed.Johannes Doerfert2019-10-301-4/+14
| | | | | | If an attribute did not query any optimistic (=non-fixed) information to justify its state, we know the attribute state will not change anymore. Thus, we can indicate an optimistic fixpoint.
* [Attributor][NFC] Do not record dependences on fixed attributesJohannes Doerfert2019-10-301-0/+6
| | | | | Since fixed values cannot change, we do not need to wait for it to happen, we will never notify the dependent attribute anyway.
* [Attributor][NFC] Simplify the IRPosition interfaceJohannes Doerfert2019-10-301-4/+6
| | | | | | | We pretended IRPosition came either as mutable or immutable objects while they are basically always immutable, with a single (existing) unfortunate exceptions. This patch cleans up the uses to deal with the immutable version.
* [ThinLTO/WPD] Fix index-based WPD for available_externally vtablesTeresa Johnson2019-10-301-8/+26
| | | | | | | | | | | | | | | | | | | | | | Summary: Clang does not add type metadata to available_externally vtables. When choosing a summary to look at for virtual function definitions, make sure we skip summaries for any available externally vtables as they will not describe any virtual function functions, which are only summarized in the presence of type metadata on the vtable def. Simply look for the corresponding strong def's summary. Also add handling for same-named local vtables with the same GUID because of same-named files without enough distinguishing path. In that case we return a conservative result with no devirtualization. Reviewers: pcc, davidxl, evgeny777 Subscribers: mehdi_amini, inglorion, hiraditya, steven_wu, dexonsmith, arphaman, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69452
* Revert "[CodeView] Add option to disable inline line tables."Amy Huang2019-10-301-30/+11
| | | | | | because it breaks compiler-rt tests. This reverts commit 6d03890384517919a3ba7fe4c35535425f278f89.
* [CodeView] Add option to disable inline line tables.Amy Huang2019-10-301-11/+30
| | | | | | | | | | | | | | | | | Summary: This adds a clang option to disable inline line tables. When it is used, the inliner uses the call site as the location of the inlined function instead of marking it as an inline location with the function location. See https://bugs.llvm.org/show_bug.cgi?id=42344 Reviewers: rnk Subscribers: hiraditya, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D67723
* [InstCombine] keep assumption before sinking callstyker2019-10-311-2/+21
| | | | | | | | | | | | | | | | | | | | | | | | | | Summary: in the following C code the branch is not removed by clang in O3. ``` int f1(char* p) { int i1 = __builtin_strlen(p); if (!p) return -1; return i1; } ``` The issue is that the call to strlen is sunk to the following block by instcombine. In its new place the call to strlen doesn't dominate the use in the icmp anymore so value tracking can't see that p cannot be null. This patch resolves the issue by inserting an assumption at the place of the call before sinking a call when that call can be used to prove an argument to be nonnull. This resolves this issue at O3. Reviewers: majnemer, xbolva00, fhahn, jdoerfert, spatel, efriedma Reviewed By: jdoerfert Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69477
* [SLP] Vectorize jumbled stores.Alexey Bataev2019-10-301-16/+91
| | | | | | | | | | | | | Summary: Patch adds support for vectorization of the jumbled stores. The value operands are vectorized and then shuffled in the right order before store. Reviewers: RKSimon, spatel, hfinkel, mkuper Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D43339
* [SLPVectorizer] Use getAPInt() for comparison. NFCI.Simon Pilgrim2019-10-301-1/+1
| | | | Technically integers can assert on getZExtValue() if beyond i64 range, and a fuzzer usually find this.....
* [AddressSanitizer] Only instrument globals of default address spaceKarl-Johan Karlsson2019-10-301-0/+2
| | | | | | | | | | | | | The address sanitizer ignore memory accesses from different address spaces, however when instrumenting globals the check for different address spaces is missing. This result in assertion failure. The fault was found in an out of tree target. The patch skip all globals of non default address space. Reviewed By: leonardchan, vitalybuka Differential Revision: https://reviews.llvm.org/D68790
* [SLP] Fix -Wunused-variable. NFCFangrui Song2019-10-291-2/+1
|
* [SLP] Generalization of stores vectorization.Alexey Bataev2019-10-291-76/+72
| | | | | | | | | Stores are vectorized with maximum vectorization factor of 16. Patch tries to improve the situation and use maximal vectorization factor. Reviewers: spatel, RKSimon, mkuper, hfinkel Differential Revision: https://reviews.llvm.org/D43582
* [InstCombine] make icmp vector canonicalization safe for constant with undef ↵Sanjay Patel2019-10-291-0/+12
| | | | | | | | | | | | | | | | | elements This is a fix for: https://bugs.llvm.org/show_bug.cgi?id=43730 ...and as shown there, we have existing test cases that show potential miscompiles. We could just bail out for vector constants that contain any undef elements, or we can do as shown here: allow the transform, but replace the undefs with a safe value. For most of the tests shown, this results in a full splat constant (no undefs) which is probably a win for further IR analysis because we conservatively don't match undefs in most cases. Codegen can probably recover these kinds of undef lanes via demanded elements analysis if that's profitable. Differential Revision: https://reviews.llvm.org/D69519
* [IR] move helper function to replace undef constant (elements) with fixed ↵Sanjay Patel2019-10-291-20/+2
| | | | | | | | constants This is the NFC part of D69519. We had this functionality locally in instcombine, but it can be used elsewhere, so hoisting it to Constant class.
* [LCSSA] Forget values we create LCSSA phis forFlorian Hahn2019-10-291-2/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Currently we only forget the loop we added LCSSA phis for. But SCEV expressions in other loops could also depend on the instruction we added a PHI for and currently we do not invalidate those expressions. This can happen when we use ScalarEvolution before converting a function to LCSSA form. The SCEV expressions will refer to the non-LCSSA value. If this SCEV expression is then used with the expander, we do not preserve LCSSA form. This patch properly forgets the values we created PHIs for. Those need to be recomputed again. This patch fixes PR43458. Currently SCEV::verify does not catch this mismatch and any test would need to run multiple passes to trigger the error (e.g. -loop-reduce -loop-unroll). I will also look into catching this kind of mismatch in the verifier. Also, we currently forget the whole loop in LCSSA and I'll check if we can be more surgical. Reviewers: efriedma, sanjoy.google, reames Reviewed By: efriedma Subscribers: zzheng, hiraditya, javed.absar, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68194
* [Attributor] Make IntegerState more flexibleJohannes Doerfert2019-10-281-26/+17
| | | | | | | | | | | | | To make IntegerState more flexible but also less error prone we split it up into (1) incrementing, (2) decrementing, and (3) bit-tracking states. This adds functionality compared to before and disallows misuse, e.g., "incrementing" updates on a bit-tracking state. Part of the change is a single operator in the base class which simplifies helper functions that deal with states. There are certain functional changes but all of which should actually be corrections.
* [msan] Remove more attributes from sanitized functions.Evgenii Stepanov2019-10-281-2/+8
| | | | | | | | | | | | | | | | | | Summary: MSan instrumentation adds stores and loads even to pure readonly/writeonly functions. It is removing some of those attributes from instrumented functions and call targets, but apparently not enough. Remove writeonly, argmemonly and speculatable in addition to readonly / readnone. Reviewers: pcc, vitalybuka Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69541
* [PGO][PGSO] SizeOpts changes.Hiroshi Yamauchi2019-10-281-15/+53
| | | | | | | | | | | | | | | Summary: (Split of off D67120) SizeOpts/MachineSizeOpts changes for profile guided size optimization. (A second try after previously committed as r375254 and reverted as r375375.) Subscribers: mgorny, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69409
* Convert files added in d157a9bc8ba1 to unix line endings.Nico Weber2019-10-282-315/+315
| | | | | | Ran: git show --diff-filter=A --stat d157a9bc8ba1 | grep '|' | \ awk '{ print $1 }' | xargs dos2unix
* [LV] Interleaving should not exceed estimated loop trip count.Craig Topper2019-10-281-12/+12
| | | | | | | | | | Currently we may do iterleaving by more than estimated trip count coming from the profile or computed maximum trip count. The solution is to use "best known" trip count instead of exact one in interleaving analysis. Patch by Evgeniy Brevnov. Differential Revision: https://reviews.llvm.org/D67948
* [utils] InlineFunction: fix for debug info affecting optimizationsBjorn Pettersson2019-10-281-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Debug info affects output from "opt -inline", InlineFunction could not handle the llvm.dbg.value when it exist between alloca instructions. Problem was that the first alloca in a sequence of allocas was handled differently from the subsequence alloca instructions. Now all static alloca instructions are treated the same (being removed if the have no uses). So it does not matter if there are dbg instructions (or any other instructions) in between. Fix the issue: https://bugs.llvm.org/show_bug.cgi?id=43291k Patch by: yechunliang (Chris Ye) Reviewers: bjope, jmorse, vsk, probinson, jdoerfert, mtrofin, aprantl, fhahn Reviewed By: bjope Subscribers: uabelho, ormris, aprantl, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68633
* [InstCombine] Extra combine for uadd_satDavid Green2019-10-281-0/+7
| | | | | | | This is an extra fold for a canonical form of uadd_sat, as shown in D68651. It essentially selects uadd from an add and a select. Differential Revision: https://reviews.llvm.org/D69244
* Add Windows Control Flow Guard checks (/guard:cf).Andrew Paverd2019-10-285-1/+339
| | | | | | | | | | | | | | | | | | | Summary: A new function pass (Transforms/CFGuard/CFGuard.cpp) inserts CFGuard checks on indirect function calls, using either the check mechanism (X86, ARM, AArch64) or or the dispatch mechanism (X86-64). The check mechanism requires a new calling convention for the supported targets. The dispatch mechanism adds the target as an operand bundle, which is processed by SelectionDAG. Another pass (CodeGen/CFGuardLongjmp.cpp) identifies and emits valid longjmp targets, as required by /guard:cf. This feature is enabled using the `cfguard` CC1 option. Reviewers: thakis, rnk, theraven, pcc Subscribers: ychen, hans, metalcanine, dmajor, tomrittervg, alex, mehdi_amini, mgorny, javed.absar, kristof.beyls, hiraditya, steven_wu, dexonsmith, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D65761
OpenPOWER on IntegriCloud