path: root/llvm/lib/Transforms/InstCombine/InstCombineLoadStoreAlloca.cpp
* Preserve load alignment and dereferenceable metadata during some transformations (Artur Pilipenko, 2015-11-02, 1 file, -5/+16)
  Reviewed By: hfinkel
  Differential Revision: http://reviews.llvm.org/D13953
  llvm-svn: 251809
* InstCombine: Remove ilist iterator implicit conversions, NFC (Duncan P. N. Exon Smith, 2015-10-13, 1 file, -5/+5)
  Stop relying on implicit conversions of ilist iterators in LLVMInstCombine. No functionality change intended.
  llvm-svn: 250183
* invariant.group handling in GVN (Piotr Padlewski, 2015-10-02, 1 file, -7/+4)
  The most important part required to make clang devirtualization work ( ͡°͜ʖ ͡°). The code is able to find non-local dependencies, but unfortunately, because the caller can only handle local dependencies, I had to add some restrictions to look for dependencies only in the same BB.
  http://reviews.llvm.org/D12992
  llvm-svn: 249196
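  Illustrative sketch (not from the commit; names assumed, and the same-BB restriction described above applies): both accesses carry the same !invariant.group, so GVN may forward the stored value across the intervening call.
    define i8 @f(i8* %p) {
      store i8 42, i8* %p, !invariant.group !0
      call void @clobber(i8* %p)
      %v = load i8, i8* %p, !invariant.group !0 ; forwardable to 42
      ret i8 %v
    }
    declare void @clobber(i8*)
    !0 = !{}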
* Clean up: Refactoring the hardcoded value of 6 for FindAvailableLoadedValue()'s parameter MaxInstsToScan (Larisse Voufo, 2015-09-18, 1 file, -3/+4)
  (Complete version of r247497. See D12886)
  llvm-svn: 248022
* Revert "Clean up: Refactoring the hardcoded value of 6 for ↵Larisse Voufo2015-09-151-4/+3
| | | | | | FindAvailableLoadedValue()'s parameter MaxInstsToScan." for preliminary community discussion (See. D12886) llvm-svn: 247716
* Clean up: Refactoring the hardcoded value of 6 for FindAvailableLoadedValue()'s parameter MaxInstsToScan (Larisse Voufo, 2015-09-12, 1 file, -3/+4)
  llvm-svn: 247497
* Fix typos (Bruce Mitchener, 2015-09-12, 1 file, -3/+3)
  Summary: This fixes a variety of typos in docs, code and headers.
  Subscribers: jholewinski, sanjoy, arsenm, llvm-commits
  Differential Revision: http://reviews.llvm.org/D12626
  llvm-svn: 247495
* Rename Instruction::dropUnknownMetadata() to dropUnknownNonDebugMetadata() (Adrian Prantl, 2015-08-20, 1 file, -1/+0)
  ...and make it always preserve debug locations, since all callers wanted this behavior anyway. This addresses post-commit review feedback for r245589. NFC (inside the LLVM tree).
  llvm-svn: 245622
* Fix a bug that caused SimplifyCFG to drop DebugLocs (Adrian Prantl, 2015-08-20, 1 file, -0/+1)
  Instruction::dropUnknownMetadata(KnownSet) is supposed to preserve all metadata in KnownSet, but the condition for DebugLocs was inverted. Most users of dropUnknownMetadata() actually worked around this by not adding LLVMContext::MD_dbg to their list of KnownIDs. This is now made explicit.
  llvm-svn: 245589
* [InstCombine] Actually combine AA metadata when replacing one load with another (Bjorn Steinbrink, 2015-07-10, 1 file, -2/+0)
  Fixes PR24083
  llvm-svn: 241955
* [InstCombine] Employ AliasAnalysis in FindAvailableLoadedValue (Bjorn Steinbrink, 2015-07-10, 1 file, -1/+1)
  llvm-svn: 241887
* [InstCombine] Properly combine metadata when replacing a load with another (Bjorn Steinbrink, 2015-07-10, 1 file, -1/+18)
  Not doing this can lead to misoptimizations down the line, e.g. because range metadata on the replacing load may exclude values that are valid for the load being replaced.
  llvm-svn: 241886
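  Illustrative sketch of the hazard (values and names assumed, not from the commit): if the second load is simply replaced by the first, the surviving load's !range must be widened to cover both ranges, not kept as-is.
    %a = load i32, i32* %p, !range !0 ; promises a value in [0, 10)
    %b = load i32, i32* %p, !range !1 ; promises a value in [0, 100)
    ; replacing %b with %a is only sound if the kept load carries a
    ; range covering every value %b could produce, e.g. the merged [0, 100)
    !0 = !{i32 0, i32 10}
    !1 = !{i32 0, i32 100}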
* [InstCombine] Fold IntToPtr and PtrToInt into preceding loads (David Majnemer, 2015-05-28, 1 file, -5/+10)
  Currently we only fold a BitCast into a Load when the BitCast is its only user. Do the same for any no-op cast.
  Differential Revision: http://reviews.llvm.org/D9152
  llvm-svn: 238452
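  Illustrative sketch, assuming a target where i64 has the same width as pointers so the cast is a no-op:
    %i = load i64, i64* %p
    %q = inttoptr i64 %i to i8* ; sole user of %i
    ; folds into the load itself:
    %pc = bitcast i64* %p to i8**
    %q = load i8*, i8** %pc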
* Convert PHI getIncomingValue() to foreach over incoming_values(). NFC (Pete Cooper, 2015-05-12, 1 file, -2/+2)
  We already had a method to iterate over all the incoming values of a PHI. This just changes all eligible code to use it. Ineligible code included anything which cared about the index, or was also trying to get the i'th incoming BB.
  llvm-svn: 237169
* [InstCombine] Canonicalize single element array store (David Majnemer, 2015-05-11, 1 file, -0/+9)
  Use the element type instead of the aggregate type.
  Differential Revision: http://reviews.llvm.org/D9591
  llvm-svn: 236969
* [InstCombine] Canonicalize single element array load (David Majnemer, 2015-05-11, 1 file, -0/+10)
  Use the element type instead of the aggregate type.
  Differential Revision: http://reviews.llvm.org/D9596
  llvm-svn: 236968
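  Illustrative before/after sketch for the load case (the store case above is symmetric; names assumed):
    %v = load [1 x i32], [1 x i32]* %p
    ; becomes a load of the element type, rebuilt into the aggregate:
    %ep = getelementptr [1 x i32], [1 x i32]* %p, i64 0, i64 0
    %e = load i32, i32* %ep
    %v = insertvalue [1 x i32] undef, i32 %e, 0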
* Update InstCombine to transform aggregate loads into scalar loads (Mehdi Amini, 2015-05-07, 1 file, -3/+32)
  Summary: One step further toward getting aggregate loads and stores optimized properly. At this point this only handles structs with one element.
  Test Plan: Added unit tests for the new supported cases.
  Reviewers: chandlerc, joker-eph, joker.eph, majnemer
  Reviewed By: majnemer
  Subscribers: pete, llvm-commits
  Differential Revision: http://reviews.llvm.org/D8339
  Patch by Amaury Sechet.
  From: Amaury Sechet <amaury@fb.com>
  llvm-svn: 236695
* [CallSite] Make construction from Value* (or Instruction*) explicit (Benjamin Kramer, 2015-04-10, 1 file, -1/+1)
  CallSite roughly behaves as a common base of CallInst and InvokeInst. Bring the behavior closer to that model by making upcasts explicit. Downcasts remain implicit and work as before.
  Following dyn_cast as a mental model, checking whether a Value *V is a CallSite now looks like this:
    if (auto CS = CallSite(V)) // think dyn_cast
  instead of:
    if (CallSite CS = V)
  This is an extra token, but I think it is slightly clearer. Making the ctor explicit has the advantage of not accidentally creating nullptr CallSites, e.g. when you pass a Value * to a function taking a CallSite argument.
  llvm-svn: 234601
* [opaque pointer type] Change GetElementPtrInst::getIndexedType to take the pointee type (David Blaikie, 2015-03-30, 1 file, -2/+4)
  This pushes the use of PointerType::getElementType up into several callers. I'll essentially just have to keep pushing that up the stack until I can eliminate every call to it...
  llvm-svn: 233604
* Update InstCombine to transform aggregate stores into scalar stores (Mehdi Amini, 2015-03-14, 1 file, -0/+28)
  Summary: This is a first step toward getting proper support for aggregate loads and stores.
  Test Plan: Added unit tests.
  Reviewers: reames, chandlerc
  Reviewed By: chandlerc
  Subscribers: majnemer, joker.eph, chandlerc, llvm-commits
  Differential Revision: http://reviews.llvm.org/D7780
  Patch by Amaury Sechet.
  From: Mehdi Amini <mehdi.amini@apple.com>
  llvm-svn: 232284
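  Illustrative sketch of the single-element-struct case this first step covers (names assumed):
    store { i32 } %s, { i32 }* %p
    ; becomes a scalar store of the lone member:
    %e = extractvalue { i32 } %s, 0
    %ep = getelementptr { i32 }, { i32 }* %p, i64 0, i32 0
    store i32 %e, i32* %ep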
* instcombine: alloca: Canonicalize scalar allocation array size (Duncan P. N. Exon Smith, 2015-03-13, 1 file, -2/+10)
  As a follow-up to r232200, add an -instcombine to canonicalize scalar allocations to `i32 1`. Since r232200, `iX 1` (for X != 32) are only created by RAUWs, so this shouldn't fire too often. Nevertheless, it's a cheap check and a nice cleanup.
  llvm-svn: 232202
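  Illustrative sketch:
    %a = alloca i32, i64 1 ; scalar allocation, non-canonical size type
    ; canonicalized to:
    %a = alloca i32, i32 1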
* instcombine: alloca: Limit array size type promotion (Duncan P. N. Exon Smith, 2015-03-13, 1 file, -9/+9)
  Move type promotion of the size of the array allocation to the end of `simplifyAllocaArraySize()`. This avoids promoting the type of the array size if it's a `ConstantInt`, since the next -instcombine iteration will drop it to a scalar allocation anyway. Similarly, this avoids promoting the type if it's an `UndefValue`, in which case the alloca gets RAUW'ed.
  This is NFC when considered over the lifetime of -instcombine, since it's just reducing the number of iterations needed to reach fixed point.
  llvm-svn: 232201
* AsmWriter: Write alloca array size explicitly (and -instcombine fixup) (Duncan P. N. Exon Smith, 2015-03-13, 1 file, -4/+4)
  Write the `alloca` array size explicitly when it's non-canonical. Previously, if the array size was `iX 1` (where X is not 32), the type would mutate to `i32` when round-tripping through assembly.
  The testcase I added fails in `verify-uselistorder` (as well as `FileCheck`), since the use-lists for `i32 1` and `i64 1` change. (Manman Ren came across this when running `verify-uselistorder` on some non-trivial, optimized code as part of PR5680.)
  The type mutation started with r104911, which allowed array sizes to be something other than an `i32`. Starting with r204945, we "canonicalized" to `i64` on 64-bit platforms -- and then on every round-trip through assembly, mutated back to `i32`.
  I bundled a fixup for `-instcombine` to avoid r204945 on scalar allocations. (There wasn't a clean way to sequence this into two commits, since the assembly change on its own caused testcase churn, and the `-instcombine` change can't be tested without the assembly changes.)
  An obvious alternative fix -- change `AllocaInst::AllocaInst()`, `AsmWriter` and `LLParser` to treat `intptr_t` as the canonical type for scalar allocations -- was rejected out of hand, since this required teaching them each about the data layout.
  A follow-up commit will add an `-instcombine` to canonicalize the scalar allocation array size to `i32 1` rather than leaving `iX 1` alone.
  rdar://problem/20075773
  llvm-svn: 232200
* instcombine: alloca: Remove nesting in simplifyAllocaArraySize(), NFC (Duncan P. N. Exon Smith, 2015-03-13, 1 file, -27/+30)
  llvm-svn: 232199
* instcombine: alloca: Split out simplifyAllocaArraySize(), NFC (Duncan P. N. Exon Smith, 2015-03-13, 1 file, -8/+15)
  Follow-up commits will change some of the logic here. Splitting into a separate function simplifies the logic by allowing early returns instead of deeper nesting.
  llvm-svn: 232197
* DataLayout is mandatory, update the API to reflect it with references (Mehdi Amini, 2015-03-10, 1 file, -54/+45)
  Summary: Now that the DataLayout is a mandatory part of the module, let's start cleaning the codebase. This patch is a first attempt at doing that.
  This patch is not exactly NFC: for instance, some places were passing a nullptr instead of the DataLayout, possibly just because there was a default value on the DataLayout argument to many functions in the API. Even though it is not purely NFC, there is no change in the validation.
  I turned as many pointers to DataLayout into references as possible; this helped in figuring out all the places where a nullptr could come up.
  I initially had a local version of this patch broken into over 30 independent commits, but some later commits were cleaning up the API and touching parts of the code modified in the previous commits, so it seemed cleaner without the intermediate state.
  Test Plan:
  Reviewers: echristo
  Subscribers: llvm-commits
  From: Mehdi Amini <mehdi.amini@apple.com>
  llvm-svn: 231740
* [IC] Turn non-null MD on pointer loads to range MD on integer loads (Charles Davis, 2015-02-25, 1 file, -4/+18)
  Summary: This change fixes the FIXME that you recently added when you committed (a modified version of) my patch. When `InstCombine` combines a load and store of a pointer into those of an equivalently-sized integer, it currently drops any `!nonnull` metadata that might be present. This change replaces `!nonnull` metadata with `!range !{ 1, -1 }` metadata instead.
  Reviewers: chandlerc
  Subscribers: llvm-commits
  Differential Revision: http://reviews.llvm.org/D7621
  llvm-svn: 230462
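  Illustrative sketch, spelling out the commit's `!range !{ 1, -1 }` with explicit constants under the assumption of 64-bit pointers:
    %v = load i8*, i8** %pp, !nonnull !0
    ; when rewritten as an integer load, !nonnull has no direct
    ; equivalent, so "not null" is encoded as a range excluding zero:
    %ppi = bitcast i8** %pp to i64*
    %vi = load i64, i64* %ppi, !range !1
    !0 = !{}
    !1 = !{i64 1, i64 -1}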
* [InstCombine] Remove unnecessary variable indexing into single-element arrays (Hal Finkel, 2015-02-20, 1 file, -0/+187)
  This change addresses a deficiency pointed out in PR22629. To copy from the bug report:
  [from the bug report]
  Consider this code:
    int f(int x) {
      int a[] = {12};
      return a[x];
    }
  GCC knows to optimize this to
    movl $12, %eax
    ret
  The code generated by recent Clang at -O3 is:
    movslq %edi, %rax
    movl .L_ZZ1fiE1a(,%rax,4), %eax
    retq
    .L_ZZ1fiE1a:
    .long 12 # 0xc
  [end from the bug report]
  This definitely seems worth fixing. I've also seen this kind of code before (as the base case of generic vector wrapper templates with one element).
  The general idea is to look at the GEP feeding a load or a store, which has some variable as its first non-zero index, and determine if that index must be zero (or else an out-of-bounds access would occur). We can do this for allocas and globals with constant initializers where we know the maximum size of the underlying object. When we find such a GEP, we create a new one for the memory access with that first variable index replaced with a constant zero.
  Even if we can't eliminate the memory access (and sometimes we can't), it is still useful because it removes unnecessary indexing calculations.
  llvm-svn: 229959
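  Illustrative IR sketch of the GEP rewrite (names assumed):
    @a = internal constant [1 x i32] [i32 12]
    define i32 @f(i64 %x) {
      %gep = getelementptr inbounds [1 x i32], [1 x i32]* @a, i64 0, i64 %x
      ; %x must be 0 for the access to stay in bounds of the one-element
      ; array, so the variable index is replaced by a constant zero and
      ; the load then folds to the initializer value 12
      %v = load i32, i32* %gep
      ret i32 %v
    }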
* [IC] Fix a bug with the instcombine canonicalizing of loads and propagating of metadata (Chandler Carruth, 2015-02-13, 1 file, -2/+9)
  We were propagating !nonnull metadata even when the newly formed load is no longer of a pointer type. This is clearly broken and results in LLVM failing the verifier and aborting. This patch just restricts the propagation of !nonnull metadata to when we actually have a pointer type.
  This bug report and the initial version of this patch were provided by Charles Davis! Many thanks for finding this!
  We still need to add logic to round-trip the metadata correctly if we combine from pointer types to integer types and then back, by using range metadata for the integer-typed loads. But this is the minimal and safe version of the patch, which is important so we can backport it into 3.6.
  llvm-svn: 229029
* [PM] Rename InstCombine.h to InstCombineInternal.h in preparation for creating a non-internal header file for the InstCombine pass (Chandler Carruth, 2015-01-22, 1 file, -1/+1)
  I thought about calling this InstCombiner.h, or in some way more clearly associating it with the InstCombiner class that it is primarily defining, but there are several other utility interfaces defined within this for InstCombine. If, in the course of refactoring, those end up moving elsewhere or going away, it might make more sense to make this the combiner's header alone.
  Naturally, this is a bikeshed to a certain degree, so feel free to lobby for a different shade of paint if this name just doesn't suit you.
  llvm-svn: 226783
* [canonicalize] Teach InstCombine to canonicalize loads which are only ever stored to always use a legal integer type if one is available (Chandler Carruth, 2015-01-22, 1 file, -0/+29)
  Regardless of whether this particular type is good or bad, it ensures we don't get weird differences in generated code (and resulting performance) from "equivalent" patterns that happen to end up using a slightly different type.
  After some discussion on llvmdev it seems everyone generally likes this canonicalization. However, there may be some parts of LLVM that handle it poorly and need to be fixed. I have at least verified that this doesn't impede GVN and instcombine's store-to-load forwarding powers in any obvious cases. Subtle cases are exactly what we need to flush out if they remain.
  Also note that this IR pattern should already be hitting LLVM from Clang at least, because it is exactly the IR which would be produced if you used memcpy to copy a pointer or floating point between memory instead of a variable.
  llvm-svn: 226781
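  Illustrative sketch, assuming the target's datalayout marks i32 as a legal integer type; %v is only ever stored back, never used as a float:
    %v = load float, float* %src
    store float %v, float* %dst
    ; becomes
    %srci = bitcast float* %src to i32*
    %dsti = bitcast float* %dst to i32*
    %vi = load i32, i32* %srci
    store i32 %vi, i32* %dsti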
* [canonicalize] Move a helper function further up the file so it can be used earlier. NFC (Chandler Carruth, 2015-01-22, 1 file, -47/+47)
  llvm-svn: 226777
* [canonicalization] Refactor how we create new stores into a helper function (Chandler Carruth, 2015-01-21, 1 file, -38/+48)
  This is a bit tidier anyway, and will make a subsequent patch simpler, as I want to add another case to this combine.
  llvm-svn: 226746
* [PM] Split the AssumptionTracker immutable pass into two separate APIs (Chandler Carruth, 2015-01-04, 1 file, -9/+6)
  ...a cache of assumptions for a single function, and an immutable pass that manages those caches.
  The motivation for this change is twofold. Immutable analyses are really hacks around the current pass manager design and don't exist in the new design. This is usually OK, but it requires that the core logic of an immutable pass be reasonably partitioned off from the pass logic. This change does precisely that. As a consequence it also paves the way for the *many* utility functions that deal in the assumptions to live in both pass manager worlds by creating a separate non-pass object with its own independent API that they all rely on. Now, the only bits of the system that deal with the actual pass mechanics are those that actually need to deal with the pass mechanics.
  Once this separation is made, several simplifications become pretty obvious in the assumption cache itself. Rather than using a set and callback value handles, it can just be a vector of weak value handles. The callers can easily skip the handles that are null, and eventually we can wrap all of this up behind a filter iterator.
  For now, this adds boilerplate to the various passes, but this kind of boilerplate will end up making it possible to port these passes to the new pass manager, and so it will end up factored away pretty reasonably.
  llvm-svn: 225131
* Loading from null is valid outside of addrspace 0 (Philip Reames, 2014-12-29, 1 file, -10/+10)
  This patch fixes a miscompile where we were assuming that loading from null is undefined and thus we could assume it doesn't happen. This transform is perfectly legal in address space 0, but is not necessarily legal in other address spaces.
  We really should introduce a hook to control this property on a per-target, per-address-space basis. We may be losing valuable optimizations in some address spaces by being too conservative.
  Original patch by Thomas P Raoux (submitted to llvm-commits); tests and formatting fixes by me.
  llvm-svn: 224961
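  Illustrative sketch (address space 1 stands in for any target-defined address space where 0 may be a valid address):
    define void @f() {
      %v0 = load i32, i32* null ; UB: addrspace(0) null
      %v1 = load i32, i32 addrspace(1)* null ; may be perfectly legal
      ret void
    }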
* Revert r223764 which taught instcombine about integer-based element extraction patterns (Chandler Carruth, 2014-12-09, 1 file, -349/+41)
  This is causing Clang to miscompile itself for 32-bit x86 somehow, and likely also on ARM and PPC. I really don't know how, but reverting now that I've confirmed this is actually the culprit. I have a reproduction as well and so should be able to restore this shortly.
  This reverts commit r223764. Original commit log follows:
  Teach instcombine to canonicalize "element extraction" from a load of an integer and "element insertion" into a store of an integer into actual element extraction, element insertion, and vector loads and stores.
  Previously various parts of LLVM (including instcombine itself) would introduce integer loads and stores into the code as a way of opaquely loading and storing "bits". In some cases (such as a memcpy of a std::complex<float> object) we will eventually end up using those bits in non-integer types. In order for SROA to effectively promote the allocas involved, it splits these "store a bag of bits" integer loads and stores up into the constituent parts. However, for non-alloca loads and stores which remain, it uses integer math to recombine the values into a large integer to load or store.
  All of this would be "fine", except that it forces LLVM to go through integer math to combine and split up values. While this makes perfect sense for integers (and in fact is critical for bitfields to end up lowering efficiently) it is *terrible* for non-integer types, especially floating point types. We have a much more canonical way of representing the act of concatenating the bits of two SSA values in LLVM: a vector and insertelement. This patch teaches InstCombine to use this representation.
  With this patch applied, LLVM will no longer introduce integer math into the critical path of every loop over std::complex<float> operations such as those that make up the hot path of ... oh, most HPC code, Eigen, and any other heavy linear algebra library.
  For the record, I looked *extensively* at fixing this in other parts of the compiler, but it just doesn't work:
  - We really do want to canonicalize memcpy and other bit-motion to integer loads and stores. SSA values are tremendously more powerful than "copy" intrinsics. Not doing this regresses massive amounts of LLVM's scalar optimizer.
  - We really do need to split up integer loads and stores of this form in SROA or every memcpy of a trivially copyable struct will prevent SSA formation of the members of that struct. It essentially turns off SROA.
  - The closest alternative is to actually split the loads and stores when partitioning with SROA, but this has all of the downsides historically discussed of splitting up loads and stores -- the wide-store information is fundamentally lost. We would also see performance regressions for bitfield-heavy code and other places where the integers aren't really intended to be split without seemingly arbitrary logic to treat integers totally differently.
  - We *can* effectively fix this in instcombine, so it isn't that hard of a choice to make IMO.
  llvm-svn: 223813
* Teach instcombine to canonicalize "element extraction" from a load of an integer and "element insertion" into a store of an integer into actual element extraction, element insertion, and vector loads and stores (Chandler Carruth, 2014-12-09, 1 file, -41/+349)
  Previously various parts of LLVM (including instcombine itself) would introduce integer loads and stores into the code as a way of opaquely loading and storing "bits". In some cases (such as a memcpy of a std::complex<float> object) we will eventually end up using those bits in non-integer types. In order for SROA to effectively promote the allocas involved, it splits these "store a bag of bits" integer loads and stores up into the constituent parts. However, for non-alloca loads and stores which remain, it uses integer math to recombine the values into a large integer to load or store.
  All of this would be "fine", except that it forces LLVM to go through integer math to combine and split up values. While this makes perfect sense for integers (and in fact is critical for bitfields to end up lowering efficiently) it is *terrible* for non-integer types, especially floating point types. We have a much more canonical way of representing the act of concatenating the bits of two SSA values in LLVM: a vector and insertelement. This patch teaches InstCombine to use this representation.
  With this patch applied, LLVM will no longer introduce integer math into the critical path of every loop over std::complex<float> operations such as those that make up the hot path of ... oh, most HPC code, Eigen, and any other heavy linear algebra library.
  For the record, I looked *extensively* at fixing this in other parts of the compiler, but it just doesn't work:
  - We really do want to canonicalize memcpy and other bit-motion to integer loads and stores. SSA values are tremendously more powerful than "copy" intrinsics. Not doing this regresses massive amounts of LLVM's scalar optimizer.
  - We really do need to split up integer loads and stores of this form in SROA or every memcpy of a trivially copyable struct will prevent SSA formation of the members of that struct. It essentially turns off SROA.
  - The closest alternative is to actually split the loads and stores when partitioning with SROA, but this has all of the downsides historically discussed of splitting up loads and stores -- the wide-store information is fundamentally lost. We would also see performance regressions for bitfield-heavy code and other places where the integers aren't really intended to be split without seemingly arbitrary logic to treat integers totally differently.
  - We *can* effectively fix this in instcombine, so it isn't that hard of a choice to make IMO.
  Differential Revision: http://reviews.llvm.org/D6548
  llvm-svn: 223764
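  Illustrative sketch of the canonicalization (a two-float "bag of bits" loaded as an i64; names and a little-endian layout assumed):
    %i = load i64, i64* %p
    %lo = trunc i64 %i to i32
    %f0 = bitcast i32 %lo to float ; "element extraction" via integer math
    ; becomes a real vector load plus extractelement:
    %pv = bitcast i64* %p to <2 x float>*
    %v = load <2 x float>, <2 x float>* %pv
    %f0 = extractelement <2 x float> %v, i32 0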
* [InstCombine] Change LLVM to canonicalize toward the value type being stored rather than the pointer type (Chandler Carruth, 2014-11-25, 1 file, -100/+72)
  This change is analogous to r220138, which changed the canonicalization for loads. The rationale is the same: memory does not have a type, operations (and thus the values they produce) have a type. We should match that type as closely as possible rather than reading some form of semantics into the pointer type.
  With this change, loads and stores should no longer be made with nonsensical types for the values that they load and store. This is particularly important when trying to match specific loaded and stored types in the process of doing other instcombines, which is what led me down this twisty maze of miscanonicalization.
  I've put quite some effort into looking through IR to find places where LLVM's optimizer was being unreasonably conservative in the face of mismatched load and store types; however, it is possible (let's say, likely!) I have missed some. If you see regressions here, or from r220138, the likely cause is some part of LLVM failing to cope with load and store types differing. Test cases appreciated; it is important that we root all of these out of LLVM.
  llvm-svn: 222748
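  Illustrative sketch of the new direction for stores (names assumed): the stored value keeps its own type and the pointer is cast, rather than the value being bitcast to match the pointer.
    %fi = bitcast float %f to i32
    store i32 %fi, i32* %p
    ; becomes
    %pf = bitcast i32* %p to float*
    store float %f, float* %pf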
* Revert r220349 to re-instate r220277 with a fix for PR21330 (Chandler Carruth, 2014-11-25, 1 file, -1/+2)
  Quite clearly, only exactly equal-width ptrtoint and inttoptr casts are no-op casts; it says so right there in the langref. Make the code agree.
  Original log from r220277:
  Teach the load analysis to allow finding available values which require inttoptr or ptrtoint casts, provided there is datalayout available. Eventually, the datalayout can just be required, but in practice it will always be there today.
  To go with the ability to expose available values requiring a ptrtoint or inttoptr cast, helpers are added to perform one of these three casts. These smarts are necessary to finish canonicalizing loads and stores to the operational type requirements without regressing fundamental combines.
  I've added some test cases. These should actually improve as the load combining and store combining improves, but they may fundamentally be highlighting some missing combines for select in addition to exercising the specific added logic to load analysis.
  llvm-svn: 222739
* Revert "IR: MDNode => Value"Duncan P. N. Exon Smith2014-11-111-2/+2
| | | | | | | | | | | | | | | | | Instead, we're going to separate metadata from the Value hierarchy. See PR21532. This reverts commit r221375. This reverts commit r221373. This reverts commit r221359. This reverts commit r221167. This reverts commit r221027. This reverts commit r221024. This reverts commit r221023. This reverts commit r220995. This reverts commit r220994. llvm-svn: 221711
* IR: MDNode => Value: Instruction::getAllMetadata() (Duncan P. N. Exon Smith, 2014-11-01, 1 file, -2/+2)
  Change `Instruction::getAllMetadata()` to modify a vector of `Value` instead of `MDNode` and update call sites. This is part of PR21433.
  llvm-svn: 221027
* Revert "Teach the load analysis to allow finding available values which ↵Hans Wennborg2014-10-211-2/+1
| | | | | | | | require" (r220277) This seems to have caused PR21330. llvm-svn: 220349
* Preserve 'nonnull' when changing type of the load (Philip Reames, 2014-10-21, 1 file, -0/+1)
  When changing the type of a load in Chandler's recent InstCombine changes, we can preserve the new 'nonnull' metadata.
  I considered adding an assert, since 'nonnull' is only valid on pointer types, but casting a pointer to a non-pointer would involve more than a bitcast anyway. If someone extends this transform to handle more than bitcasts, the verifier will report the malformed IR, so a separate assertion isn't needed. Also, the fpmath flags would have the same problem.
  llvm-svn: 220324
* Teach the load analysis to allow finding available values which require inttoptr or ptrtoint casts, provided there is datalayout available (Chandler Carruth, 2014-10-21, 1 file, -1/+2)
  Eventually, the datalayout can just be required, but in practice it will always be there today.
  To go with the ability to expose available values requiring a ptrtoint or inttoptr cast, helpers are added to perform one of these three casts. These smarts are necessary to finish canonicalizing loads and stores to the operational type requirements without regressing fundamental combines.
  I've added some test cases. These should actually improve as the load combining and store combining improves, but they may fundamentally be highlighting some missing combines for select in addition to exercising the specific added logic to load analysis.
  llvm-svn: 220277
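  Illustrative sketch of the forwarding this unlocks (assuming a datalayout with 64-bit pointers, so the cast is a no-op):
    store i8* %q, i8** %pp
    %ppi = bitcast i8** %pp to i64*
    %i = load i64, i64* %ppi
    ; the stored pointer is now "available" to the integer load as:
    ; %i = ptrtoint i8* %q to i64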
* Introduce enum values for previously defined metadata types. (NFC) (Philip Reames, 2014-10-21, 1 file, -6/+2)
  Our metadata scheme lazily assigns IDs to string metadata, but we have a mechanism to preassign them as well. Using a preassigned ID is helpful since we get compile-time type checking and avoid some (minimal) string construction and comparison. This change adds enum values for three existing metadata types:
    + MD_nontemporal = 9, // "nontemporal"
    + MD_mem_parallel_loop_access = 10, // "llvm.mem.parallel_loop_access"
    + MD_nonnull = 11 // "nonnull"
  I went through and updated various uses as well. I made no attempt to get all uses; I focused on the ones which were easily grepable and easy to translate. For example, there were several items in LoopInfo.cpp I chose not to update.
  llvm-svn: 220248
* Teach the load analysis driving core instcombine logic and other bits of logic to look through pointer casts (Chandler Carruth, 2014-10-20, 1 file, -1/+2)
  This makes the analysis trivially stronger in the face of loads and stores with intervening pointer casts.
  I've included a few test cases that demonstrate the kind of folding instcombine can do without pointer casts, and then variations which obfuscate the logic through bitcasts. Without this patch, the variations all fail to optimize fully.
  This is more important now than it has been in the past, as I've started moving the load canonicalization to more closely follow the value type requirements rather than the pointer type requirements, and thus this needs to be prepared for more pointer casts. When I made the same change to stores, several test cases regressed without logic along these lines, so I wanted to systematically improve matters first.
  llvm-svn: 220178
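  Illustrative sketch of the obfuscation the analysis now sees through (names assumed):
    store i32 42, i32* %p
    %pf = bitcast i32* %p to float*
    %pi = bitcast float* %pf to i32*
    %v = load i32, i32* %pi ; store-to-load forwarding now yields 42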
* Do a better and more complete job of preserving metadata when combining loads (Chandler Carruth, 2014-10-19, 1 file, -8/+58)
  This handles many more cases than just the AA metadata, some of them suggested by Hal in his review of the AA metadata handling patch. I've tried to test this behavior where tractable to do so.
  I'll point out that I have specifically *not* included a test for debuginfo because it was going to require 2 or 3 times as much work to craft some input which would survive the "helpful" stripping of debug info metadata that doesn't match the desired schema. This is another good example of why the current state of write-ability for our debug info metadata is unacceptable. I spent over 30 minutes trying to conjure some test case that would survive, even copying from other debug info tests, but it always failed to survive with no explanation of why or how I might fix it. =[
  llvm-svn: 220165
* Preserve AA metadata when combining (cast (load (...))) -> (load (cast (...))) (Chandler Carruth, 2014-10-18, 1 file, -0/+3)
  llvm-svn: 220141
* [InstCombine] Do an about-face on how LLVM canonicalizes (cast (load ...)) and (load (cast ...)): canonicalize toward the former (Chandler Carruth, 2014-10-18, 1 file, -72/+43)
  Historically, we've tried to load using the type of the *pointer*, and tried to match that type as closely as possible, removing as many pointer casts as we could and trading them for bitcasts of the loaded value. This is deeply and fundamentally wrong. Repeat after me: memory does not have a type! This was a hard lesson for me to learn working on SROA.
  There is only one thing that should actually drive the type used for a pointer, and that is the type which we need to use to load from that pointer. Matching up pointer types to the loaded value types is very useful because it minimizes the physical size of the IR required for no-op casts. Similarly, the only thing that should drive the type used for a loaded value is *how that value is used*! Again, this minimizes casts. And in fact, the *only* thing motivating types in any part of LLVM's IR are the types used by the operations in the IR. We should match them as closely as possible.
  I've ended up removing some tests here as they were testing bugs or behavior that is no longer present. Mostly though, this is just cleanup to let the tests continue to function as intended.
  The only fallout I've found so far from this change was SROA, and I have fixed it to not be impeded by the different type of load. If you find more places where this change causes optimizations not to fire, those too are likely bugs where we are assuming that the type of pointers is "significant" for optimization purposes.
  llvm-svn: 220138
* Make use of @llvm.assume in ValueTracking (computeKnownBits, etc.) (Hal Finkel, 2014-09-07, 1 file, -3/+5)
  This change, which allows @llvm.assume to be used from within computeKnownBits (and other associated functions in ValueTracking), adds some (optional) parameters to computeKnownBits and friends. These functions now (optionally) take a "context" instruction pointer, an AssumptionTracker pointer, and also a DomTree pointer, and most of the changes are just to pass this new information when it is easily available from InstSimplify, InstCombine, etc.
  As explained below, the significant conceptual change is that known properties of a value might depend on the control-flow location of the use (because we care that the @llvm.assume dominates the use, since assumptions have control-flow dependencies). This means that, when we ask if bits are known in a value, we might get different answers for different uses.
  The significant changes are all in ValueTracking. Two main changes: First, as with the rest of the code, new parameters need to be passed around. To make this easier, I grouped them into a structure, and I made internal static versions of the relevant functions that take this structure as a parameter. The new code does as you might expect: it looks for @llvm.assume calls that make use of the value we're trying to learn something about (often indirectly), attempts to pattern match that expression, and uses the result if successful. By making use of the AssumptionTracker, the process of finding @llvm.assume calls is not expensive.
  Part of the structure being passed around inside ValueTracking is a set of already-considered @llvm.assume calls. This is to prevent a query using, for example, the assume(a == b), from recursing on itself. The context and DT params are used to find applicable assumptions. An assumption needs to dominate the context instruction, or come after it deterministically. In this latter case we only handle the specific case where both the assumption and the context instruction are in the same block, and we need to exclude assumptions from being used to simplify their own ephemeral values (those which contribute only to the assumption) because otherwise the assumption would prove its feeding comparison trivial and would be removed.
  This commit adds the plumbing and the logic for a simple masked-bit propagation (just enough to write a regression test). Future commits add more patterns (and, correspondingly, more regression tests).
  llvm-svn: 217342
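  Illustrative sketch of the masked-bit propagation (function and value names assumed):
    declare void @llvm.assume(i1)
    define i32 @f(i32 %x) {
      %low = and i32 %x, 3
      %cmp = icmp eq i32 %low, 0
      call void @llvm.assume(i1 %cmp)
      ; computeKnownBits now knows the low two bits of %x are zero
      ; here, so the mask below is redundant and %r folds to %x
      %r = and i32 %x, -4
      ret i32 %r
    }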