path: root/llvm/lib/Transforms/Scalar/SROA.cpp
Commit message | Author | Age | Files | Lines
...
* Move Value.isDereferenceablePointer to ValueTracking [NFC] | Philip Reames | 2015-04-23 | 1 | -3/+3
  Move the isDereferenceablePointer function to Analysis. This function recursively tracks dereferenceability over a chain of values, like other functions in ValueTracking.

  This refactoring is motivated by further changes to support the dereferenceable_or_null attribute (http://reviews.llvm.org/D8650). isDereferenceablePointer will be extended to perform context-sensitive analysis, and IR is not a good place to have such functionality.

  Patch by: Artur Pilipenko <apilipenko@azulsystems.com>
  Differential Revision: reviews.llvm.org/D9075
  llvm-svn: 235611
* DebugInfo: Drop rest of DIDescriptor subclasses | Duncan P. N. Exon Smith | 2015-04-21 | 1 | -6/+6
  Delete the remaining subclasses of (the already deleted) `DIDescriptor`. Part of PR23080.

  llvm-svn: 235404
* DebugInfo: Require a DebugLoc in DIBuilder::insertDeclare() | Duncan P. N. Exon Smith | 2015-04-15 | 1 | -6/+5
  Change `DIBuilder::insertDeclare()` and `insertDbgValueIntrinsic()` to take an `MDLocation*`/`DebugLoc` parameter which it attaches to the created intrinsic. Assert at creation time that the `scope:` field's subprogram matches the variable's. There's a matching `clang` commit to use the API.

  The context for this is PR22778, which is removing the `inlinedAt:` field from `MDLocalVariable`, instead deferring to the `!dbg` location attached to the debug info intrinsic. The best way to ensure we always have a `!dbg` attachment is to require one at creation time. I'll be adding verifier checks next, but this API change is the best way to shake out frontend bugs.

  Note: I added an `llvm_unreachable()` in `bindings/go` and passed in `nullptr` for the `DebugLoc`. The `llgo` folks will eventually need to pass a valid `DebugLoc` here.

  llvm-svn: 235041
* DebugInfo: Gut DIExpression | Duncan P. N. Exon Smith | 2015-04-14 | 1 | -4/+4
  Completely gut `DIExpression`, turning it into a simple wrapper around `MDExpression *`. There are two bits of magic left:

  - It's constructed from `const MDExpression*` but convertible to `MDExpression*`.
  - It's default-constructed to `nullptr`.

  Otherwise, it should behave quite like a raw pointer. Once I've done the same to the rest of the `DIDescriptor` subclasses, I'll come back to delete them entirely (and update call sites as necessary to deal with the missing magic).

  llvm-svn: 234832
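The two "bits of magic" described in this commit can be sketched with a small stand-alone wrapper. This is only an illustration under stated assumptions: `Node` and `NodeRef` are hypothetical names, not the LLVM classes, and the real `DIExpression` may differ in detail.

```cpp
#include <cassert>

// Hypothetical stand-in for the wrapped metadata node (not an LLVM type).
struct Node {
  int Value = 0;
};

// Thin pointer wrapper: constructible from `const Node *`, convertible back
// to `Node *`, and default-constructed to nullptr.
class NodeRef {
  Node *N = nullptr;

public:
  NodeRef() = default;
  NodeRef(const Node *Ptr) : N(const_cast<Node *>(Ptr)) {}

  // Implicit conversion so the wrapper can be used where a raw pointer is expected.
  operator Node *() const { return N; }

  Node *operator->() const {
    assert(N && "dereferencing a null NodeRef");
    return N;
  }
};
```

Apart from those two conversions, the wrapper carries no state of its own, which is what makes it straightforward to replace with a raw pointer later.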
* [opaque pointer type] More GEP IRBuilder API migrations... | David Blaikie | 2015-04-03 | 1 | -4/+7
  llvm-svn: 234058
* DataLayout is mandatory, update the API to reflect it with references. | Mehdi Amini | 2015-03-10 | 1 | -37/+45
  Summary:
  Now that the DataLayout is a mandatory part of the module, let's start cleaning the codebase. This patch is a first attempt at doing that.

  This patch is not exactly NFC, as for instance some places were passing a nullptr instead of the DataLayout, possibly just because there was a default value on the DataLayout argument to many functions in the API. Even though it is not purely NFC, there is no change in the validation.

  I turned as many pointers to DataLayout into references as I could; this helped in figuring out all the places where a nullptr could come up.

  I initially had a local version of this patch broken into over 30 independent commits, but some later commits were cleaning the API and touching parts of the code modified in the previous commits, so it seemed cleaner without the intermediate state.

  Test Plan:

  Reviewers: echristo
  Subscribers: llvm-commits
  From: Mehdi Amini <mehdi.amini@apple.com>
  llvm-svn: 231740
* Make DataLayout Non-Optional in the Module | Mehdi Amini | 2015-03-04 | 1 | -6/+1
  Summary:
  DataLayout keeps the string used for its creation.

  As a side effect it is no longer needed in the Module. This is "almost" NFC: the string is no longer canonicalized, so you can't rely on two "equal" DataLayouts having the same string returned by getStringRepresentation().

  Get rid of DataLayoutPass: the DataLayout is in the Module

  The DataLayout is "per-module", let's enforce this by not duplicating it more than necessary. One more step toward non-optionality of the DataLayout in the module.

  Make DataLayout Non-Optional in the Module

  Module->getDataLayout() will never return nullptr anymore.

  Reviewers: echristo
  Subscribers: resistor, llvm-commits, jholewinski
  Differential Revision: http://reviews.llvm.org/D7992
  From: Mehdi Amini <mehdi.amini@apple.com>
  llvm-svn: 231270
* Replace std::copy with a back inserter with vector append where feasible | Benjamin Kramer | 2015-02-28 | 1 | -1/+1
  All of the cases were just appending from random access iterators to a vector. Using insert/append can grow the vector to the perfect size directly and moves the growing out of the loop. No intended functionality change.

  llvm-svn: 230845
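The difference between the two append styles can be shown with `std::vector`. This is a minimal sketch, not the code touched by the commit; the function names are made up for illustration.

```cpp
#include <algorithm>
#include <iterator>
#include <vector>

// Element-by-element append: each push_back may trigger a reallocation
// as the destination grows inside the copy loop.
std::vector<int> appendWithCopy(std::vector<int> Dest,
                                const std::vector<int> &Src) {
  std::copy(Src.begin(), Src.end(), std::back_inserter(Dest));
  return Dest;
}

// Range insert: with random access iterators the vector knows the number of
// new elements up front and can grow once, before any element is copied.
std::vector<int> appendWithInsert(std::vector<int> Dest,
                                  const std::vector<int> &Src) {
  Dest.insert(Dest.end(), Src.begin(), Src.end());
  return Dest;
}
```

Both functions produce the same sequence; only the growth pattern of the destination differs.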
* Debug info: When updating debug info during SROA, do not emit debug info | Adrian Prantl | 2015-02-09 | 1 | -8/+18
  for any padding introduced by SROA. In particular, do not emit debug info for an alloca that represents only the padding introduced by a previous iteration.

  Fixes PR22495.

  llvm-svn: 228632
* Debug info: Use DW_OP_bit_piece instead of DW_OP_piece in the | Adrian Prantl | 2015-02-09 | 1 | -5/+5
  intermediate representation. This

  - increases consistency by using the same granularity everywhere
  - allows for pieces < 1 byte
  - DW_OP_piece didn't actually allow storing an offset.

  Part of PR22495.

  llvm-svn: 228631
* Fix PR22393. When recursively replacing an aggregate with a smaller | Adrian Prantl | 2015-02-01 | 1 | -3/+12
  aggregate or scalar, the debug info needs to refer to the absolute offset (relative to the entire variable) instead of storing the offset inside the smaller aggregate.

  llvm-svn: 227702
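The offset rule in this fix can be illustrated with a small computation. The `Fragment` type and `toAbsolute` helper below are hypothetical and only sketch the arithmetic; they are not the data structures SROA uses.

```cpp
#include <cstdint>

// Hypothetical description of a piece of a source variable, measured in bits.
struct Fragment {
  uint64_t OffsetInBits;
  uint64_t SizeInBits;
};

// Parent is the part of the variable already covered by the smaller aggregate.
// RelativePiece is expressed relative to that aggregate. The result is the
// same piece expressed relative to the entire variable.
Fragment toAbsolute(const Fragment &Parent, const Fragment &RelativePiece) {
  return {Parent.OffsetInBits + RelativePiece.OffsetInBits,
          RelativePiece.SizeInBits};
}
```

For example, a 32-bit piece at bit 32 of an aggregate that itself starts at bit 64 of the variable sits at bit 96 of the variable.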
* Reapply: Teach SROA how to update debug info for fragmented variables. | Adrian Prantl | 2015-01-20 | 1 | -8/+50
  This reapplies r225379.

  ChangeLog:
  - The assertion that this commit previously ran into about the inability to handle indirect variables has since been removed and the backend can handle this now.
  - Testcases were upgraded to the new MDLocation format.
  - Instead of keeping a DebugDeclares map, we now use llvm::FindAllocaDbgDeclare().

  Original commit message follows.

  Debug info: Teach SROA how to update debug info for fragmented variables. This allows us to generate debug info for extremely advanced code such as

      typedef struct { long int a; int b;} S;

      int foo(S s) {
        return s.b;
      }

  which at -O1 on x86_64 is codegen'd into

      define i32 @foo(i64 %s.coerce0, i32 %s.coerce1) #0 {
        ret i32 %s.coerce1, !dbg !24
      }

  with this patch we emit the following debug info for this

      TAG_formal_parameter [3]
        AT_location( 0x00000000
          0x0000000000000000 - 0x0000000000000006: rdi, piece 0x00000008, rsi, piece 0x00000004
          0x0000000000000006 - 0x0000000000000008: rdi, piece 0x00000008, rax, piece 0x00000004 )
        AT_name( "s" )
        AT_decl_file( "/Volumes/Data/llvm/_build.ninja.release/test.c" )

  Thanks to chandlerc, dblaikie, and echristo for their feedback on all previous iterations of this patch!

  llvm-svn: 226598
* Revert "Reapply: Teach SROA how to update debug info for fragmented variables." | Adrian Prantl | 2015-01-08 | 1 | -60/+8
  This reverts commit r225379 while investigating an assertion failure reported by Alexey.

  llvm-svn: 225424
* Reapply: Teach SROA how to update debug info for fragmented variables. | Adrian Prantl | 2015-01-07 | 1 | -8/+60
  The two buildbot failures were addressed in LLVM r225378 and CFE r225359. This reapplies commit 225272 without modifications.

  llvm-svn: 225379
* Revert "Reapply: Teach SROA how to update debug info for fragmented variables." | Adrian Prantl | 2015-01-06 | 1 | -60/+8
  because of a tsan buildbot failure. This reverts commit 225272. Fix should be coming soon.

  llvm-svn: 225288
* Reapply: Teach SROA how to update debug info for fragmented variables. | Adrian Prantl | 2015-01-06 | 1 | -8/+60
  This also rolls in the changes discussed in http://reviews.llvm.org/D6766. Defers migrating the debug info for new allocas until after all partitions are created.

  Thanks to Chandler for reviewing!

  llvm-svn: 225272
* [SROA] Apply a somewhat heavy and unpleasant hammer to fix PR22093, an | Chandler Carruth | 2015-01-05 | 1 | -11/+47
  assert out of the new pre-splitting in SROA.

  This fix makes the code do what was originally intended -- when we have a store of a load both dealing in the same alloca, we force them to both be pre-split with identical offsets. This is really quite hard to do because we can keep discovering problems as we go along. We have to track every load over the current alloca which for any reason becomes invalid for pre-splitting, and go back to remove all stores of those loads. I've included a couple of test cases derived from PR22093 that cover the different ways this can happen. While that PR only really triggered the first of these two, it's the same fundamental issue.

  The other challenge here is documented in a FIXME now. We end up being quite a bit more aggressive for pre-splitting when loads and stores don't refer to the same alloca. This aggressiveness comes at the cost of introducing potentially redundant loads. It isn't clear that this is the right balance. It might be considerably better to require that we only do pre-splitting when we can presplit every load and store involved in the entire operation. That would give more consistent if conservative results. Unfortunately, it requires a non-trivial change to the actual pre-splitting operation in order to correctly handle cases where we end up pre-splitting stores out-of-order. And it isn't 100% clear that this is the right direction, although I'm starting to suspect that it is.

  llvm-svn: 225149
* [PM] Split the AssumptionTracker immutable pass into two separate APIs: | Chandler Carruth | 2015-01-04 | 1 | -6/+6
  a cache of assumptions for a single function, and an immutable pass that manages those caches.

  The motivation for this change is twofold. Immutable analyses are really hacks around the current pass manager design and don't exist in the new design. This is usually OK, but it requires that the core logic of an immutable pass be reasonably partitioned off from the pass logic. This change does precisely that. As a consequence it also paves the way for the *many* utility functions that deal in the assumptions to live in both pass manager worlds by creating a separate non-pass object with its own independent API that they all rely on. Now, the only bits of the system that deal with the actual pass mechanics are those that actually need to deal with the pass mechanics.

  Once this separation is made, several simplifications become pretty obvious in the assumption cache itself. Rather than using a set and callback value handles, it can just be a vector of weak value handles. The callers can easily skip the handles that are null, and eventually we can wrap all of this up behind a filter iterator.

  For now, this adds boilerplate to the various passes, but this kind of boilerplate will end up making it possible to port these passes to the new pass manager, and so it will end up factored away pretty reasonably.

  llvm-svn: 225131
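The "vector of weak handles, skip the null ones" pattern can be sketched by analogy with the standard library. This is an assumption-laden illustration: `std::weak_ptr` stands in for LLVM's weak value handles, which behave differently in detail, and `sumLive` is a made-up function.

```cpp
#include <memory>
#include <vector>

// Walk a vector of weak handles and use only those whose target still exists.
// std::weak_ptr is an analogy here, not the handle type the commit refers to.
int sumLive(const std::vector<std::weak_ptr<int>> &Handles) {
  int Sum = 0;
  for (const auto &H : Handles) {
    // lock() yields an empty shared_ptr once the target has been destroyed,
    // so dead entries are skipped rather than removed eagerly.
    if (std::shared_ptr<int> Live = H.lock())
      Sum += *Live;
  }
  return Sum;
}
```

The container never needs a callback when an element dies; readers pay a cheap null check instead, which is the trade-off the commit message describes.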
* [SROA] Teach SROA to be more aggressive in splitting now that we have | Chandler Carruth | 2015-01-02 | 1 | -27/+54
  a pre-splitting pass over loads and stores.

  Historically, splitting could cause enough problems that I hamstrung the entire process with a requirement that splittable integer loads and stores must cover the entire alloca. All smaller loads and stores were unsplittable to prevent chaos from ensuing. With the new pre-splitting logic that does load/store pair splitting I introduced in r225061, we can now very nicely handle arbitrarily splittable loads and stores. In order to fully benefit from these smarts, we need to mark all of the integer loads and stores as splittable.

  However, we don't actually want to rewrite partitions with all integer loads and stores marked as splittable. This will fail to extract scalar integers from aggregates, which is kind of the point of SROA. =] In order to resolve this, what we really want to do is only do pre-splitting on the alloca slices with integer loads and stores fully splittable. This allows us to uncover all non-integer uses of the alloca that would benefit from a split in an integer load or store (and where introducing the split is safe because it is just memory transfer from a load to a store). Once done, we make all the non-whole-alloca integer loads and stores unsplittable just as they have historically been, repartition and rewrite.

  The result is that when there are integer loads and stores anywhere within an alloca (such as from a memcpy of a sub-object of a larger object), we can split them up if there are non-integer components to the aggregate hiding beneath. I've added the challenging test cases to demonstrate how this is able to promote to scalars even a case where we have even *partially* overlapping loads and stores.

  This restores the single-store behavior for small arrays of i8s which is really nice. I've restored both the little endian testing and big endian testing for these exactly as they were prior to r225061. It also forced me to be more aggressive in an alignment test to actually defeat SROA. =] Without the added volatiles there, we actually split up the weird i16 loads and produce nice double allocas with better alignment.

  This also uncovered a number of bugs where we failed to handle splittable load and store slices which didn't have a beginning offset of zero. Those fixes are included, and without them the existing test cases explode in glorious fireworks. =]

  I've kept support for leaving whole-alloca integer loads and stores as splittable even for the purpose of rewriting, but I think that's likely no longer needed. With the new pre-splitting, we might be able to remove all the splitting support for loads and stores from the rewriter. Not doing that in this patch to try to isolate any performance regressions that causes in an easy to find and revert chunk.

  llvm-svn: 225074
* [SROA] Make the computation of adjusted pointers not leak GEP | Chandler Carruth | 2015-01-02 | 1 | -10/+14
  instructions.

  I noticed this when working on dialing up how aggressively we can pre-split loads and stores. My test case wasn't passing because dead GEPs into the allocas persisted when they were built by this routine. This isn't terribly harmful, we still rewrote and promoted the alloca and I can't conceive of how to cause this to happen in a case where we will keep the exact same alloca but rewrite and promote the uses of it. If that ever happened, we'd get an assert out of mem2reg.

  So I don't have a direct test case yet, but the subsequent commit's test case wouldn't pass without this. There are other problems fixed by this patch that I spotted purely by inspection such as the fact that getAdjustedPtr could have actually deleted dead base pointers. I don't know how to get a base pointer to go into getAdjustedPtr today, so I think this bug could never have manifested (and I certainly can't write a test case for it) but, it wasn't the intent of the code. The code really just wanted to GC the new instructions built. That can be done more directly by comparing with the base pointer which is the only non-new instruction that this code can return.

  llvm-svn: 225073
* [SROA] Fix the loop exit placement to be prior to indexing the splits | Chandler Carruth | 2015-01-02 | 1 | -4/+8
  array. This prevents it from walking out of bounds on the splits array.

  Bug found with the existing tests by ASan and by the MSVC debug build.

  llvm-svn: 225069
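The general shape of this kind of fix, testing the exit condition before touching the element, can be sketched as follows. The function and its data are hypothetical and are not taken from SROA.

```cpp
#include <cstddef>
#include <vector>

// Sums entries until the first negative value or the end of the array.
// The bounds check comes before the element access, so the loop can never
// read Splits[Splits.size()].
int sumUntilNegative(const std::vector<int> &Splits) {
  int Sum = 0;
  for (std::size_t Idx = 0;; ++Idx) {
    if (Idx >= Splits.size())
      break; // exit first, index afterwards
    if (Splits[Idx] < 0)
      break;
    Sum += Splits[Idx];
  }
  return Sum;
}
```

Swapping the two `if` statements would reintroduce an out-of-bounds read on the final iteration, which is the class of bug that ASan and debug-mode containers tend to catch.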
* [SROA] Fix two total think-os in r225061 that should have been caught on | Chandler Carruth | 2015-01-01 | 1 | -2/+2
  a +asserts bootstrap, but my bootstrap had asserts off. Oops.

  Anyways, in some places it is reasonable to cast (as a sanity check) the pointer operand to a load or store to an instruction within SROA -- namely when the pointer operand is expected to be derived from an alloca, and thus always an instruction. However, the pre-splitting code also deals with loads and stores to non-alloca pointers and there we need to just use the Value*. Nothing about the code relied on the instruction cast, it was only there essentially as an invariant assertion. Remove the two that don't actually hold.

  This should fix the proximate issue in PR22080, but I'm also doing an asserts bootstrap myself to see if there are other issues lurking.

  I'll craft a reduced test case in a moment, but I wanted to get the tree healthy as quickly as possible.

  llvm-svn: 225068
* [SROA] Switch to using a more direct debug logging technique in one part | Chandler Carruth | 2015-01-01 | 1 | -4/+6
  of my new load and store splitting, and fix a bug where it logged a totally irrelevant slice rather than the actual slice in question.

  The logging here previously worked because we used to place new slices onto the back of the core sequence, but that caused other problems. I updated the actual code to store new slices in their own vector but didn't update the logging. There isn't a good way to reuse the logging any more, and frankly it wasn't needed. We can directly log this bit more easily.

  llvm-svn: 225063
* [SROA] Fix formatting with clang-format which I managed to fail to do | Chandler Carruth | 2015-01-01 | 1 | -48/+48
  prior to committing r225061. Sorry for that.

  llvm-svn: 225062
* [SROA] Teach SROA how to much more intelligently handle split loads and | Chandler Carruth | 2015-01-01 | 1 | -2/+484
  stores.

  When there are accesses to an entire alloca with an integer load or store as well as accesses to small pieces of the alloca, SROA splits up the large integer accesses. In order to do that, it uses bit math to merge the small accesses into large integers. While this is effective, it produces insane IR that can cause significant problems in the rest of the optimizer:

  - It can cause load and store mismatches with GVN on the non-alloca side where we end up loading an i64 (or some such) rather than loading specific elements that are stored.
  - We can't always get rid of the integer bit math, which is why we can't always fix the loads and stores to work well with GVN.
  - This is especially bad when we have operations that mix poorly with integer bit math such as floating point operations.
  - It will block things like the vectorizer which might be able to handle the scalar stores that underlie the aggregate.

  At the same time, we can't just directly split up these loads and stores in all cases. If there is actual integer arithmetic involved on the values, then using integer bit math is actually the perfect lowering because we can often combine it heavily with the surrounding math.

  The solution this patch provides is to find places where SROA is partitioning aggregates into small elements, and look for splittable loads and stores that it can split all the way to some other adjacent load and store. These are uniformly the cases where failing to split the loads and stores hurts the optimizer that I have seen, and I've looked extensively at the code produced both from more and less aggressive approaches to this problem.

  However, it is quite tricky to actually do this in SROA. We may have loads and stores to the same alloca, or other complex patterns that are hard to handle. This complexity leads to the somewhat subtle algorithm implemented here. We have to do this entire process as a separate pass over the partitioning of the alloca, and split up all of the loads prior to splitting the stores so that we can handle safely the cases of overlapping, including partially overlapping, loads and stores to the same alloca. We also have to reconstitute the post-split slice configuration so we can avoid iterating again over all the alloca uses (the slow part of SROA). But we also have to ensure that when we split up loads and stores to *other* allocas, we *do* re-iterate over them in SROA to adapt to the more refined partitioning now required.

  With this, I actually think we can fix a long-standing TODO in SROA where I avoided splitting as many loads and stores as probably should be splittable. This limitation historically mitigated the fallout of all the bad things mentioned above. Now that we have more intelligent handling, I plan to remove the FIXME and more aggressively mark integer loads and stores as splittable. I'll do that in a follow-up patch to help with bisecting any fallout.

  The net result of this change should be more fine-grained and accurate scalars being formed out of aggregates. At the very least, Clang now generates perfect code for this high-level test case using std::complex<float>:

      #include <complex>

      void g1(std::complex<float> &x, float a, float b) {
        x += std::complex<float>(a, b);
      }
      void g2(std::complex<float> &x, float a, float b) {
        x -= std::complex<float>(a, b);
      }

      void foo(const std::complex<float> &x, float a, float b,
               std::complex<float> &x1, std::complex<float> &x2) {
        std::complex<float> l1 = x;
        g1(l1, a, b);
        std::complex<float> l2 = x;
        g2(l2, a, b);
        x1 = l1;
        x2 = l2;
      }

  This code isn't just hypothetical either. It was reduced out of the hot inner loops of essentially every part of the Eigen math library when using std::complex<float>. Those loops would consistently and pervasively hop between the floating point unit and the integer unit due to bit math extraction and insertion of floating point values that were "stored" in a 64-bit integer register around the loop backedge.

  So far, this change has passed a bootstrap and I have done some other testing and so far, no issues. That doesn't mean there won't be though, so I'll be prepared to help with any fallout. If you see performance swings in particular, please let me know. I'm very curious what all the impact of this change will be. Stay tuned for the follow-up to also split more integer loads and stores.

  llvm-svn: 225061
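To make the "integer bit math" problem concrete, the sketch below contrasts a pair of floats packed into one 64-bit integer with the same pair kept as two scalars. It is a source-level analogy only, written under the assumption of a 32-bit `float`; the function and type names are invented, and the actual transformation happens on LLVM IR, not on C++ like this.

```cpp
#include <cstdint>
#include <cstring>

static_assert(sizeof(float) == sizeof(uint32_t), "sketch assumes 32-bit float");

// Packed form: both floats live in one 64-bit integer, so every access
// goes through shifts, masks, and integer/float bit copies.
uint64_t packPair(float Lo, float Hi) {
  uint32_t LoBits, HiBits;
  std::memcpy(&LoBits, &Lo, sizeof(LoBits));
  std::memcpy(&HiBits, &Hi, sizeof(HiBits));
  return static_cast<uint64_t>(LoBits) | (static_cast<uint64_t>(HiBits) << 32);
}

float extractHiPacked(uint64_t Packed) {
  uint32_t HiBits = static_cast<uint32_t>(Packed >> 32);
  float Hi;
  std::memcpy(&Hi, &HiBits, sizeof(Hi));
  return Hi;
}

// Split form: each component is its own scalar and can stay in a
// floating point register with no integer arithmetic in between.
struct SplitPair {
  float Lo;
  float Hi;
};

float extractHiSplit(const SplitPair &P) { return P.Hi; }
```

Both forms hold the same values; the split form simply avoids the round trip through integer operations that the commit message identifies as the source of the slowdown in floating point loops.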
* [SROA] Update the documentation and names for accessing the slices | Chandler Carruth | 2014-12-24 | 1 | -29/+36
  within a partition of an alloca in SROA.

  This reflects the fact that the organization of the slices isn't really ideal for analysis, but is the naive way in which the slices are available while we're processing them in the core partitioning algorithm.

  It is possible we could improve matters, and I've left a FIXME with one of my ideas for how to do this, but it is a lot of work, the benefit is somewhat minor, and it isn't clear that it would be strictly better. =/ Not really satisfying, but I'm out of really good ideas.

  This also improves one place where the debug logging failed to mark some split partitions. Now we log in one place, slightly later, and with accurate information about whether the slice is split by the partition being rewritten.

  llvm-svn: 224800
* [SROA] Refactor the integer and vector promotion testing logic to | Chandler Carruth | 2014-12-24 | 1 | -47/+35
  operate in terms of the new Partition class, and generally have a clearer set of arguments. No functionality changed.

  The most notable improvements here are consistently using the terminology of 'partition' for a collection of slices that will be rewritten together and 'slice' for a region of an alloca that is used by a particular instruction.

  This also makes it clearer that the split things are actually slices as well, just ones that will be split by the proposed partition.

  This doesn't yet address the confusing aspects of the partition's interface where slices that will be split by the partition and start prior to the partition are accessed via Partition::splitSlices() while the core range of slices exposed by a Partition includes both unsplit slices and slices which will be split by the end, but started within the offset range of the partition. This is particularly hard to address because the algorithm which computes partitions quite literally doesn't know which slices these will end up being until too late. I'm looking at whether I can fix that or not, but I'm not optimistic. I'll update the comments and/or names to further explain this either way. I've also added one FIXME in this patch relating to this confusion so that I don't forget about it.

  llvm-svn: 224798
* Revert r224739: Debug info: Teach SROA how to update debug info for | Chandler Carruth | 2014-12-23 | 1 | -30/+1
  fragmented variables.

  This caused codegen to start crashing when we built somewhat large programs with debug info and optimizations. 'check-msan' hit it, and I suspect a bootstrap would as well. I mailed a test case to the review thread.

  llvm-svn: 224750
* [SROA] Lift the logic for traversing the alloca slices one partition at | Chandler Carruth | 2014-12-22 | 1 | -157/+303
  a time into a partition iterator and a Partition class.

  There is a lot of knock-on simplification that this enables, largely stemming from having a Partition object to refer to in lots of helpers. I've only done a minimal amount of that because enough stuff is changing as-is in this commit.

  This shouldn't change any observable behavior. I've worked hard to preserve the *exact* traversal semantics which were originally present even though some of them make no sense. I'll be changing some of this in subsequent commits now that the logic is carefully factored into a reusable place.

  The primary motivation for this change is to break the rewriting into phases in order to support more intelligent rewriting. For example, I'm planning to change how split loads and stores are rewritten to remove the significant overuse of integer bit packing in the resulting code and allow more effective secondary splitting of aggregates. For any of this to work, they have to share the exact traversal logic.

  llvm-svn: 224742
* Debug info: Teach SROA how to update debug info for fragmented variables. | Adrian Prantl | 2014-12-22 | 1 | -1/+30
  This allows us to generate debug info for extremely advanced code such as

      typedef struct { long int a; int b;} S;

      int foo(S s) {
        return s.b;
      }

  which at -O1 on x86_64 is codegen'd into

      define i32 @foo(i64 %s.coerce0, i32 %s.coerce1) #0 {
        ret i32 %s.coerce1, !dbg !24
      }

  with this patch we emit the following debug info for this

      TAG_formal_parameter [3]
        AT_location( 0x00000000
          0x0000000000000000 - 0x0000000000000006: rdi, piece 0x00000008, rsi, piece 0x00000004
          0x0000000000000006 - 0x0000000000000008: rdi, piece 0x00000008, rax, piece 0x00000004 )
        AT_name( "s" )
        AT_decl_file( "/Volumes/Data/llvm/_build.ninja.release/test.c" )

  Thanks to chandlerc, dblaikie, and echristo for their feedback on all previous iterations of this patch!

  llvm-svn: 224739
* [SROA] Run clang-format over the entire SROA pass as I wrote it before | Chandler Carruth | 2014-12-20 | 1 | -157/+138
  much of the glory of clang-format, and now any time I touch it I risk introducing formatting changes as part of a functional commit.

  Also, clang-format is *way* better at formatting my code than I am. Most of this is a huge improvement although I reverted a couple of places where I hit a clang-format bug with lambdas that has been filed but not (fully) fixed.

  llvm-svn: 224666
* [SROA] Cleanup - remove the use of std::mem_fun_ref nonsense and use | Chandler Carruth | 2014-12-18 | 1 | -1/+3
  a lambda now that we have them.

  llvm-svn: 224500
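For readers who have not met `std::mem_fun_ref`, the sketch below shows the lambda style that replaces it. The predicate is a made-up example; the commit message does not say which member function the SROA code was adapting.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Remove empty strings from a list. Before lambdas, the predicate would
// typically have been spelled std::mem_fun_ref(&std::string::empty), an
// adaptor that was deprecated in C++11 and removed in C++17.
std::vector<std::string> dropEmpty(std::vector<std::string> Names) {
  Names.erase(std::remove_if(Names.begin(), Names.end(),
                             [](const std::string &S) { return S.empty(); }),
              Names.end());
  return Names;
}
```

The lambda states the condition inline and keeps compiling under newer language standards, where the old adaptor no longer exists.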
* IR: Split Metadata from Value | Duncan P. N. Exon Smith | 2014-12-09 | 1 | -7/+9
  Split `Metadata` away from the `Value` class hierarchy, as part of PR21532. Assembly and bitcode changes are in the wings, but this is the bulk of the change for the IR C++ API.

  I have a follow-up patch prepared for `clang`. If this breaks other sub-projects, I apologize in advance :(. Help me compile it on Darwin and I'll try to fix it. FWIW, the errors should be easy to fix, so it may be simpler to just fix it yourself.

  This breaks the build for all metadata-related code that's out-of-tree. Rest assured the transition is mechanical and the compiler should catch almost all of the problems.

  Here's a quick guide for updating your code:

  - `Metadata` is the root of a class hierarchy with three main classes: `MDNode`, `MDString`, and `ValueAsMetadata`. It is distinct from the `Value` class hierarchy. It is typeless -- i.e., instances do *not* have a `Type`.

  - `MDNode`'s operands are all `Metadata *` (instead of `Value *`).

  - `TrackingVH<MDNode>` and `WeakVH` referring to metadata can be replaced with `TrackingMDNodeRef` and `TrackingMDRef`, respectively.

    If you're referring solely to resolved `MDNode`s -- post graph construction -- just use `MDNode*`.

  - `MDNode` (and the rest of `Metadata`) have only limited support for `replaceAllUsesWith()`.

    As long as an `MDNode` is pointing at a forward declaration -- the result of `MDNode::getTemporary()` -- it maintains a side map of its uses and can RAUW itself. Once the forward declarations are fully resolved RAUW support is dropped on the ground. This means that uniquing collisions on changing operands cause nodes to become "distinct". (This already happened fairly commonly, whenever an operand went to null.)

    If you're constructing complex (non self-reference) `MDNode` cycles, you need to call `MDNode::resolveCycles()` on each node (or on a top-level node that somehow references all of the nodes). Also, don't do that. Metadata cycles (and the RAUW machinery needed to construct them) are expensive.

  - An `MDNode` can only refer to a `Constant` through a bridge called `ConstantAsMetadata` (one of the subclasses of `ValueAsMetadata`).

    As a side effect, accessing an operand of an `MDNode` that is known to be, e.g., `ConstantInt`, takes three steps: first, cast from `Metadata` to `ConstantAsMetadata`; second, extract the `Constant`; third, cast down to `ConstantInt`.

    The eventual goal is to introduce `MDInt`/`MDFloat`/etc. and have metadata schema owners transition away from using `Constant`s when the type isn't important (and they don't care about referring to `GlobalValue`s).

    In the meantime, I've added transitional API to the `mdconst` namespace that matches semantics with the old code, in order to avoid adding the error-prone three-step equivalent to every call site. If your old code was:

        MDNode *N = foo();
        bar(isa             <ConstantInt>(N->getOperand(0)));
        baz(cast            <ConstantInt>(N->getOperand(1)));
        bak(cast_or_null    <ConstantInt>(N->getOperand(2)));
        bat(dyn_cast        <ConstantInt>(N->getOperand(3)));
        bay(dyn_cast_or_null<ConstantInt>(N->getOperand(4)));

    you can trivially match its semantics with:

        MDNode *N = foo();
        bar(mdconst::hasa               <ConstantInt>(N->getOperand(0)));
        baz(mdconst::extract            <ConstantInt>(N->getOperand(1)));
        bak(mdconst::extract_or_null    <ConstantInt>(N->getOperand(2)));
        bat(mdconst::dyn_extract        <ConstantInt>(N->getOperand(3)));
        bay(mdconst::dyn_extract_or_null<ConstantInt>(N->getOperand(4)));

    and when you transition your metadata schema to `MDInt`:

        MDNode *N = foo();
        bar(isa             <MDInt>(N->getOperand(0)));
        baz(cast            <MDInt>(N->getOperand(1)));
        bak(cast_or_null    <MDInt>(N->getOperand(2)));
        bat(dyn_cast        <MDInt>(N->getOperand(3)));
        bay(dyn_cast_or_null<MDInt>(N->getOperand(4)));

  - A `CallInst` -- specifically, intrinsic instructions -- can refer to metadata through a bridge called `MetadataAsValue`. This is a subclass of `Value` where `getType()->isMetadataTy()`.

    `MetadataAsValue` is the *only* class that can legally refer to a `LocalAsMetadata`, which is a bridged form of non-`Constant` values like `Argument` and `Instruction`. It can also refer to any other `Metadata` subclass.

  (I'll break all your testcases in a follow-up commit, when I propagate this change to assembly.)

  llvm-svn: 223802
* SROA: The alloca type isn't a candidate promotion type for vectors (David Majnemer, 2014-11-21, 1 file, -3/+2)
  The alloca's type is irrelevant; only those types which are used in a load or store of the exact size of the slice should be considered. This manifested as an assertion failure when we compared the various types: we had a size mismatch. This fixes PR21480.
  llvm-svn: 222499
* Update SetVector to rely on the underlying set's insert to return a pair<iterator, bool> (David Blaikie, 2014-11-19, 1 file, -7/+7)
  This is to be consistent with StringSet and ultimately with the standard library's associative container insert function. This led to updating SmallSet::insert to return pair<iterator, bool>, and then to update SmallPtrSet::insert to return pair<iterator, bool>, and then to update all the existing users of those functions...
  llvm-svn: 222334
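As an illustration of the convention this commit aligns with, here is a minimal sketch of an insertion-ordered set built on `std::set`, whose `insert` returns `pair<iterator, bool>`. `TinySetVector` is a hypothetical name for this example and is not `llvm::SetVector`; the sketch only shows how a wrapper can rely on the `bool` half of the pair.

```cpp
#include <set>
#include <utility>
#include <vector>

// Hypothetical, simplified stand-in for a SetVector-like container:
// a set for uniqueness plus a vector that remembers insertion order.
template <typename T> class TinySetVector {
  std::set<T> Set;
  std::vector<T> Order;

public:
  // Returns true if the value was newly inserted, false if it was
  // already present. The decision comes from the .second member of
  // the pair<iterator, bool> returned by std::set::insert.
  bool insert(const T &Value) {
    bool Inserted = Set.insert(Value).second;
    if (Inserted)
      Order.push_back(Value);
    return Inserted;
  }

  // Elements in the order they were first inserted.
  const std::vector<T> &items() const { return Order; }
};
```

Because the underlying set reports whether the insertion happened, the wrapper needs only one lookup per insert instead of a separate membership test followed by an insert.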
* [SROA] Change how SROA does vector-based promotion of allocas to handle cases where the alloca type, the load types, and the store types used all disagree. (Chandler Carruth, 2014-10-18, 1 file, -44/+128)
  Previously, the only way that vector-based promotion occurred was if the alloca type was a vector type. This was one of the *very* few remaining uses of the alloca's type to guide SROA/mem2reg left in LLVM. It turns out it was a bad idea. The alloca type can change very easily based on the mixture of types loaded and stored to that alloca. We shouldn't be relying on it as a signal for very much. Instead, the source of truth should be loads and stores. We should canonicalize the loads and stores as much as possible and then rely on them exclusively in SROA.

  When looking at loads and stores, we may find many different candidate vector types. This change will let SROA try all of them to find a vector type which is a viable way to promote the entire alloca to a vector register.

  With this change, it becomes possible to do better canonicalization and optimization of loads and stores without breaking SROA in random ways, and that should allow fixing a core source of performance loss in hot numerical loops such as those in Eigen.
  llvm-svn: 220116
* [SROA] Switch the common variable name for the 'AllocaSlices' class to 'AS'. (Chandler Carruth, 2014-10-16, 1 file, -40/+42)
  Using 'S' as this was a terrible idea. Arguably, 'AS' is not much better, but it at least follows the idea of using initialisms and removes active confusion about the AllocaSlices variable and a Slice variable.
  llvm-svn: 219963
* [SROA] More range-based cleanups to SROA, these brought to you by clang-modernize. (Chandler Carruth, 2014-10-16, 1 file, -25/+12)
  I did have to clean up the variable types and whitespace a bit because the use of auto made the code much less readable here.
  llvm-svn: 219962
* [SROA] Switch a couple of overly complex iterator accessors to just be ArrayRef accessors. (Chandler Carruth, 2014-10-16, 1 file, -26/+10)
  I think this even came up in review that this was over-engineered, and indeed it was. Time to un-build it.
  llvm-svn: 219958
* [SROA] Start more deeply moving SROA to use ranges rather than just iterators. (Chandler Carruth, 2014-10-16, 1 file, -45/+42)
  There are a ton of places where it essentially wants ranges rather than just iterators. This is just the first step that adds the core slice range typedefs and uses them in a couple of places. I still have to explicitly construct them because they've not been punched throughout the entire set of code. More range-based cleanups incoming.
  llvm-svn: 219955
* Move the complex address expression out of DIVariable and into an extra argument of the llvm.dbg.declare/llvm.dbg.value intrinsics. (Adrian Prantl, 2014-10-01, 1 file, -2/+2)
  Previously, DIVariable was a variable-length field that has an optional reference to a Metadata array consisting of a variable number of complex address expressions. In the case of OpPiece expressions this is wasting a lot of storage in IR, because when an aggregate type is, e.g., SROA'd into all of its n individual members, the IR will contain n copies of the DIVariable, all alike, only differing in the complex address reference at the end.

  By making the complex address into an extra argument of the dbg.value/dbg.declare intrinsics, all of the pieces can reference the same variable and the complex address expressions can be uniqued across the CU, too. Down the road, this will allow us to move other flags, such as "indirection" out of the DIVariable, too.

  The new intrinsics look like this:

      declare void @llvm.dbg.declare(metadata %storage, metadata %var, metadata %expr)
      declare void @llvm.dbg.value(metadata %storage, i64 %offset, metadata %var, metadata %expr)

  This patch adds a new LLVM-local tag to DIExpressions, so we can detect and pretty-print DIExpression metadata nodes.

  What this patch doesn't do: This patch does not touch the "Indirect" field in DIVariable; but moving that into the expression would be a natural next step.

  http://reviews.llvm.org/D4919
  rdar://problem/17994491

  Thanks to dblaikie and dexonsmith for reviewing this patch!

  Note: I accidentally committed a bogus older version of this patch previously.
  llvm-svn: 218787
* Revert r218778 while investigating buildbot breakage. (Adrian Prantl, 2014-10-01, 1 file, -2/+2)
  "Move the complex address expression out of DIVariable and into an extra"
  llvm-svn: 218782
* Move the complex address expression out of DIVariable and into an extra argument of the llvm.dbg.declare/llvm.dbg.value intrinsics. (Adrian Prantl, 2014-10-01, 1 file, -2/+2)
  Previously, DIVariable was a variable-length field that has an optional reference to a Metadata array consisting of a variable number of complex address expressions. In the case of OpPiece expressions this is wasting a lot of storage in IR, because when an aggregate type is, e.g., SROA'd into all of its n individual members, the IR will contain n copies of the DIVariable, all alike, only differing in the complex address reference at the end.

  By making the complex address into an extra argument of the dbg.value/dbg.declare intrinsics, all of the pieces can reference the same variable and the complex address expressions can be uniqued across the CU, too. Down the road, this will allow us to move other flags, such as "indirection" out of the DIVariable, too.

  The new intrinsics look like this:

      declare void @llvm.dbg.declare(metadata %storage, metadata %var, metadata %expr)
      declare void @llvm.dbg.value(metadata %storage, i64 %offset, metadata %var, metadata %expr)

  This patch adds a new LLVM-local tag to DIExpressions, so we can detect and pretty-print DIExpression metadata nodes.

  What this patch doesn't do: This patch does not touch the "Indirect" field in DIVariable; but moving that into the expression would be a natural next step.

  http://reviews.llvm.org/D4919
  rdar://problem/17994491

  Thanks to dblaikie and dexonsmith for reviewing this patch!
  llvm-svn: 218778
* Make use of @llvm.assume in ValueTracking (computeKnownBits, etc.) (Hal Finkel, 2014-09-07, 1 file, -1/+6)
  This change, which allows @llvm.assume to be used from within computeKnownBits (and other associated functions in ValueTracking), adds some (optional) parameters to computeKnownBits and friends. These functions now (optionally) take a "context" instruction pointer, an AssumptionTracker pointer, and also a DomTree pointer, and most of the changes are just to pass this new information when it is easily available from InstSimplify, InstCombine, etc.

  As explained below, the significant conceptual change is that known properties of a value might depend on the control-flow location of the use (because we care that the @llvm.assume dominates the use because assumptions have control-flow dependencies). This means that, when we ask if bits are known in a value, we might get different answers for different uses.

  The significant changes are all in ValueTracking. Two main changes: First, as with the rest of the code, new parameters need to be passed around. To make this easier, I grouped them into a structure, and I made internal static versions of the relevant functions that take this structure as a parameter. The new code does as you might expect, it looks for @llvm.assume calls that make use of the value we're trying to learn something about (often indirectly), attempts to pattern match that expression, and uses the result if successful. By making use of the AssumptionTracker, the process of finding @llvm.assume calls is not expensive.

  Part of the structure being passed around inside ValueTracking is a set of already-considered @llvm.assume calls. This is to prevent a query using, for example, the assume(a == b), to recurse on itself. The context and DT params are used to find applicable assumptions. An assumption needs to dominate the context instruction, or come after it deterministically. In this latter case we only handle the specific case where both the assumption and the context instruction are in the same block, and we need to exclude assumptions from being used to simplify their own ephemeral values (those which contribute only to the assumption) because otherwise the assumption would prove its feeding comparison trivial and would be removed.

  This commit adds the plumbing and the logic for a simple masked-bit propagation (just enough to write a regression test). Future commits add more patterns (and, correspondingly, more regression tests).
  llvm-svn: 217342
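To make "masked-bit propagation" concrete, the sketch below assumes the assumption has the shape `(v & mask) == expected` and derives which bits of `v` must be zero and which must be one. The struct and function names are hypothetical, 32-bit values are assumed for simplicity, and the exact pattern the commit matches is not stated in the message, so treat this as an illustration of the idea only.

```cpp
#include <cstdint>

// Hypothetical result type: a set bit in `zero` means that bit of the
// value is known to be 0; a set bit in `one` means it is known to be 1.
struct KnownBitsSketch {
  std::uint32_t zero = 0;
  std::uint32_t one = 0;
};

// Given an assumed fact of the form (v & mask) == expected, every bit
// selected by `mask` is known: it equals the corresponding bit of
// `expected`. Bits outside the mask stay unknown.
// Assumes `expected` has no bits set outside `mask`; otherwise the
// assumption could never hold.
KnownBitsSketch knownFromMaskedAssume(std::uint32_t mask,
                                      std::uint32_t expected) {
  KnownBitsSketch known;
  known.zero = mask & ~expected; // masked bits that must be 0
  known.one = mask & expected;   // masked bits that must be 1
  return known;
}
```

For example, assuming `(v & 0xF) == 0x5` tells us the low four bits of `v` are `0101` and nothing about the remaining bits. In the real change, such a fact would only be applied at uses that the assumption dominates, as the message explains.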
* SROA: Don't insert instructions before a PHI (David Majnemer, 2014-09-01, 1 file, -1/+4)
  SROA may decide that it needs to insert a bitcast and would set its insertion point before a PHI. This will create an invalid module right quick. Instead, choose the first insertion point in the basic block that holds our PHI. This fixes PR20822.
  Differential Revision: http://reviews.llvm.org/D5141
  llvm-svn: 216891
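The rule behind this fix (PHI nodes must stay grouped at the top of a basic block, so new instructions go after them) can be sketched on a toy block representation. `Op` and `firstInsertionIndex` are hypothetical names for this example; a real basic block has more cases to skip than plain PHIs, which this simplified sketch ignores.

```cpp
#include <cstddef>
#include <vector>

// Toy instruction kinds: only the PHI/non-PHI distinction matters here.
enum class Op { Phi, Other };

// Returns the index of the first position in the block where a
// non-PHI instruction may legally be inserted, i.e. just past the
// leading run of PHI nodes. Simplified: ignores every other kind of
// instruction that a real compiler would also need to skip.
std::size_t firstInsertionIndex(const std::vector<Op> &block) {
  std::size_t index = 0;
  while (index < block.size() && block[index] == Op::Phi)
    ++index;
  return index;
}
```

Inserting a bitcast at the returned index keeps every PHI ahead of it, which is the property the original insertion point violated.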
* [SROA] Fold a PHI node if all its incoming values are the same (Jingyue Wu, 2014-08-22, 1 file, -41/+41)
  Summary: Fixes PR20425. During slice building, if all of the incoming values of a PHI node are the same, replace the PHI node with the common value. This simplification makes allocas used by PHI nodes easier to promote.
  Test Plan: Added three more tests in phi-and-select.ll
  Reviewers: nlewycky, eliben, meheff, chandlerc
  Reviewed By: chandlerc
  Subscribers: zinovy.nis, hfinkel, baldrick, llvm-commits
  Differential Revision: http://reviews.llvm.org/D4659
  llvm-svn: 216299
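The folding condition in the summary can be sketched in a few lines. `Value` here is a toy placeholder and `commonIncomingValue` is a hypothetical helper; the sketch only checks whether all incoming values are the same object, leaving out the replacement itself and any special cases the real pass handles.

```cpp
#include <vector>

// Toy placeholder for an IR value; identity is by address.
struct Value {
  int id;
};

// Returns the single value that every incoming edge carries, or
// nullptr if the incoming values differ (or there are none), in which
// case the PHI cannot be folded this way.
const Value *commonIncomingValue(const std::vector<const Value *> &incoming) {
  if (incoming.empty())
    return nullptr;
  const Value *first = incoming.front();
  for (const Value *v : incoming)
    if (v != first)
      return nullptr;
  return first;
}
```

When the helper returns a non-null value, every use of the PHI could be rewritten to use that value directly, which removes the PHI as an obstacle to promoting the alloca.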
* SROA: Handle a case of store size being smaller than allocation size (Reid Kleckner, 2014-08-22, 1 file, -4/+6)
  In this case, we are creating an x86_fp80 slice for a union from C where the padding bytes may contain real data. An x86_fp80 alloca is 16 bytes, and that's just fine. We can't, however, use regular loads and stores to access the slice, because the store size is only 10 bytes / 80 bits. Instead, use memcpy and memset. Fixes PR18726.
  Reviewed By: chandlerc
  Differential Revision: http://reviews.llvm.org/D5012
  llvm-svn: 216248
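The 10-byte versus 16-byte mismatch follows from simple rounding, sketched below. The function names are hypothetical, and the 16-byte ABI alignment used in the example is an assumption chosen to match the 16-byte alloca size quoted in the message; other targets may align x86_fp80 differently.

```cpp
#include <cstdint>

// Bytes actually written by a store of a `bits`-wide value:
// the bit width rounded up to whole bytes.
std::uint64_t storeSizeBytes(std::uint64_t bits) { return (bits + 7) / 8; }

// Bytes reserved for an allocation of that value: the store size
// rounded up to a multiple of the ABI alignment (in bytes).
std::uint64_t allocSizeBytes(std::uint64_t bits, std::uint64_t abiAlign) {
  std::uint64_t store = storeSizeBytes(bits);
  return (store + abiAlign - 1) / abiAlign * abiAlign;
}
```

With 80 bits and an assumed 16-byte alignment, the store size is 10 bytes and the allocation size is 16 bytes. The remaining 6 bytes are padding that a plain x86_fp80 load or store would not touch, which is why the commit switches to memcpy and memset when those bytes may hold real data.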
* Replace SmallPtrSet with SmallPtrSetImpl in function arguments to avoid needing to mention the size. (Craig Topper, 2014-08-21, 1 file, -3/+3)
  llvm-svn: 216158
* Revert "Replace SmallPtrSet with SmallPtrSetImpl in function arguments to avoid needing to mention the size." (Craig Topper, 2014-08-18, 1 file, -3/+3)
  Getting a weird buildbot failure that I need to investigate.
  llvm-svn: 215870
* Replace SmallPtrSet with SmallPtrSetImpl in function arguments to avoid needing to mention the size. (Craig Topper, 2014-08-17, 1 file, -3/+3)
  llvm-svn: 215868