summaryrefslogtreecommitdiffstats
path: root/llvm/lib/CodeGen/CodeGenPrepare.cpp
Commit message (Collapse)AuthorAgeFilesLines
...
* Unmerge GEPs to reduce register pressure on IndirectBr edges.Hiroshi Yamauchi2017-09-111-0/+167
| | | | | | | | | | | | | | | | | | | | | | | Summary: GEP merging can sometimes increase the number of live values and register pressure across control edges and cause performance problems particularly if the increased register pressure results in spills. This change implements GEP unmerging around an IndirectBr in certain cases to mitigate the issue. This is in the CodeGenPrepare pass (after all the GEP merging has happened.) With this patch, the Python interpreter loop runs faster by ~5%. Reviewers: sanjoy, hfinkel Reviewed By: hfinkel Subscribers: eastig, junbuml, llvm-commits Differential Revision: https://reviews.llvm.org/D36772 llvm-svn: 312930
* [CodeGen] Fix some Clang-tidy modernize-use-using and Include What You Use ↵Eugene Zelenko2017-08-291-66/+151
| | | | | | warnings; other minor fixes (NFC). llvm-svn: 312053
* [Instruction] add moveAfter() convenience function; NFCISanjay Patel2017-08-291-6/+3
| | | | | | | | | As suggested in D37121, here's a wrapper for removeFromParent() + insertAfter(), but implemented using moveBefore() for symmetry/efficiency. Differential Revision: https://reviews.llvm.org/D37239 llvm-svn: 312001
* Move helper classes into anonymous namespaces.Benjamin Kramer2017-08-201-0/+2
| | | | | | No functionality change intended. llvm-svn: 311288
* [CGP] Fix the rematerialization of gc.relocatesSerguei Katkov2017-08-171-0/+15
| | | | | | | | | | | | | | | | | | | If we want to substitute the relocation of derived pointer with gep of base then we must ensure that relocation of base dominates the relocation of derived pointer. Currently only check for basic block is present. However it is possible that both relocation are in the same basic block but relocation of derived pointer is defined earlier. The patch moves the relocation of base pointer right before relocation of derived pointer in this case. Reviewers: sanjoy,artagnon,igor-laevsky,reames Reviewed By: reames Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D36462 llvm-svn: 311067
* [CGP] use narrower types in memcmp expansion when possibleSanjay Patel2017-08-011-1/+6
| | | | | | | | This only affects very small memcmp on x86 for now, but it will become more important if we allow vector-sized load and compares. llvm-svn: 309711
* [CGP] use subtract or subtract-of-cmps for result of memcmp expansionSanjay Patel2017-07-311-6/+18
| | | | | | | | | | | | | | | | | | | | | | As noted in the code comment, transforming this in the other direction might require a separate transform here in CGP given the block-at-a-time DAG constraint. Besides that theoretical motivation, there are 2 practical motivations for the subtract-of-cmps form: 1. The codegen for both x86 and PPC is better for this IR (though PPC could be better still). There is discussion about canonicalizing IR to the select form ( http://lists.llvm.org/pipermail/llvm-dev/2017-July/114885.html ), so we probably need to add DAG transforms for those patterns anyway, but this improves the memcmp output without waiting for that step. 2. If we allow vector-sized chunks for the load and compare, x86 is better prepared to convert that to optimal code when using subtract-of-cmps, so another prerequisite patch is avoided if we choose to enable that. Differential Revision: https://reviews.llvm.org/D34904 llvm-svn: 309597
* Guard print() functions only used by dump() functions.Florian Hahn2017-07-311-1/+1
| | | | | | | | | | | | | | | | | | | Summary: Since r293359, most dump() function are only defined when `!defined(NDEBUG) || defined(LLVM_ENABLE_DUMP)` holds. print() functions only used by dump() functions are now unused in release builds, generating lots of warnings. This patch only defines some print() functions if they are used. Reviewers: MatzeB Reviewed By: MatzeB Subscribers: arsenm, mzolotukhin, nhaehnle, llvm-commits Differential Revision: https://reviews.llvm.org/D35949 llvm-svn: 309553
* [CodeGenPrepare] Cut off FindAllMemoryUses if there are too many uses.Benjamin Kramer2017-07-241-4/+13
| | | | | | | | | | | | | | | | | | | This avoids excessive compile time. The case I'm looking at is Function.cpp from an old version of LLVM that still had the giant memcmp string matcher in it. Before r308322 this compiled in about 2 minutes, after it, clang takes infinite* time to compile it. With this patch we're at 5 min, which is still bad but this is a pathological case. The cut off at 20 uses was chosen by looking at other cut-offs in LLVM for user scanning. It's probably too high, but does the job and is very unlikely to regress anything. Fixes PR33900. * I'm impatient and aborted after 15 minutes, on the bug report it was killed after 2h. llvm-svn: 308891
* [CGP] Allow cycles during Phi traversal in OptimizaMemoryInstSerguei Katkov2017-07-191-5/+11
| | | | | | | | | | | | | | Allowing cycles in Phi traversal increases the scope of optimize memory instruction in case we are in loop. The added test shows an example of enabling optimization inside a loop. Reviewers: loladiro, spatel, efriedma Reviewed By: efriedma Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D35294 llvm-svn: 308419
* [CGP] Cleanup - remove redundant code in OptimizeMemoryInst. NFCSerguei Katkov2017-07-181-35/+12
| | | | | | | | | | | | | | | | | | optimizeMemoryInst contains a vector AddrModeInsts. The only use of this vector is to check that all instructions are in the same block as memory instruction. This check is guarded by PhiSeen flag, so if we traversed through phi node then we do not need to keep information in AddrModeInsts. AddModeInsts is set first time we found some addressing mode and updated if we found new one later. We can find next addressing mode only if we traverse phi node so all code related to update of AddModeInsts can be safely removed. Reviewers: loladiro, spatel, efriedma Reviewed By: efriedma Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D35291 llvm-svn: 308265
* [TTI] Refine the cost of EXT in getUserCost()Haicheng Wu2017-07-151-19/+1
| | | | | | | | | | | Now, getUserCost() only checks the src and dst types of EXT to decide it is free or not. This change first checks the types, then calls isExtFreeImpl(), and check if EXT can form ExtLoad at last. Currently, only AArch64 has customized implementation of isExtFreeImpl() to check if EXT can be folded into its use. Differential Revision: https://reviews.llvm.org/D34458 llvm-svn: 308076
* [CodeGenPrepare] Don't create dead instructions in addrmode sinkingEli Friedman2017-07-121-12/+18
| | | | | | | | | | When we fail to sink an instruction, we must make sure not to modify the function; otherwise, we end up in an infinite loop because CodeGenPrepare iterates until it doesn't make any changes. Fixes https://bugs.llvm.org/show_bug.cgi?id=33608 . llvm-svn: 307866
* [CGP] Relax a bit restriction for optimizeMemoryInst to extend scopeSerguei Katkov2017-07-111-2/+5
| | | | | | | | | | | | | | | | | | CodeGenPrepare::optimizeMemoryInst contains a check that we do nothing if all instructions combining the address for memory instruction is in the same block as memory instruction itself. However if any of these instruction are placed after memory instruction then address calculation will not be folded to memory instruction. The added test case shows an example. Reviewers: loladiro, spatel, efriedma Reviewed By: efriedma Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D34862 llvm-svn: 307628
* [CodeGenPrepare] Don't create inttoptr for ni ptrsKeno Fischer2017-06-291-8/+23
| | | | | | | | | | | | | Summary: Arguably non-integral pointers probably shouldn't show up here at all, but since the backend doesn't complain and this takes valid (according to the Verifier) IR and makes it invalid, make sure not to introduce any inttoptr instructions if we're dealing with non-integral pointers. Reviewed By: sanjoy Differential Revision: https://reviews.llvm.org/D33110 llvm-svn: 306737
* [CGP] add specialization for memcmp expansion with only one basic blockSanjay Patel2017-06-271-1/+41
| | | | llvm-svn: 306485
* [CGP] eliminate a sub instruction in memcmp expansionSanjay Patel2017-06-271-5/+2
| | | | | | | | | | | | | | | | | | | | | | As noted in D34071, there are some IR optimization opportunities that could be handled by normal IR passes if this expansion wasn't happening so late in CGP. Regardless of that, it seems wasteful to knowingly produce suboptimal IR here, so I'm proposing this change: %s = sub i32 %x, %y %r = icmp ne %s, 0 => %r = icmp ne %x, %y Changing the predicate to 'eq' mimics what InstCombine would do, so that's just an efficiency improvement if we decide this expansion should happen sooner. The fact that the PowerPC backend doesn't eliminate the 'subf.' might be something for PPC folks to investigate separately. Differential Revision: https://reviews.llvm.org/D34416 llvm-svn: 306471
* [CGP] simplify code to get bswap in memcmp expansion; NFCISanjay Patel2017-06-271-3/+1
| | | | llvm-svn: 306452
* [CGP] add an IR builder to memcmp expansion class instead of recreating it; NFCISanjay Patel2017-06-271-19/+6
| | | | | | | This was a clean-up suggestion from: https://reviews.llvm.org/D34005 llvm-svn: 306438
* fix trivial typos, NFCHiroshi Inoue2017-06-271-1/+1
| | | | | | succesor -> successor llvm-svn: 306393
* [CGP, memcmp] replace CreateZextOrTrunc with CreateZext because it can never ↵Sanjay Patel2017-06-211-5/+7
| | | | | | trunc llvm-svn: 305936
* [CGP] fix variables to be unsigned in memcmp expansionSanjay Patel2017-06-211-12/+14
| | | | llvm-svn: 305935
* [CGP, PowerPC] try to constant fold before creating loads for memcmp expansionSanjay Patel2017-06-191-3/+13
| | | | | | | | | | | | | | | This is the last step needed to avoid regressions for x86 before we flip the switch to allow expansion of the smallest set of memcpy() via CGP. The DAG version checks for constant strings, so we need to do that here too. FWIW, the 2 constant test is not handled by LibCallSimplifier::optimizeMemCmp() because that code is limited to 8-bit constant arrays. LibCallSimplifier will also fail to optimize some 1 constant tests because its alignment requirements are too strict (shouldn't require alignment for a constant operand). Differential Revision: https://reviews.llvm.org/D34071 llvm-svn: 305734
* Allow -profile-guided-section-prefix more than onceDavid Callahan2017-06-141-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | Summary: At present, `-profile-guided-section-prefix` is a `cl::Optional` option, which means it demands to be passed exactly zero or one times. Our build system makes it pretty tricky to guarantee this. We often accidentally pass the flag more than once (but always with the same "false" value) which results in an error, after which compilation fails: ``` clang (LLVM option parsing): for the -profile-guided-section-prefix option: may only occur zero or one times! ``` While we work on improving our build system, it also seems reasonable just to allow `-profile-guided-section-prefix` to be passed more than once, by to `cl::ZeroOrMore`. Quoting [[ http://llvm.org/docs/CommandLine.html#controlling-the-number-of-occurrences-required-and-allowed | the documentation ]]: > The cl::ZeroOrMore modifier ... indicates that your program will allow the option to be specified zero or more times. > ... > If an option is specified multiple times for an option of the cl::opt class, only the last value will be retained. Reviewers: danielcdh Reviewed By: danielcdh Subscribers: twoh, david2050, llvm-commits Differential Revision: https://reviews.llvm.org/D34219 llvm-svn: 305413
* [CGP] add a reference to DataLayout in MemCmpExpansion; NFCISanjay Patel2017-06-091-20/+22
| | | | | | | | We're currently passing endian-ness around as a param (and not uniformly), so this eliminates the need for that. I'd like to add a constant fold call too, and that requires a DL. llvm-svn: 305129
* fix formatting; NFCSanjay Patel2017-06-081-6/+6
| | | | llvm-svn: 305008
* [CGP] don't expand a memcmp with nobuiltin attributeSanjay Patel2017-06-081-6/+4
| | | | | | | | | | | | | | | | This matches the behavior used in the SDAG when expanding memcmp. For reference, we're intentionally treating the earlier fortified call transforms differently after: https://bugs.llvm.org/show_bug.cgi?id=23093 https://reviews.llvm.org/rL233776 One motivation for not transforming nobuiltin calls is that it can interfere with sanitizers: https://reviews.llvm.org/D19781 https://reviews.llvm.org/D19801 Differential Revision: https://reviews.llvm.org/D34043 llvm-svn: 305007
* [CGP / PowerPC] avoid multi-block overhead for simple memcmp expansionSanjay Patel2017-06-081-21/+42
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The test diff for PowerPC shows we can better optimize if this case is one block. For x86, there's would be a substantial difference if CGP expansion was enabled because branches are assumed cheap and SDAG can't optimize across blocks. Instead of this: _cmp_eq8: movq (%rdi), %rax cmpq (%rsi), %rax je LBB23_1 ## BB#2: ## %res_block movl $1, %ecx jmp LBB23_3 LBB23_1: xorl %ecx, %ecx LBB23_3: ## %endblock xorl %eax, %eax testl %ecx, %ecx sete %al retq We get this: cmp_eq8: movq (%rdi), %rcx xorl %eax, %eax cmpq (%rsi), %rcx sete %al retq And that matches the optimal codegen that we get from the current expansion in SelectionDAGBuilder::visitMemCmpCall(). If this looks right, then I just need to confirm that vector-sized expansion will work from here, and we can enable CGP memcmp() expansion for x86. Ie, we'll bypass the power-of-2 special cases currently optimized in SDAG because we can lower the IR produced here optimally. Differential Revision: https://reviews.llvm.org/D34005 llvm-svn: 304987
* [CGP] avoid zext/trunc of a memcmp expansion compareSanjay Patel2017-06-071-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | This could be viewed as another shortcoming of the DAGCombiner: when both operands of a compare are zexted from the same source type, we should be able to compare the original types. The effect on PowerPC perf is likely unnoticeable, but there's a visible regression for x86 if we feed the suboptimal IR for memcmp expansion to the DAG: _cmp_eq4_zexted_to_i64: movl (%rdi), %ecx movl (%rsi), %edx xorl %eax, %eax cmpq %rdx, %rcx sete %al _cmp_eq4_better: movl (%rdi), %ecx xorl %eax, %eax cmpl (%rsi), %ecx sete %al llvm-svn: 304923
* [CGP] pass size as param in MemCmpExpansion; NFCISanjay Patel2017-06-071-10/+5
| | | | | | Avoid extracting the constant int twice. llvm-svn: 304920
* [CGP] pass size as param in MemCmpExpansion; NFCISanjay Patel2017-06-071-13/+8
| | | | | | Avoid extracting the constant int twice. llvm-svn: 304917
* [CGP] getParent()->getParent() --> getFunction(); NFCISanjay Patel2017-06-071-5/+4
| | | | llvm-svn: 304916
* [CGP] add helper function for generating compare of load pairs; NFCISanjay Patel2017-06-071-5/+16
| | | | | | | | In the special (but also the likely common) case, we can avoid the multi-block complexity of the general algorithm, so moving this part off on its own will make it re-usable. llvm-svn: 304908
* [CGP] fix formatting in MemCmpExpansion; NFCSanjay Patel2017-06-071-8/+6
| | | | llvm-svn: 304903
* [CGP / PowerPC] use direct compares if there's only one load per block in ↵Sanjay Patel2017-06-071-11/+19
| | | | | | | | | | | | | | | | | memcmp() expansion I'd like to enable CGP memcmp expansion for x86, but the output from CGP would regress the special cases (memcmp(x,y,N) != 0 for N=1,2,4,8,16,32 bytes) that we already handle. I'm not sure if we'll actually be able to produce the optimal code given the block-at-a-time limitation in the DAG. We might have to just avoid those special-cases here in CGP. But regardless of that, I think this is a win for the more general cases. http://rise4fun.com/Alive/cbQ Differential Revision: https://reviews.llvm.org/D33963 llvm-svn: 304849
* [CGP] fix formatting/typos in MemCmpExpansion; NFCSanjay Patel2017-06-061-36/+34
| | | | llvm-svn: 304830
* Sort the remaining #include lines in include/... and lib/....Chandler Carruth2017-06-061-2/+1
| | | | | | | | | | | | | | | | | | | | | | | | | I did this a long time ago with a janky python script, but now clang-format has built-in support for this. I fed clang-format every line with a #include and let it re-sort things according to the precise LLVM rules for include ordering baked into clang-format these days. I've reverted a number of files where the results of sorting includes isn't healthy. Either places where we have legacy code relying on particular include ordering (where possible, I'll fix these separately) or where we have particular formatting around #include lines that I didn't want to disturb in this patch. This patch is *entirely* mechanical. If you get merge conflicts or anything, just ignore the changes in this patch and run clang-format over your #include lines in the files. Sorry for any noise here, but it is important to keep these things stable. I was seeing an increasing number of patches with irrelevant re-ordering of #include lines because clang-format was used. This patch at least isolates that churn, makes it easy to skip when resolving conflicts, and gets us to a clean baseline (again). llvm-svn: 304787
* [PPC] Inline expansion of memcmpZaara Syeda2017-05-311-1/+611
| | | | | | | | | | | | | | | This patch does an inline expansion of memcmp. It changes the memcmp library call into an inline expansion when the size is known at compile time and is under a target specified threshold. This expansion is implemented in CodeGenPrepare and expands into straight line code. The target specifies a maximum load size and the expansion works by using this size to load the two sources, compare, and exit early if a difference is found. It also has a special case when the memcmp result is used in a compare to zero equality. Differential Revision: https://reviews.llvm.org/D28637 llvm-svn: 304313
* CodeGen: Rename DEBUG_TYPE to match passnamesMatthias Braun2017-05-251-2/+2
| | | | | | | | Rename the DEBUG_TYPE to match the names of corresponding passes where it makes sense. Also establish the pattern of simply referencing DEBUG_TYPE instead of repeating the passname where possible. llvm-svn: 303921
* [IR] De-virtualize ~Value to save a vptrReid Kleckner2017-05-181-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Implements PR889 Removing the virtual table pointer from Value saves 1% of RSS when doing LTO of llc on Linux. The impact on time was positive, but too noisy to conclusively say that performance improved. Here is a link to the spreadsheet with the original data: https://docs.google.com/spreadsheets/d/1F4FHir0qYnV0MEp2sYYp_BuvnJgWlWPhWOwZ6LbW7W4/edit?usp=sharing This change makes it invalid to directly delete a Value, User, or Instruction pointer. Instead, such code can be rewritten to a null check and a call Value::deleteValue(). Value objects tend to have their lifetimes managed through iplist, so for the most part, this isn't a big deal. However, there are some places where LLVM deletes values, and those places had to be migrated to deleteValue. I have also created llvm::unique_value, which has a custom deleter, so it can be used in place of std::unique_ptr<Value>. I had to add the "DerivedUser" Deleter escape hatch for MemorySSA, which derives from User outside of lib/IR. Code in IR cannot include MemorySSA headers or call the MemoryAccess object destructors without introducing a circular dependency, so we need some level of indirection. Unfortunately, no class derived from User may have any virtual methods, because adding a virtual method would break User::getHungOffOperands(), which assumes that it can find the use list immediately prior to the User object. I've added a static_assert to the appropriate OperandTraits templates to help people avoid this trap. Reviewers: chandlerc, mehdi_amini, pete, dberlin, george.burgess.iv Reviewed By: chandlerc Subscribers: krytarowski, eraman, george.burgess.iv, mzolotukhin, Prazek, nlewycky, hans, inglorion, pcc, tejohnson, dberlin, llvm-commits Differential Revision: https://reviews.llvm.org/D31261 llvm-svn: 303362
* [LegacyPassManager] Remove TargetMachine constructorsFrancis Visoiu Mistrih2017-05-181-12/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This provides a new way to access the TargetMachine through TargetPassConfig, as a dependency. The patterns replaced here are: * Passes handling a null TargetMachine call `getAnalysisIfAvailable<TargetPassConfig>`. * Passes not handling a null TargetMachine `addRequired<TargetPassConfig>` and call `getAnalysis<TargetPassConfig>`. * MachineFunctionPasses now use MF.getTarget(). * Remove all the TargetMachine constructors. * Remove INITIALIZE_TM_PASS. This fixes a crash when running `llc -start-before prologepilog`. PEI needs StackProtector, which gets constructed without a TargetMachine by the pass manager. The StackProtector pass doesn't handle the case where there is no TargetMachine, so it segfaults. Related to PR30324. Differential Revision: https://reviews.llvm.org/D33222 llvm-svn: 303360
* [X86] Relocate code of replacement of subtarget unsupported masked memory ↵Ayman Musa2017-05-151-546/+0
| | | | | | | | | | | | | | intrinsics to run also on -O0 option. Currently, when masked load, store, gather or scatter intrinsics are used, we check in CodeGenPrepare pass if the subtarget support these intrinsics, if not we replace them with scalar code - this is a functional transformation not an optimization (not optional). CodeGenPrepare pass does not run when the optimization level is set to CodeGenOpt::None (-O0). Functional transformation should run with all optimization levels, so here I created a new pass which runs on all optimization levels and does no more than this transformation. Differential Revision: https://reviews.llvm.org/D32487 llvm-svn: 303050
* Fix code section prefix for proper layoutTeresa Johnson2017-05-091-1/+1
| | | | | | | | | | | | | | | | | | Summary: r284533 added hot and cold section prefixes based on profile information, to enable grouping of hot/cold functions at link time. However, it used "cold" as the prefix for cold sections, but gold only recognizes "unlikely" (which is used by gcc for cold sections). Therefore, cold sections were not properly being grouped. Switch to using "unlikely" Reviewers: danielcdh, davidxl Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D32983 llvm-svn: 302502
* Rename WeakVH to WeakTrackingVH; NFCSanjoy Das2017-05-011-4/+5
| | | | | | This relands r301424. llvm-svn: 301812
* Kill off the old SimplifyInstruction API by converting remaining users.Daniel Berlin2017-04-281-1/+1
| | | | llvm-svn: 301673
* Reverts commit r301424, r301425 and r301426Sanjoy Das2017-04-261-5/+5
| | | | | | | | | | | | Commits were: "Use WeakVH instead of WeakTrackingVH in AliasSetTracker's UnkownInsts" "Add a new WeakVH value handle; NFC" "Rename WeakVH to WeakTrackingVH; NFC" The changes assumed pointers are 8 byte aligned on all architectures. llvm-svn: 301429
* Rename WeakVH to WeakTrackingVH; NFCSanjoy Das2017-04-261-5/+5
| | | | | | | | | | | | | | | | Summary: I plan to use WeakVH to mean "nulls itself out on deletion, but does not track RAUW" in a subsequent commit. Reviewers: dblaikie, davide Reviewed By: davide Subscribers: arsenm, mehdi_amini, mcrosier, mzolotukhin, jfb, llvm-commits, nhaehnle Differential Revision: https://reviews.llvm.org/D32266 llvm-svn: 301424
* [APInt] Use lshrInPlace to replace lshr where possibleCraig Topper2017-04-181-4/+2
| | | | | | | | | | This patch uses lshrInPlace to replace code where the object that lshr is called on is being overwritten with the result. This adds an lshrInPlace(const APInt &) version as well. Differential Revision: https://reviews.llvm.org/D32155 llvm-svn: 300566
* [CodeGenPrepare] Fix crash due to an invalid CFGBrendon Cahoon2017-04-171-2/+8
| | | | | | | | | | | | | | | The splitIndirectCriticalEdges function generates and invalid CFG when the 'Target' basic block is a loop to itself. When this occurs, the code that updates the predecessor terminator needs to update the terminator in the split basic block. This occurs when there is an edge from block D back to D. Since D is split in to D0 and D1, the code needs to update the terminator in D1. But D1 is not in the OtherPreds vector, so it was not getting updated. Differential Revision: https://reviews.llvm.org/D32126 llvm-svn: 300480
* [IR] Redesign the case iterator in SwitchInst to actually be an iteratorChandler Carruth2017-04-121-1/+1
| | | | | | | | | | | | | | | | and to expose a handle to represent the actual case rather than having the iterator return a reference to itself. All of this allows the iterator to be used with common STL facilities, standard algorithms, etc. Doing this exposed some missing facilities in the iterator facade that I've fixed and required some work to the actual iterator to fully support the necessary API. Differential Revision: https://reviews.llvm.org/D31548 llvm-svn: 300032
OpenPOWER on IntegriCloud