summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Transforms/Scalar
Commit message (Collapse)AuthorAgeFilesLines
...
* [NaryReassociate] avoid running foreverJingyue Wu2015-05-131-4/+8
| | | | | | | | | Avoid running forever by checking we are not reassociating an expression into the same form. Tested with @avoid_infinite_loops in nary-add.ll llvm-svn: 237269
* Add function entry counts from sample profiles.Diego Novillo2015-05-131-0/+4
| | | | | | | | | | This patch uses the new function profile metadata "function_entry_count" to annotate entry counts from sample profiles. In a sampling profile, the total samples collected at the function entry are an approximation for the number of times that function was invoked. llvm-svn: 237265
* Constify arguments to methods in LICM. NFCPete Cooper2015-05-131-25/+33
| | | | llvm-svn: 237227
* Change LoadAndStorePromoter to take ArrayRef instead of SmallVectorImpl&.Pete Cooper2015-05-133-5/+8
| | | | | | | | | | The array passed to LoadAndStorePromoter's constructor was a constant reference to a SmallVectorImpl, which is just the same as passing an ArrayRef. Also, the data in the array can be 'const Instruction*' instead of 'Instruction*'. Its not possible to convert a SmallVectorImpl<T*> to SmallVectorImpl<const T*>, but ArrayRef does provide such a method. Currently this added calls to makeArrayRef which should be a nop, but i'm going to kick off a discussion about improving ArrayRef to not need these. llvm-svn: 237226
* [PlaceSafepoints] Reduce dominator tree recalculationPhilip Reames2015-05-131-42/+24
| | | | | | | | | | Reduce recalculation of the dominator tree by identifying all sites that will need a safepoint poll before doing any of the insertion. This allows us to invalidate the dominator info once, rather than once per safepoint poll inserted. While I'm at it, update findLocationForEntrySafepoint to properly update the dom tree now that the interface has been made easy. When first written, it wasn't per comment in the code. Differential Revision: http://reviews.llvm.org/D9727 llvm-svn: 237220
* [SLSR] handles non-canonicalized Mul candidatesJingyue Wu2015-05-131-2/+2
| | | | | | | | such as (2 + B) * S. Tested by @non_canonicalized in slsr-mul.ll llvm-svn: 237216
* [Statepoints] Support for "patchable" statepoints.Sanjoy Das2015-05-121-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This change adds two new parameters to the statepoint intrinsic, `i64 id` and `i32 num_patch_bytes`. `id` gets propagated to the ID field in the generated StackMap section. If the `num_patch_bytes` is non-zero then the statepoint is lowered to `num_patch_bytes` bytes of nops instead of a call (the spill and reload code remains unchanged). A non-zero `num_patch_bytes` is useful in situations where a language runtime requires complete control over how a call is lowered. This change brings statepoints one step closer to patchpoints. With some additional work (that is not part of this patch) it should be possible to get rid of `TargetOpcode::STATEPOINT` altogether. PlaceSafepoints generates `statepoint` wrappers with `id` set to `0xABCDEF00` (the old default value for the ID reported in the stackmap) and `num_patch_bytes` set to `0`. This can be made more sophisticated later. Reviewers: reames, pgavlin, swaroop.sridhar, AndyAyers Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D9546 llvm-svn: 237214
* [PlaceSafepoints] Followup to commit L237172Philip Reames2015-05-121-10/+5
| | | | | | | | | Responding to review feedback from http://reviews.llvm.org/D9585 1) Remove a variable shadow by converting the outer loop to a range for loop. We never really used the 'i' variable which was being shadowed. 2) Reduce DominatorTree recalculations by passing the DT to SplitEdge. llvm-svn: 237212
* [Unrolling] Refactor the start and step offsets to simplify overflowChandler Carruth2015-05-121-10/+26
| | | | | | | | | | | | | | | | | | checking and make the cache faster and smaller. I had thought that using an APInt here would be useful, but I think I was just wrong. Notably, we don't have to do any fancy overflow checking, we can just bound the values as quite small and do the math in a higher precision integer. I've switched to a signed integer so that UBSan will even point out if we ever have integer overflow. I've added various asserts to try to catch things as well and hoisted the overflow checks so that we just leave the too-large offsets out of the SCEV-GEP cache. This makes the value in the cache quite a bit smaller which is probably worthwhile. No functionality changed here (for trip counts under 1 billion). llvm-svn: 237209
* CVP: Improve handling of Selects used as incoming PHI valuesBjorn Steinbrink2015-05-121-11/+30
| | | | | | | | | | | | | | Summary: If the branch that leads to the PHI node and the Select instruction depend on correlated conditions, we might be able to directly use the corresponding value from the Select instruction as the incoming value for the PHI node, allowing later removal of the select instruction. Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D9051 llvm-svn: 237201
* [RewriteStatepointsForGC] Extend base pointer to handle more cases w/vectorsPhilip Reames2015-05-121-9/+36
| | | | | | | | | | When relocating a pointer, we need to determine a base pointer for the derived pointer being relocated. We have limited support for handling a pointer extracted from a vector; the current code only handled the case where the entire vector was known to contain base pointers. This patch extends the reasoning to handle chains of insertelements where the indices are constants. This case turns out to be fairly common in vectorized code. We can now handle vectors which contains mixtures of base and derived pointers provided the insertelements use constant indices. Note that this doesn't solve the general problem. To handle variable indexed insertelements, we'd need to scalarize and introduce conditional branching based on the index. Alternatively, we could eagerly scalarize, but the code structure doesn't currently make either fix easy. The patch also doesn't handle shufflevector or other vector manipulation for much the same reasons. I plan to defer this work until I have a motivating test case. Differential Revision: http://reviews.llvm.org/D9676 llvm-svn: 237200
* [PlaceSafepoints] Add missing "override" to ↵Justin Bogner2015-05-121-2/+2
| | | | | | | | PlaceBackedgeSafepointsImpl::runOnFunction Pointed out by -Winconsistent-missing-override. llvm-svn: 237196
* [PlaceSafepoints] Switch to being a FunctionPassPhilip Reames2015-05-121-11/+6
| | | | | | | | The pass doesn't actually modify the module outside of the function being processed. The only confusing piece is that it both inserts calls and then inlines the resulting calls. Given that, it definitely invalidates module level analysis results, but many FunctionPasses do that. Differential Revision: http://reviews.llvm.org/D9590 llvm-svn: 237185
* [PlaceSafepoints] Make internal helper pass a FunctionPassPhilip Reames2015-05-121-11/+30
| | | | | | | | | | Switch from using a LoopPass to using a FunctionPass for the internal helper analysis pass. The next step is going to be to make this a true analysis pass which is required by the PlaceSafepoints pass itself. p.s. The interesting semantic part here is that we're changing the iteration order over the loops. It shouldn't matter, but that's the reason to separate this into it's own distinct patch. Differential Revision: http://reviews.llvm.org/D9588 llvm-svn: 237180
* [PlaceSafepoints] Use analysis infrastructure to get dominator treePhilip Reames2015-05-121-6/+4
| | | | | | | | The old code computed dominators for every loop. This was terribly slow with no good reason. Just use the standard infrastructure for analysis passes. Differential Revision: http://reviews.llvm.org/D9586 llvm-svn: 237176
* [PlaceSafepoints] Remove dependence on LoopSimplifyPhilip Reames2015-05-121-29/+40
| | | | | | | | As a step towards getting rid of internal pass manager hack entirely, remove the need for loop simplify to run in the inner pass manager. The new code does produce slightly different loop structures, so this isn't technically NFC. Differential Revision: http://reviews.llvm.org/D9585 llvm-svn: 237172
* Convert PHI getIncomingValue() to foreach over incoming_values(). NFC.Pete Cooper2015-05-121-2/+2
| | | | | | | | We already had a method to iterate over all the incoming values of a PHI. This just changes all eligible code to use it. Ineligible code included anything which cared about the index, or was also trying to get the i'th incoming BB. llvm-svn: 237169
* Constify method. NFCPete Cooper2015-05-121-1/+1
| | | | llvm-svn: 237167
* Reimplement heuristic for estimating complete-unroll optimization effects.Michael Zolotukhin2015-05-121-248/+300
| | | | | | | | | | | | | | | | | | | | Summary: This patch reimplements heuristic that tries to estimate optimization beneftis from complete loop unrolling. In this patch I kept the minimal changes - e.g. I removed code handling branches and folding compares. That's a promising area, but now there are too many questions to discuss before we can enable it. Test Plan: Tests are included in the patch. Reviewers: hfinkel, chandlerc Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D8816 llvm-svn: 237156
* Rename variables in gc_relocate related functions to follow LLVM's naming ↵Sanjoy Das2015-05-111-39/+39
| | | | | | | | | | | | | | | | | | | | conventions. Summary: This patch is to rename some variables to CamelCase in gc_relocate related functions. There is no functionality change. Patch by Chen Li! Reviewers: reames, AndyAyers, sanjoy Reviewed By: sanjoy Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D9681 llvm-svn: 237069
* [MemCpyOpt] Look at any dependency -not just source- for memset+memcpy.Ahmed Bougacha2015-05-111-6/+7
| | | | | | | | | | This fixes another miscompile introduced by r235232: when there was a dependency on the memcpy destination other than the memset, we would ignore it, because we only looked at the source dependency. It was a mistake to use SrcDepInfo. Instead, just use DepInfo. llvm-svn: 237066
* [LoopIdiomRecognize] Transform backedge-taken count check into an assertion.Davide Italiano2015-05-111-1/+3
| | | | | | | | | runOnCountable() allowed the caller to call on a loop without a predictable backedge-taken count. Change the code so that only loops with computable backdge-count can call this function, in order to catch abuses. llvm-svn: 237044
* [RewriteStatepointsForGC] Fix a bug on creating gc_relocate for pointer to ↵Sanjoy Das2015-05-111-3/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | vector of pointers Summary: In RewriteStatepointsForGC pass, we create a gc_relocate intrinsic for each relocated pointer, and the gc_relocate has the same type with the pointer. During the creation of gc_relocate intrinsic, llvm requires to mangle its type. However, llvm does not support mangling of all possible types. RewriteStatepointsForGC will hit an assertion failure when it tries to create a gc_relocate for pointer to vector of pointers because mangling for vector of pointers is not supported. This patch changes the way RewriteStatepointsForGC pass creates gc_relocate. For each relocated pointer, we erase the type of pointers and create an unified gc_relocate of type i8 addrspace(1)*. Then a bitcast is inserted to convert the gc_relocate to the correct type. In this way, gc_relocate does not need to deal with different types of pointers and the unsupported type mangling is no longer a problem. This change would also ease further merge when LLVM erases types of pointers and introduces an unified pointer type. Some minor changes are also introduced to gc_relocate related part in InstCombineCalls, CodeGenPrepare, and Verifier accordingly. Patch by Chen Li! Reviewers: reames, AndyAyers, sanjoy Reviewed By: sanjoy Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D9592 llvm-svn: 237009
* This change is refactoring only. It moves basic block normalization for ↵Igor Laevsky2015-05-081-39/+36
| | | | | | invokes to happen before replacement of a call with safepoint in "ReplaceWithStatepoint". Previously it was partly done before replacement of calls with safepoint and partly after call replacement but before RAUW's for gc_relocates, which was confusing. llvm-svn: 236829
* Scalar/PlaceSafepoints.cpp: Fix a warning introduced in r228090. ↵NAKAMURA Takumi2015-05-071-4/+2
| | | | | | [-Wunused-variable] llvm-svn: 236711
* [JumpThreading] Simplify comparisons when simplifying branchesPhilip Reames2015-05-071-0/+11
| | | | | | | | | | If we have recognized that a conditional is constant at a particular location in the code (while trying to decide if we can simplify a conditional branch), we can eagerly replace that condition with a constant if it's definition is post dominated by the branch in question. In practice, this ends up being a compile time savings at most. JumpThreading would have visited each using branch anyways. CVP would have visited the cmp itself again. Unless LVI gives up early, we shouldn't gain any addition power by doing this transformation early. What we do gain is simplicity and compile time. Differential Revision: http://reviews.llvm.org/D9312 llvm-svn: 236684
* [Statepoints] Clean up PlaceSafepoints.cpp: de-duplicate code.Sanjoy Das2015-05-061-21/+19
| | | | | | Common duplicated code and remove unnecessary code. llvm-svn: 236674
* [Statepoints] Clean up PlaceSafepoints.cpp: variable naming.Sanjoy Das2015-05-061-31/+29
| | | | | | Use CamelCase. NFC. llvm-svn: 236673
* [IRBuilder] Add a CreateGCStatepointInvoke.Sanjoy Das2015-05-061-32/+6
| | | | | | | | | | | Renames the original CreateGCStatepoint to CreateGCStatepointCall, and moves invoke creating functionality from PlaceSafepoints.cpp to IRBuilder.cpp. This changes the labels generated for PlaceSafepoints/invokes.ll so use a regex there to make the basic block labels more resilient. llvm-svn: 236672
* [DomTree] verifyDomTree to unconditionally perform DT verificationAdam Nemet2015-05-061-3/+2
| | | | | | | | | | | | I folded the check for the flag -verify-dom-info into the only caller where I think it is supposed to be checked: verifyAnalysis. (The idea of the flag is to enable this expensive verification in verifyPreservedAnalysis.) I'm assuming that when manually scheduling the verification pass with -passes=verify<domtree>, we do want to perform the verification. llvm-svn: 236575
* [Statepoint] Clean up Statepoint.h: accessor names.Sanjoy Das2015-05-061-1/+2
| | | | | | Use getFoo() as accessors consistently and some other naming changes. llvm-svn: 236564
* IR: Give 'DI' prefix to debug info metadataDuncan P. N. Exon Smith2015-04-293-6/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Finish off PR23080 by renaming the debug info IR constructs from `MD*` to `DI*`. The last of the `DIDescriptor` classes were deleted in r235356, and the last of the related typedefs removed in r235413, so this has all baked for about a week. Note: If you have out-of-tree code (like a frontend), I recommend that you get everything compiling and tests passing with the *previous* commit before updating to this one. It'll be easier to keep track of what code is using the `DIDescriptor` hierarchy and what you've already updated, and I think you're extremely unlikely to insert bugs. YMMV of course. Back to *this* commit: I did this using the rename-md-di-nodes.sh upgrade script I've attached to PR23080 (both code and testcases) and filtered through clang-format-diff.py. I edited the tests for test/Assembler/invalid-generic-debug-node-*.ll by hand since the columns were off-by-three. It should work on your out-of-tree testcases (and code, if you've followed the advice in the previous paragraph). Some of the tests are in badly named files now (e.g., test/Assembler/invalid-mdcompositetype-missing-tag.ll should be 'dicompositetype'); I'll come back and move the files in a follow-up commit. llvm-svn: 236120
* [RewriteStatepointsForGC] Exclude constant values from being considered live ↵Philip Reames2015-04-261-14/+13
| | | | | | | | | | | | at a safepoint There can be various constant pointers in the IR which do not get relocated at a safepoint. One example is the address of a global variable. Another example is a pointer created via inttoptr. Note that the optimizer itself likes to create such inttoptrs when locally propagating constants through dynamically dead code. To deal with this, we need to exclude uses of constants from contributing to the liveness of a safepoint which might reach that use. At some later date, it might be worth exploring what could be done to support the relocation of various special types of "constants", but that's future work. Differential Revision: http://reviews.llvm.org/D9236 llvm-svn: 235821
* Don't Place Entry Safepoints Before the llvm.frameescape() IntrinsicPhilip Reames2015-04-261-0/+7
| | | | | | | | | llvm.frameescape() intrinsic is not a real call. The intrinsic can only exist in the entry block. Inserting a gc.statepoint() before llvm.frameescape() may split the entry block, and push the intrinsic out of the entry block. Patch by: Swaroop.Sridhar@microsoft.com Differential Revision: http://reviews.llvm.org/D8910 llvm-svn: 235820
* Fix LoopInterchange/reductions.ll test for debug buildsAndrew Kaylor2015-04-241-2/+2
| | | | llvm-svn: 235734
* Resurrect r235688Jingyue Wu2015-04-241-1/+2
| | | | | | | | We should skip vector types which are not SCEVable. test/CodeGen/NVPTX/sched2.ll passes llvm-svn: 235695
* Fix comment for NoCommonBits.Michael Zolotukhin2015-04-231-1/+2
| | | | | | | Maybe there is a better wording, but at least it should be technically correct now. llvm-svn: 235660
* Move Value.isDereferenceablePointer to ValueTracking [NFC]Philip Reames2015-04-232-6/+6
| | | | | | | | | | | Move isDereferenceablePointer function to Analysis. This function recursively tracks dereferencability over a chain of values like other functions in ValueTracking. This refactoring is motivated by further changes to support dereferenceable_or_null attribute (http://reviews.llvm.org/D8650). isDereferenceablePointer will be extended to perform context-sensitive analysis and IR is not a good place to have such functionality. Patch by: Artur Pilipenko <apilipenko@azulsystems.com> Differential Revision: reviews.llvm.org/D9075 llvm-svn: 235611
* Add support to interchange loops with reductions.Karthik Bhat2015-04-231-78/+224
| | | | | | | This patch enables interchanging of tightly nested loops with reductions. Differential Revision: http://reviews.llvm.org/D8314 llvm-svn: 235571
* don't repeat function names in comments; NFCSanjay Patel2015-04-221-38/+31
| | | | llvm-svn: 235531
* [MemCpyOpt] Use the raw i8* dest when optimizing memset+memcpy.Ahmed Bougacha2015-04-211-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | MemIntrinsic::getDest() looks through pointer casts, and using it directly when building the new GEP+memset results in stuff like: %0 = getelementptr i64* %p, i32 16 %1 = bitcast i64* %0 to i8* call ..memset(i8* %1, ...) instead of the correct: %0 = bitcast i64* %p to i8* %1 = getelementptr i8* %0, i32 16 call ..memset(i8* %1, ...) Instead, use getRawDest, which just gives you the i8* value. While there, use the memcpy's dest, as it's live anyway. In most cases, when the optimization triggers, the memset and memcpy sizes are the same, so the built memset is 0-sized and eliminated. The problem occurs when they're different. Fixes a regression caused by r235232: PR23300. llvm-svn: 235419
* Revamp PredIteratorCache interface to be cleaner.Daniel Berlin2015-04-211-3/+3
| | | | | | | | | | | | | Summary: This lets us use range based for loops. Reviewers: chandlerc Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D9169 llvm-svn: 235416
* [LSR][NFC] Remove a stale comment.Sanjoy Das2015-04-211-3/+0
| | | | | | The comment was made stale in r171735. llvm-svn: 235414
* [SLSR] garbage-collect unused instructionsJingyue Wu2015-04-211-3/+13
| | | | | | | | | | | | | | | | | Summary: After we rewrite a candidate, the instructions used by the old form may become unused. This patch cleans up these unused instructions so that we needn't run DCE after SLSR. Test Plan: removed -dce in all the SLSR tests Reviewers: broune, meheff Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D9101 llvm-svn: 235410
* [SeparateConstOffsetFromGEP] garbage-collect intermediate instructionsJingyue Wu2015-04-211-26/+65
| | | | | | | | | | | | | | Summary: so that we needn't run DCE after this pass. Test Plan: removed -dce from the commandline in split-gep.ll and split-gep-and-gvn.ll Reviewers: meheff Subscribers: llvm-commits, HaoLiu, hfinkel, jholewinski Differential Revision: http://reviews.llvm.org/D9096 llvm-svn: 235409
* DebugInfo: Drop rest of DIDescriptor subclassesDuncan P. N. Exon Smith2015-04-213-10/+10
| | | | | | | Delete the remaining subclasses of (the already deleted) `DIDescriptor`. Part of PR23080. llvm-svn: 235404
* [MemCpyOpt] Don't force i64 when promoting memset/memcpy sizes.Ahmed Bougacha2015-04-181-3/+6
| | | | | | | | | | Harden r235258 to support any integer bitwidth. The quick glance at the reference made me think only i32 and i64 were valid types, but they're not special, so any overload is legal. Thanks to David Majnemer for noticing! llvm-svn: 235261
* [MemCpyOpt] Promote both memset/memcpy sizes if differently typed.Ahmed Bougacha2015-04-181-0/+6
| | | | | | | | | | | | | Followup to r235232, which caused PR23278. We can't assume the memset and memcpy sizes have the same type, as nothing in the language reference prevents that. Instead, zext both to i64 if they disagree. While there, robustify tests by using i8 %c rather than i8 0 for the memset character. llvm-svn: 235258
* [MemCpyOpt] Optimize double-storing by memset+memcpy.Ahmed Bougacha2015-04-171-3/+59
| | | | | | | | | | | | | | | | | | A common idiom in some code is to do the following: memset(dst, 0, dst_size); memcpy(dst, src, src_size); Some of the memset is redundant; instead, we can do: memcpy(dst, src, src_size); memset(dst + src_size, 0, dst_size <= src_size ? 0 : dst_size - src_size); Original patch by: Joel Jones Differential Revision: http://reviews.llvm.org/D498 llvm-svn: 235232
* [NaryReassociate] run NaryReassociate iterativelyJingyue Wu2015-04-171-7/+47
| | | | | | | | | | | | | | | | | | | | | | | Summary: An alternative is to use a worklist approach. However, that approach would break the traversing order so that we couldn't lookup SeenExprs efficiently. I don't see a clear winner here, so I picked the easier approach. Along with two minor improvements: 1. preserves ScalarEvolution by forgetting instructions replaced 2. removes dead code locally avoiding the need of running DCE afterwards Test Plan: add to slsr-add.ll a test that requires multiple iterations Reviewers: broune, dberlin, atrick, meheff Reviewed By: atrick Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D9058 llvm-svn: 235151
OpenPOWER on IntegriCloud