summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Transforms
Commit message (Collapse)AuthorAgeFilesLines
* [LoopVer] Optionally allow using memchecks from LAAAdam Nemet2015-08-121-0/+9
| | | | | | | | | | | r243382 changed the behavior to always require a set of memchecks to be passed to LoopVer. This change restores the prior behavior as an alternative to the new behavior. This allows the checks to be implicitly taken from the LAA object. Patch by Ashutosh Nema! llvm-svn: 244763
* unused variable warning fix.Simon Pilgrim2015-08-121-1/+1
| | | | llvm-svn: 244725
* [InstCombine] Move SSE/AVX vector blend folding to instcombinerSimon Pilgrim2015-08-121-4/+15
| | | | | | | | | | | | As discussed in D11886, this patch moves the SSE/AVX vector blend folding to instcombiner from PerformINTRINSIC_WO_CHAINCombine (which allows us to remove this completely). InstCombiner already had partial support for this, I just had to add support for zero (ConstantAggregateZero) masks and also the case where both selection inputs were the same (allowing us to ignore the mask). I also moved all the relevant combine tests into InstCombine/blend_x86.ll Differential Revision: http://reviews.llvm.org/D11934 llvm-svn: 244723
* Fix PR24354.Sanjoy Das2015-08-111-3/+2
| | | | | | | | | | `InstCombiner::OptimizeOverflowCheck` was asserting an invariant (operands to binary operations are ordered by decreasing complexity) that wasn't really an invariant. Fix this by instead having `InstCombiner::OptimizeOverflowCheck` establish the invariant if it does not hold. llvm-svn: 244676
* don't repeat function names in comments; NFCSanjay Patel2015-08-111-39/+34
| | | | llvm-svn: 244672
* fix 80-cols; NFCSanjay Patel2015-08-111-19/+22
| | | | llvm-svn: 244668
* [LowerSwitch] Skip dead blocks for processSwitchInst()Chen Li2015-08-111-4/+10
| | | | | | | | | | | | Summary: This patch adds check for dead blocks and skip them for processSwitchInst(). This will help reduce compilation time. Reviewers: reames, hans Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D11953 llvm-svn: 244656
* [LowerSwitch] Fix a bug when LowerSwitch deletes the default blockChen Li2015-08-111-5/+10
| | | | | | | | | | | | Summary: LowerSwitch crashed with the attached test case after deleting the default block. This happened because the current implementation of deleting dead blocks is wrong. After the default block being deleted, it contains no instruction or terminator, and it should no be traversed anymore. However, since the iterator is advanced before processSwitchInst() function is executed, the block advanced to could be deleted inside processSwitchInst(). The deleted block would then be visited next and crash dyn_cast<SwitchInst>(Cur->getTerminator()) because Cur->getTerminator() returns a nullptr. This patch fixes this problem by recording dead default blocks into a list, and delete them after all processSwitchInst() has been done. It still possible to visit dead default blocks and waste time process them. But it is a compile time issue, and I plan to have another patch to add support to skip dead blocks. Reviewers: kariddi, resistor, hans, reames Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D11852 llvm-svn: 244642
* Enable EliminateAvailableExternally pass in the LTO pipeline.Teresa Johnson2015-08-111-0/+3
| | | | | | | | | | | | | | | Summary: For LTO we need to enable this pass in the LTO pipeline, as it is skipped during the "-flto -c" compile step (when PrepareForLTO is set). Reviewers: rnk Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D11919 llvm-svn: 244622
* Variable names should start with an upper case letter; NFCSanjay Patel2015-08-111-9/+9
| | | | llvm-svn: 244618
* fix minsize detection: minsize attribute implies optimizing for sizeSanjay Patel2015-08-111-7/+7
| | | | llvm-svn: 244617
* fix code that was accidentally commented out in previous commitSanjay Patel2015-08-111-2/+2
| | | | llvm-svn: 244610
* fix typos in comments; NFCSanjay Patel2015-08-111-5/+5
| | | | llvm-svn: 244609
* fix typo in comment; NFCSanjay Patel2015-08-111-1/+1
| | | | llvm-svn: 244607
* Add support for floating-point minnum and maxnumJames Molloy2015-08-113-11/+38
| | | | | | | | | | | | | | | | | The select pattern recognition in ValueTracking (as used by InstCombine and SelectionDAGBuilder) only knew about integer patterns. This teaches it about minimum and maximum operations. matchSelectPattern() has been extended to return a struct containing the existing Flavor and a new enum defining the pattern's behavior when given one NaN operand. C minnum() is defined to return the non-NaN operand in this case, but the idiomatic C "a < b ? a : b" would return the NaN operand. ARM and AArch64 at least have different instructions for these different cases. llvm-svn: 244580
* [WinEHPrepare] Add rudimentary support for the new EH instructionsDavid Majnemer2015-08-111-1/+1
| | | | | | | | | | | | | | | | | | This adds somewhat basic preparation functionality including: - Formation of funclets via coloring basic blocks. - Cloning of polychromatic blocks to ensure that funclets have unique program counters. - Demotion of values used between different funclets. - Some amount of cleanup once we have removed predecessors from basic blocks. - Verification that we are left with a CFG that makes some amount of sense. N.B. Arguments and numbering still need to be done. Differential Revision: http://reviews.llvm.org/D11750 llvm-svn: 244558
* Print vectorization analysis when loop hint is specified.Tyler Nowicki2015-08-111-16/+34
| | | | | | This patch and a relatec clang patch solve the problem of having to explicitly enable analysis when specifying a loop hint pragma to get the diagnostics. Passing AlwasyPrint as the pass name (see below) causes the front-end to print the diagnostic if the user has specified '-Rpass-analysis' without an '=<target-pass>’. Users of loop hints can pass that compiler option without having to specify the pass and they will get diagnostics for only those loops with loop hints. llvm-svn: 244555
* Moved LoopVectorizeHints and related functions before ↵Tyler Nowicki2015-08-111-270/+270
| | | | | | LoopVectorizationLegality and LoopVectorizationCostModel. llvm-svn: 244552
* Simplify processLoop() by moving loop hint verification into ↵Tyler Nowicki2015-08-111-26/+35
| | | | | | Hints::allowVectorization(). llvm-svn: 244550
* [libFuzzer] don't crash if the condition in a switch has unusual type (e.g. i72)Kostya Serebryany2015-08-111-0/+3
| | | | llvm-svn: 244544
* [LAA] Change name from addRuntimeCheck to addRuntimeChecks, NFCAdam Nemet2015-08-112-2/+2
| | | | | | This was requested by Hal in D11205. llvm-svn: 244540
* [LoopVer] Remove unused pointer partition argument, NFC.Adam Nemet2015-08-101-2/+1
| | | | llvm-svn: 244527
* Extend late diagnostics to include late test for runtime pointer checks.Tyler Nowicki2015-08-101-14/+29
| | | | | | This patch moves checking the threshold of runtime pointer checks to the vectorization requirements (late diagnostics) and emits a diagnostic that infroms the user the loop would be vectorized if not for exceeding the pointer-check threshold. Clang will also append the options that can be used to allow vectorization. llvm-svn: 244523
* [InstCombine] Move SSE2/AVX2 arithmetic vector shift folding to instcombinerSimon Pilgrim2015-08-101-7/+31
| | | | | | | | As discussed in D11760, this patch moves the (V)PSRA(WD) arithmetic shift-by-constant folding to InstCombine to match the logical shift implementations. Differential Revision: http://reviews.llvm.org/D11886 llvm-svn: 244495
* Late evaluation of the fast-math vectorization requirement.Tyler Nowicki2015-08-102-7/+70
| | | | | | This patch moves the verification of fast-math to just before vectorization is done. This way we can tell clang to append the command line options would that allow floating-point commutativity. Specifically those are enableing fast-math or specifying a loop hint. llvm-svn: 244489
* Modify diagnostic messages to clearly indicate the why interleaving wasn't done.Tyler Nowicki2015-08-101-22/+69
| | | | | | Sometimes interleaving is not beneficial, as determined by the cost-model and sometimes it is disabled by a loop hint (by the user). This patch modifies the diagnostic messages to make it clear why interleaving wasn't done. llvm-svn: 244485
* [IndVarSimplify] Make cost estimation in RewriteLoopExitValues smarterIgor Laevsky2015-08-101-43/+8
| | | | | | Differential Revision: http://reviews.llvm.org/D11687 llvm-svn: 244474
* Add new llvm.loop.unroll.enable metadata.Mark Heffernan2015-08-101-20/+40
| | | | | | | | | | | | | This change adds the unroll metadata "llvm.loop.unroll.enable" which directs the optimizer to unroll a loop fully if the trip count is known at compile time, and unroll partially if the trip count is not known at compile time. This differs from "llvm.loop.unroll.full" which explicitly does not unroll a loop if the trip count is not known at compile time. The "llvm.loop.unroll.enable" is intended to be added for loops annotated with "#pragma unroll". llvm-svn: 244466
* [TTI] Add a hook for specifying per-target defaults for Interleaved AccessesSilviu Baranga2015-08-101-2/+8
| | | | | | | | | | | | | | | Summary: This adds a hook to TTI which enables us to selectively turn on by default interleaved access vectorization for targets on which we have have performed the required benchmarking. Reviewers: rengolin Subscribers: rengolin, llvm-commits Differential Revision: http://reviews.llvm.org/D11901 llvm-svn: 244449
* Prevent the scalarizer from caching incorrect entriesFraser Cormack2015-08-101-2/+8
| | | | | | | | | | | | | | The scalarizer can cache incorrect entries when walking up a chain of insertelement instructions. This occurs when it encounters more than one instruction that it is not actively searching for, as it unconditionally caches every element it finds. The fix is to only cache the first element that it isn't searching for so we don't overwrite correct entries. Reviewers: hfinkel Differential Revision: http://reviews.llvm.org/D11559 llvm-svn: 244448
* Fix some comment typos.Benjamin Kramer2015-08-0814-72/+73
| | | | llvm-svn: 244402
* [InstCombine] Don't try to sink EH pad instructionsDavid Majnemer2015-08-081-2/+2
| | | | | | | Found by inspection, this change should not effect the existing landingpad behavior. llvm-svn: 244391
* Remove unnecessary includesMatt Arsenault2015-08-081-1/+0
| | | | llvm-svn: 244382
* [LAA] Make the set of runtime checks part of the state of LAA, NFCAdam Nemet2015-08-071-1/+1
| | | | | | | | | | | | | | | | | | | | This is the full set of checks that clients can further filter. IOW, it's client-agnostic. This makes LAA complete in the sense that it now provides the two main results of its analysis precomputed: 1. memory dependences via getDepChecker().getInsterestingDependences() 2. run-time checks via getRuntimePointerCheck().getChecks() However, as a consequence we now compute this information pro-actively. Thus if the client decides to skip the loop based on the dependences we've computed the checks unnecessarily. In order to see whether this was a significant overhead I checked compile time on SPEC2k6 LTO bitcode files. The change was in the noise. The checks are generated in canCheckPtrAtRT, at the same place where we used to call groupChecks to merge checks. llvm-svn: 244368
* [ConstantFoldTerminator] Preserve make.implicit metadata when converting ↵Chen Li2015-08-071-0/+5
| | | | | | | | | | | | | | SwitchInst to BranchInst Summary: llvm::ConstantFoldTerminator function can convert SwitchInst with single case (and default) to a conditional BranchInst. This patch adds support to preserve make.implicit metadata on this conversion. Reviewers: sanjoy, weimingz, chenli Subscribers: mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D11841 llvm-svn: 244348
* [InstCombine] Fix SSE2/AVX2 vector logical shift by constantSimon Pilgrim2015-08-071-16/+39
| | | | | | | | | | | | | | | | This patch fixes the sse2/avx2 vector shift by constant instcombine call to correctly deal with the fact that the shift amount is formed from the entire lower 64-bit and not just the lowest element as it currently assumes. e.g. %1 = tail call <4 x i32> @llvm.x86.sse2.psrl.d(<4 x i32> %v, <4 x i32> <i32 15, i32 15, i32 15, i32 15>) In this case, (V)PSRLD doesn't perform a lshr by 15 but in fact attempts to shift by 64424509455 ((15 << 32) | 15) - giving a zero result. In addition, this review also recognizes shift-by-zero from a ConstantAggregateZero type (PR23821). Differential Revision: http://reviews.llvm.org/D11760 llvm-svn: 244341
* ValueMapper: Resolve uniquing cycles more aggressivelyDuncan P. N. Exon Smith2015-08-071-9/+14
| | | | | | | | | | | | As a follow-up to r244181, resolve uniquing cycles underneath distinct nodes on the fly. This prevents uniquing cycles in early operands from affecting later operands. It also removes an iteration through distinct nodes' operands. No real functional change here, just more prompt resolution of temporary nodes. llvm-svn: 244302
* ValueMapper: Pull out helper to resolve cycles, NFCDuncan P. N. Exon Smith2015-08-071-8/+10
| | | | | | | Pull out a helper for resolving uniquing cycles of `Metadata` to remove the boiler-plate of downcasting to `MDNode`. llvm-svn: 244301
* Revert accidentally committed WinEHPrepare changesDavid Majnemer2015-08-061-1/+1
| | | | | | This reverts commit r244272, r244273, r244274, and r244275. llvm-svn: 244278
* Handle PHI nodes prefacing EH pads tooDavid Majnemer2015-08-061-1/+1
| | | | llvm-svn: 244274
* [IndVars] Improved logging under DEBUG(); NFC.Sanjoy Das2015-08-061-6/+3
| | | | | | | Before this, we'd print the modified comparision in the "Simplified comparison" case. That looked misleading. llvm-svn: 244264
* Convert a bunch of loops to foreach. NFC.Pete Cooper2015-08-068-23/+16
| | | | | | | | After r244074, we now have a successors() method to iterate over all the successors of a TerminatorInst. This commit changes a bunch of eligible loops to use it. llvm-svn: 244260
* Rename inst_range() to instructions() for consistency. NFCNico Rieck2015-08-066-9/+9
| | | | llvm-svn: 244248
* [Reassociation] Fix miscompile for va_arg arguments.Quentin Colombet2015-08-061-22/+2
| | | | | | | | | | | | | | | | iisUnmovableInstruction() had a list of instructions hardcoded which are considered unmovable. The list lacked (at least) an entry for the va_arg and cmpxchg instructions. Fix this by introducing a new Instruction::mayBeMemoryDependent() instead of maintaining another instruction list. Patch by Matthias Braun <matze@braunis.de>. Differential Revision: http://reviews.llvm.org/D11577 rdar://problem/22118647 llvm-svn: 244244
* [PM/AA] Hoist the interface for BasicAA into a header file.Chandler Carruth2015-08-062-0/+2
| | | | | | | | | | | | | This is the first mechanical step in preparation for making this and all the other alias analysis passes available to the new pass manager. I'm factoring out all the totally boring changes I can so I'm moving code around here with no other changes. I've even minimized the formatting churn. I'll reformat and freshen comments on the interface now that its located in the right place so that the substantive changes don't triger this. llvm-svn: 244197
* [PM/AA] Simplify the AliasAnalysis interface by removing a wrapperChandler Carruth2015-08-061-1/+1
| | | | | | | | | | | | | | | | around a DataLayout interface in favor of directly querying DataLayout. This wrapper specifically helped handle the case where this no DataLayout, but LLVM now requires it simplifynig all of this. I've updated callers to directly query DataLayout. This in turn exposed a bunch of places where we should have DataLayout readily available but don't which I've fixed. This then in turn exposed that we were passing DataLayout around in a bunch of arguments rather than making it readily available so I've also fixed that. No functionality changed. llvm-svn: 244189
* ValueMapper: Rotate distinct node remapping algorithmDuncan P. N. Exon Smith2015-08-051-34/+40
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Rotate the algorithm for remapping distinct nodes in order to simplify how uniquing cycles get resolved. This removes some of the recursion, and, most importantly, exposes all uniquing cycles at the top-level. Besides being a little more efficient -- temporary MDNodes won't live as long -- the clearer logic should help protect against bugs like those fixed in r243961 and r243976. What are uniquing cycles? Why do they present challenges when remapping metadata? !0 = !{!1} !1 = !{!0} !0 and !1 form a simple uniquing cycle. When remapping from one metadata graph to another, every uniquing cycle gets "duplicated" through a dance: !0-temp = !{!1?} ; map(!0): clone !0, VM[!0] = !0-temp !1-temp = !{!0?} ; ..map(!1): clone !1, VM[!1] = !1-temp !1-temp = !{!0-temp} ; ..map(!1): remap !1's operands !2 = !{!0-temp} ; ..map(!1): uniquify: !1-temp => !2 !0-temp = !{!2} ; map(!0): remap !0's operands !3 = !{!2} ; map(!0): uniquify: !0-temp => !3 ; Result !2 = !{!3} !3 = !{!2} (In the two "uniquify" steps above, the operands of !X-temp are compared to the operands of !X. If they're the same, then !X-temp gets RAUW'ed to !X; if they're different, then !X-temp is promoted to a new unique node. The latter case always hits in for uniquing cycles, so we duplicate all the nodes involved.) Why is this a problem? Uniquable Metadata nodes that have temporary node as transitive operands keep RAUW support until the temporary nodes get finalized. With non-cycles, this happens automatically: when a uniquable node's count of unresolved operands drops to zero, it immediately sheds its own RAUW support (possibly triggering the same in any node that references it). However, uniquing cycles create a reference cycle, and uniqued nodes that transitively reference a uniquing cycle are "stuck" in an unresolved state until someone calls `MDNode::resolveCycles()` on a node in the unresolved subgraph. Distinct nodes should help here (and mostly do): since they aren't uniqued anywhere, they are guaranteed not to be RAUW'ed. They effectively form a barrier between uniqued nodes, breaking some uniquing cycles, and shielding uniqued nodes from uniquing cycles. Unfortunately, with this barrier in place, the unresolved subgraph(s) can be disjoint from the top-level node. The mapping algorithm needs to find at least one representative from each disjoint subgraph. But which nodes are *stuck*, and which will get resolved automatically? And which nodes are in the unresolved subgraph? The old logic was conservative. This commit rotates the logic for distinct nodes, so that we have access to unresolved nodes at the top-level call to `llvm::MapMetadata()`. Each time we return to the top-level, we know that all temporaries have been RAUW'ed away. Here, it's safe (and necessary) to call `resolveCycles()` immediately on unresolved operands. This should also perform better than the old algorithm. The recursion stack is shorter, temporary nodes don't live as long, and there are fewer tracking references to unresolved nodes. As the debug info graph introduces more 'distinct' nodes, remapping should incrementally get cheaper and cheaper. Aside from possible performance improvements (and reduced cruft in the `LLVMContext`), there should be no functionality change here. llvm-svn: 244181
* ValueMapper: Simplify remap() helper function, NFCDuncan P. N. Exon Smith2015-08-051-33/+22
| | | | | | | | | | Rename `remap()` to `remapOperands()`, and restrict its contract to remapping operands. Previously, it also called `mapToMetadata()`, but this logic is hard to reason about externally. In particular, this refactors `mapUniquedNode()` to avoid redundant mapping calls, taking advantage of the RAUWs that are already in place. llvm-svn: 244168
* [LoopUnswitch] Preserve make.implicit metadata for unswitched conditionsChen Li2015-08-051-0/+1
| | | | | | | | | | | | Summary: This patch adds support to preserve make.implicit metadata for unswitched conditions in loop pre-header. Reviewers: sanjoy, weimingz Subscribers: mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D11769 llvm-svn: 244132
* -Wdeprecated cleanup: Make CallGraph movable by default by using unique_ptr ↵David Blaikie2015-08-051-2/+2
| | | | | | | | | | members rather than raw pointers. The only place that tries to return a CallGraph by value (CallGraphAnalysis::run) doesn't seem to be used right now, but it's a reasonable bit of cleanup anyway. llvm-svn: 244122
OpenPOWER on IntegriCloud