summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Transforms
Commit message (Collapse)AuthorAgeFilesLines
* Minor unroll pass refacoring.Evgeny Stupachenko2016-11-091-35/+29
| | | | | | | | | | | | | | | | Summary: Unrolled Loop Size calculations moved to a function. Constant representing number of optimized instructions when "back edge" becomes "fall through" replaced with variable. Some comments added. Reviewers: mzolotukhin Differential Revision: http://reviews.llvm.org/D21719 From: Evgeny Stupachenko <evstupac@gmail.com> llvm-svn: 286389
* Bitcode: Change the materializer interface to return llvm::Error.Peter Collingbourne2016-11-091-8/+22
| | | | | | Differential Revision: https://reviews.llvm.org/D26439 llvm-svn: 286382
* Remove TimeValue usage from Scalar/SROA.cpp. NFC.Pavel Labath2016-11-091-2/+3
| | | | llvm-svn: 286361
* [ARM] Loop Strength Reduction crashes when targeting ARM or Thumb.Alexandros Lamprineas2016-11-091-3/+3
| | | | | | | | | | | Scalar Evolution asserts when not all the operands of an Add Recurrence Expression are loop invariants. Loop Strength Reduction should only create affine Add Recurrences, so that both the start and the step of the expression are loop invariants. Differential Revision: https://reviews.llvm.org/D26185 llvm-svn: 286347
* Enable Loop Sink pass for functions that has profile.Dehao Chen2016-11-092-4/+9
| | | | | | | | | | | | Summary: For functions with profile data, we are confident that loop sink will be optimal in sinking code. Reviewers: davidxl, hfinkel Subscribers: mehdi_amini, mzolotukhin, llvm-commits Differential Revision: https://reviews.llvm.org/D26155 llvm-svn: 286325
* [InstCombine] fix profitability equation for max-of-nots transformSanjay Patel2016-11-091-7/+6
| | | | | | | | | | As the test change shows, we can increase the critical path by adding a 'not' instruction, so make sure that we're actually removing an instruction if we do this transform. This transform could also cause us to miss folds of min/max pairs. llvm-svn: 286315
* [InstCombine] reduce indentation; NFCSanjay Patel2016-11-081-23/+20
| | | | llvm-svn: 286314
* [asan] Speed up compilation of large C++ stringmaps (tons of allocas) with ASanKuba Brecka2016-11-081-6/+12
| | | | | | | | This addresses PR30746, <https://llvm.org/bugs/show_bug.cgi?id=30746>. The ASan pass iterates over entry-block instructions and checks each alloca whether it's in NonInstrumentedStaticAllocaVec, which is apparently slow. This patch gathers the instructions to move during visitAllocaInst. Differential Revision: https://reviews.llvm.org/D26380 llvm-svn: 286296
* [LoopDistribute] Preserve GlobalsAA also in the new Pass Manager.Davide Italiano2016-11-081-0/+1
| | | | | | Differential Revision: https://reviews.llvm.org/D26408 llvm-svn: 286280
* [LibcallsShrinkWrap] This pass doesn't preserve the CFG.Davide Italiano2016-11-081-2/+5
| | | | | | | | | | For example, it invalidates the domtree, causing assertions in later passes which need dominator infos. Make it preserve GlobalsAA, as suggested by Eli. Differential Revision: https://reviews.llvm.org/D26381 llvm-svn: 286271
* Fix typo in comment. NFC.Chad Rosier2016-11-081-2/+2
| | | | llvm-svn: 286270
* Remove unused include. NFC.Chad Rosier2016-11-081-1/+0
| | | | llvm-svn: 286250
* Use the last 7 bits to represent the discriminator to fit it in 1 byte ↵Dehao Chen2016-11-081-5/+7
| | | | | | | | ULEB128 (NFC). From experiments, discriminator is rarely greater than 127. Here we enforce it to be no greater than 127 so that it will always fit in 1 byte. llvm-svn: 286245
* [JumpThreading] Unfold selects that depend on the same conditionPablo Barrio2016-11-081-38/+77
| | | | | | | | | | | | | | | | | | Summary: These are good candidates for jump threading. This enables later opts (such as InstCombine) to combine instructions from the selects with instructions out of the selects. SimplifyCFG will fold the select again if unfolding wasn't worth it. Patch by James Molloy and Pablo Barrio. Reviewers: rengolin, haicheng, sebpop Subscribers: jojo, jmolloy, llvm-commits Differential Revision: https://reviews.llvm.org/D26391 llvm-svn: 286236
* [TRE] Remove dead codeSanjoy Das2016-11-071-1/+0
| | | | | | Address review by Eli Friedman on rL286147. llvm-svn: 286165
* Reset debug loc to OldInduction in ↵Dehao Chen2016-11-071-1/+3
| | | | | | | | InnerLoopVectorizer::createInductionVariable. (NFC) This is to prevent SetInsertionPoint from setting debug loc to Latch->getTerminator(). llvm-svn: 286159
* Avoid tail recursion elimination across calls with operand bundlesSanjoy Das2016-11-071-1/+2
| | | | | | | | | | | | | | | | | | | | Summary: In some specific scenarios with well understood operand bundle types (like `"deopt"`) it may be possible to go ahead and convert recursion to iteration, but TailRecursionElimination does not have that logic today so avoid doing the right thing for now. I need some input on whether `"funclet"` operand bundles should also block tail recursion elimination. If not, I'll allow TRE across calls with `"funclet"` operand bundles and add a test case. Reviewers: rnk, majnemer, nlewycky, ahatanak Subscribers: mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D26270 llvm-svn: 286147
* Use -fsanitize-recover instead of -mllvm -msan-keep-going.Evgeniy Stepanov2016-11-071-9/+11
| | | | | | | | | | | | Summary: Use -fsanitize-recover instead of -mllvm -msan-keep-going. Reviewers: eugenis Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D26352 llvm-svn: 286145
* [tsan] Cast floating-point types correctly when instrumenting atomic ↵Kuba Brecka2016-11-071-17/+6
| | | | | | | | | | accesses, LLVM part Although rare, atomic accesses to floating-point types seem to be valid, i.e. `%a = load atomic float ...`. The TSan instrumentation pass however tries to emit inttoptr, which is incorrect, we should use a bitcast here. Anyway, IRBuilder already has a convenient helper function for this. Differential Revision: https://reviews.llvm.org/D26266 llvm-svn: 286135
* [MemCpyOpt] Don't emit IR in an unspecified orderBenjamin Kramer2016-11-071-4/+4
| | | | | | | | | Argument evaluation order is one of the edge cases where Clang differs from GCC, yielding different IR depending on which compiler LLVM was built with. Make the order deterministic and tune the test to actually verify the order instead of trying to hide it. llvm-svn: 286126
* Fix 80-column violations. NFC.Chad Rosier2016-11-071-3/+7
| | | | llvm-svn: 286117
* [InstCombine] allow splat vector folds in adjustMinMax() (retry r285732)Sanjay Patel2016-11-071-14/+12
| | | | | | | | This was reverted at r285866 because there was a crash handling a scalar select of vectors. I added a check for that pattern and a test case based on the example provided in the post-commit thread for r285732. llvm-svn: 286113
* [LoopStrengthReduce] Don't use a DenseSet<int64_t> when we might add any ↵Justin Lebar2016-11-051-1/+6
| | | | | | | | | | | | | | | | | | | | | | | | | valid int64_t to the set. Summary: SmallSetVector uses DenseSet, but that means we need to reserve some values for the empty and tombstone keys. It seems to me we should have a general way to let us store full-range ints inside of DenseSets, and furthermore that we probably shouldn't silently let you add ints into DenseSets without explicitly promising that they're in range. But that's a battle for another day; for now, just fix this code, since we currently do something Very Bad when compiling ffmpeg. Fixes PR30914. Reviewers: jeremyhu Subscribers: llvm-commits, mzolotukhin Differential Revision: https://reviews.llvm.org/D26323 llvm-svn: 286038
* Only log the visit of a return instruction if we in fact found a returnChandler Carruth2016-11-041-3/+2
| | | | | | | | | | | | | | | | | | instruction. This avoids dereferencing null in the debug logging if the instruction was not in fact a return instruction. This potential bug was found by PVS-Studio. This actually fixes the last of the "dereferenced a pointer before checking it for null" reports in the recent PVS-Studio run. However, there are quite a few reports of this nature that I did not do anything to fix because they are pretty glaring false positives. They usually took the form of quite clear correlated checks or a check made in a separate function. I've even added asserts anywhere this correlation wasn't pretty obvious and fundamental to the code. llvm-svn: 285988
* Fix typoXinliang David Li2016-11-041-1/+1
| | | | llvm-svn: 285978
* Fix a bug found by inspection by PVS-Studio.Chandler Carruth2016-11-031-1/+1
| | | | | | | | | | | This condition is trivially always true prior to the change. The comment at the call site makes it clear that we expect *all* of these to be '=', 'S', or 'I' so fix the code. We have a bug I will update to track the fact that Clang doesn't warn on this: http://llvm.org/PR13101 llvm-svn: 285930
* [ThinLTO] Handle distributed backend case when doing renamingTeresa Johnson2016-11-031-4/+19
| | | | | | | | | | | | | | | | | | | | | | | Summary: The recent change I made to consult the summary when deciding whether to rename (to handle inline asm) in r285513 broke the distributed build case. In a distributed backend we will only have a portion of the combined index, specifically for imported modules we only have the summaries for any imported definitions. When renaming on import we were asserting because no summary entry was found for a local reference being linked in (def wasn't imported). We only need to consult the summary for a renaming decision for the exporting module. For imports, we would have prevented importing any references to NoRename values already. Reviewers: mehdi_amini Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D26250 llvm-svn: 285871
* Revert "[InstCombine] allow splat vector folds in adjustMinMax()"Greg Bedwell2016-11-021-10/+14
| | | | | | | | | | | | | | | | | This reverts commit r285732. This change introduced a new assertion failure in the following testcase at -O2: typedef short __v8hi __attribute__((__vector_size__(16))); __v8hi foo(__v8hi &V1, __v8hi &V2, unsigned mask) { __v8hi Result = V1; if (mask & 0x80) Result[0] = V2[0]; return Result; } llvm-svn: 285866
* DCE math library calls with a constant operand.Eli Friedman2016-11-021-0/+4
| | | | | | | | | | | On platforms which use -fmath-errno, math libcalls without any uses require some extra checks to figure out if they are actually dead. Fixes https://llvm.org/bugs/show_bug.cgi?id=30464 . Differential Revision: https://reviews.llvm.org/D25970 llvm-svn: 285857
* [Reassociate] Skip analysis of dead code to avoid infinite loop.Bjorn Pettersson2016-11-021-4/+13
| | | | | | | | | | | | | | | | | | | | | | | | Summary: It was detected that the reassociate pass could enter an inifite loop when analysing dead code. Simply skipping to analyse basic blocks that are dead avoids such problems (and as a side effect we avoid spending time on optimising dead code). The solution is using the same Reverse Post Order ordering of the basic blocks when doing the optimisations, as when building the precalculated rank map. A nice side-effect of this solution is that we now know that we only try to do optimisations for blocks with ranked instructions. Fixes https://llvm.org/bugs/show_bug.cgi?id=30818 Reviewers: llvm-commits, davide, eli.friedman, mehdi_amini Subscribers: dberlin Differential Revision: https://reviews.llvm.org/D26154 llvm-svn: 285793
* [MemorySSA] Tighten up types to make our API prettier. NFC.George Burgess IV2016-11-012-17/+15
| | | | | | | | Patch by bryant. Differential Revision: https://reviews.llvm.org/D26126 llvm-svn: 285750
* [InstCombine] allow splat vector folds in adjustMinMax()Sanjay Patel2016-11-011-14/+10
| | | | llvm-svn: 285732
* [InstCombine] Fold nuw left-shifts in `ugt`/`ule` comparisons.Sanjay Patel2016-11-011-0/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This transforms %a = shl nuw %x, c1 %b = icmp {ugt|ule} %a, c0 into %b = icmp {ugt|ule} %x, (c0 >> c1) z3: (declare-const x (_ BitVec 64)) (declare-const c0 (_ BitVec 64)) (declare-const c1 (_ BitVec 64)) (push) (assert (= x (bvlshr (bvshl x c1) c1))) ; nuw (assert (not (= (bvugt (bvshl x c1) c0) (bvugt x (bvlshr c0 c1))))) (check-sat) (get-model) (pop) (push) (assert (= x (bvlshr (bvshl x c1) c1))) ; nuw (assert (not (= (bvule (bvshl x c1) c0) (bvule x (bvlshr c0 c1))))) (check-sat) (get-model) (pop) Patch by bryant! Differential Revision: https://reviews.llvm.org/D25913 llvm-svn: 285729
* [InstCombine] clean up adjustMinMax(); NFCISanjay Patel2016-11-011-92/+87
| | | | | | | | | 1. Change param names for readability 2. Change pointer param to ref 3. Early exit to reduce indent 4. Change switch to if/else llvm-svn: 285718
* [InstCombine] add helper function for adjustMinMax(); NFCISanjay Patel2016-11-011-6/+19
| | | | | | This is just a cut and paste; clean-up and enhancements to follow. llvm-svn: 285715
* [InstCombine] Folding of shifts by the sum of positive valuesSimon Pilgrim2016-11-011-1/+10
| | | | | | | | | | | | | | | This patch introduces the combine: (C1 shift (A add C2)) -> ((C1 shift C2) shift A) iff A and C2 are both positive If both A and C2 are know to be positive then we can safely split into 2 shifts, permitting the folding of the Inner shift. Fix for the spec benchmark case mentioned by @nadav on PR15141 (assuming we can prove that the inputs as positive). Differential Revision: https://reviews.llvm.org/D26000 llvm-svn: 285696
* Fix a typo.Evgeniy Stepanov2016-10-311-1/+1
| | | | | | Found with PVS-Studio here: http://www.viva64.com/en/b/0446/ llvm-svn: 285652
* [asan] Move instrumented null-terminated strings to a special section, LLVM partKuba Brecka2016-10-311-0/+8
| | | | | | | | On Darwin, simple C null-terminated constant strings normally end up in the __TEXT,__cstring section of the resulting Mach-O binary. When instrumented with ASan, these strings are transformed in a way that they cannot be in __cstring (the linker unifies the content of this section and strips extra NUL bytes, which would break instrumentation), and are put into a generic __const section. This breaks some of the tools that we have: Some tools need to scan all C null-terminated strings in Mach-O binaries, and scanning all the contents of __const has a large performance penalty. This patch instead introduces a special section, __asan_cstring which will now hold the instrumented null-terminated strings. Differential Revision: https://reviews.llvm.org/D25026 llvm-svn: 285619
* Second attempt at r285517.Dorit Nuzman2016-10-311-11/+65
| | | | llvm-svn: 285568
* Revert r285517 due to build failures.Dorit Nuzman2016-10-301-59/+1
| | | | llvm-svn: 285518
* [LoopVectorize] Make interleaved-accesses analysis less conservative aboutDorit Nuzman2016-10-301-1/+59
| | | | | | | | | | | | | | | | | | | | | possible pointer-wrap-around concerns, in some cases. Before this patch, collectConstStridedAccesses (part of interleaved-accesses analysis) called getPtrStride with [Assume=false, ShouldCheckWrap=true] when examining all candidate pointers. This is too conservative. Instead, this patch makes collectConstStridedAccesses use an optimistic approach, calling getPtrStride with [Assume=true, ShouldCheckWrap=false], and then, once the candidate interleave groups have been formed, revisits the pointer-wrapping analysis but only where it matters: namely, in groups that have gaps, and where the gaps are not at the very end of the group (in which case the loop is peeled). This second time getPtrStride is called with [Assume=false, ShouldCheckWrap=true], but this could further be improved to using Assume=true, once we also add the logic to track that we are not going to meet the scev runtime checks threshold. Differential Revision: https://reviews.llvm.org/D25276 llvm-svn: 285517
* [ThinLTO] Use per-summary flag to prevent exporting locals used in inline asmTeresa Johnson2016-10-301-8/+0
| | | | | | | | | | | | | | | | | | | | | Summary: Instead of using the workaround of suppressing the entire index for modules that call inline asm that may reference locals, use the NoRename flag on the summary for any locals in the llvm.used set, and add a reference edge from any functions containing inline asm. This avoids issues from having no summaries despite the module defining global values, which was preventing more aggressive index-based optimization. It will be followed by a subsequent patch to make a similar fix for local references in module level asm (to fix PR30610). Reviewers: mehdi_amini Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D26121 llvm-svn: 285513
* [ThinLTO] Rename doPromoteLocalToGlobal to shouldPromoteLocalToGlobal (NFC)Teresa Johnson2016-10-291-3/+3
| | | | | | Rename as suggested in code review for D26063. llvm-svn: 285508
* [ThinLTO] Use NoPromote flag in summary during promotionTeresa Johnson2016-10-291-13/+19
| | | | | | | | | | | | | | | | Summary: Replace the check of whether a GV has a section with the flag check in the summary. This is in preparation for using the NoPromote flag to convey other situations when we can't promote (e.g. locals used in inline asm). Reviewers: mehdi_amini Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D26063 llvm-svn: 285507
* [InstCombine] re-use bitcasted compare operands in selects (PR28001)Sanjay Patel2016-10-291-0/+50
| | | | | | | | | | | These mixed bitcast patterns show up with SSE/AVX intrinsics because we bitcast function parameters to <2 x i64>. The bitcasts obfuscate the expected min/max forms as shown in PR28001: https://llvm.org/bugs/show_bug.cgi?id=28001#c6 Differential Revision: https://reviews.llvm.org/D25943 llvm-svn: 285495
* Don't leave unused divs/rems sitting around in BypassSlowDivision.Justin Lebar2016-10-281-0/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This "pass" eagerly creates div and rem instructions even when only one is needed -- it relies on a later pass (machine DCE?) to clean them up. This is problematic not just from a cleanliness perspective (this pass is running during CodeGenPrepare, so should leave the IR in a better state), but it also creates a problem for instruction selection. If we always have a div+rem, isel will always select a divrem instruction (if possible), even when a single div or rem would do. Specifically, in NVPTX, we want to compute rem from the output of div, if available. But if a div is not available, we want to leave the rem alone. This transformation is overeager if div is always available. Because this code runs as part of CodeGenPrepare, it's nontrivial to write a test for this change. But this will effectively be tested by a later patch which adds the aforementioned change to NVPTX isel. Reviewers: tra Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D26088 llvm-svn: 285460
* Don't claim the udiv created in BypassSlowDivision is exact.Justin Lebar2016-10-281-2/+1
| | | | | | | | | | | | | | | | | | | Summary: In BypassSlowDivision's short-dividend path, we would create e.g. udiv exact i32 %a, %b "exact" here means that we are asserting that %a is a multiple of %b. But we have no reason to believe this must be true -- this is just a bug, as far as I can tell. Reviewers: tra Subscribers: jholewinski, llvm-commits Differential Revision: https://reviews.llvm.org/D26097 llvm-svn: 285459
* SpeculativeExecution: Allow speculating more inst typesMatt Arsenault2016-10-281-0/+18
| | | | | | | Partial step towards removing the whitelist and only using TTI's cost. llvm-svn: 285438
* [MemorySSA] Add const to getClobberingMemoryAccess.George Burgess IV2016-10-281-3/+3
| | | | | | | | Thanks to bryant for the patch! Differential Revision: https://reviews.llvm.org/D26086 llvm-svn: 285432
* [LCSSA] Perform LCSSA verification only for the current loop nest.Igor Laevsky2016-10-282-4/+30
| | | | | | | | | Now LPPassManager will run LCSSA verification only for the top-level loop which was processed on the current iteration. Differential Revision: https://reviews.llvm.org/D25873 llvm-svn: 285394
OpenPOWER on IntegriCloud