summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Transforms
Commit message (Collapse)AuthorAgeFilesLines
* [cfi] Implement cfi-icall using inline assembly.Evgeniy Stepanov2016-11-111-75/+159
| | | | | | | | | | | | | | | | | | | | | | | | | The current implementation is emitting a global constant that happens to evaluate to the same bytes + relocation as a jump instruction on X86. This does not work for PIE executables and shared libraries though, because we end up with a wrong relocation type. And it has no chance of working on ARM/AArch64 which use different relocation types for jump instructions (R_ARM_JUMP24) that is never generated for data. This change replaces the constant with module-level inline assembly followed by a hidden declaration of the jump table. Works fine for ARM/AArch64, but has some drawbacks. * Extra symbols are added to the static symbol table, which inflate the size of the unstripped binary a little. Stripped binaries are not affected. This happens because jump table declarations must be external (because their body is in the inline asm). * Original functions that were anonymous are now named <original name>.cfi, and it affects symbolization sometimes. This is necessary because the only user of these functions is the (inline asm) jump table, so they had to be added to @llvm.used, which does not allow unnamed functions. llvm-svn: 286611
* [InstCombine] fix formatting of FoldOpIntoSelect(); NFCISanjay Patel2016-11-111-41/+43
| | | | llvm-svn: 286604
* Add comments about why we put LoopSink pass at the very late stage.Dehao Chen2016-11-101-0/+4
| | | | llvm-svn: 286480
* [InstCombine] avoid infinite loop from shuffle-extract-insert sequence (PR30923)Sanjay Patel2016-11-101-0/+8
| | | | | | | | | | | | Removing the limitation in visitInsertElementInst() causes several regressions because we're not prepared to fold sequences of shuffles or inserts and extracts separated by shuffles. Fixing that appears to be a difficult mission because we are purposely trying to avoid creating shuffles with arbitrary shuffle masks because some targets may choke on those. https://llvm.org/bugs/show_bug.cgi?id=30923 llvm-svn: 286423
* Preserve assumption cache in loop-rotate.Eli Friedman2016-11-091-0/+4
| | | | | | | | | | | No testcase included because I can't figure out how to reduce it. (It's easy to write a testcase where rotation clones an assume, but that doesn't actually seem to trigger the crash in opt on its own; maybe an issue with the laziness?) Differential Revision: https://reviews.llvm.org/D26434 llvm-svn: 286410
* Minor unroll pass refacoring.Evgeny Stupachenko2016-11-091-35/+29
| | | | | | | | | | | | | | | | Summary: Unrolled Loop Size calculations moved to a function. Constant representing number of optimized instructions when "back edge" becomes "fall through" replaced with variable. Some comments added. Reviewers: mzolotukhin Differential Revision: http://reviews.llvm.org/D21719 From: Evgeny Stupachenko <evstupac@gmail.com> llvm-svn: 286389
* Bitcode: Change the materializer interface to return llvm::Error.Peter Collingbourne2016-11-091-8/+22
| | | | | | Differential Revision: https://reviews.llvm.org/D26439 llvm-svn: 286382
* Remove TimeValue usage from Scalar/SROA.cpp. NFC.Pavel Labath2016-11-091-2/+3
| | | | llvm-svn: 286361
* [ARM] Loop Strength Reduction crashes when targeting ARM or Thumb.Alexandros Lamprineas2016-11-091-3/+3
| | | | | | | | | | | Scalar Evolution asserts when not all the operands of an Add Recurrence Expression are loop invariants. Loop Strength Reduction should only create affine Add Recurrences, so that both the start and the step of the expression are loop invariants. Differential Revision: https://reviews.llvm.org/D26185 llvm-svn: 286347
* Enable Loop Sink pass for functions that has profile.Dehao Chen2016-11-092-4/+9
| | | | | | | | | | | | Summary: For functions with profile data, we are confident that loop sink will be optimal in sinking code. Reviewers: davidxl, hfinkel Subscribers: mehdi_amini, mzolotukhin, llvm-commits Differential Revision: https://reviews.llvm.org/D26155 llvm-svn: 286325
* [InstCombine] fix profitability equation for max-of-nots transformSanjay Patel2016-11-091-7/+6
| | | | | | | | | | As the test change shows, we can increase the critical path by adding a 'not' instruction, so make sure that we're actually removing an instruction if we do this transform. This transform could also cause us to miss folds of min/max pairs. llvm-svn: 286315
* [InstCombine] reduce indentation; NFCSanjay Patel2016-11-081-23/+20
| | | | llvm-svn: 286314
* [asan] Speed up compilation of large C++ stringmaps (tons of allocas) with ASanKuba Brecka2016-11-081-6/+12
| | | | | | | | This addresses PR30746, <https://llvm.org/bugs/show_bug.cgi?id=30746>. The ASan pass iterates over entry-block instructions and checks each alloca whether it's in NonInstrumentedStaticAllocaVec, which is apparently slow. This patch gathers the instructions to move during visitAllocaInst. Differential Revision: https://reviews.llvm.org/D26380 llvm-svn: 286296
* [LoopDistribute] Preserve GlobalsAA also in the new Pass Manager.Davide Italiano2016-11-081-0/+1
| | | | | | Differential Revision: https://reviews.llvm.org/D26408 llvm-svn: 286280
* [LibcallsShrinkWrap] This pass doesn't preserve the CFG.Davide Italiano2016-11-081-2/+5
| | | | | | | | | | For example, it invalidates the domtree, causing assertions in later passes which need dominator infos. Make it preserve GlobalsAA, as suggested by Eli. Differential Revision: https://reviews.llvm.org/D26381 llvm-svn: 286271
* Fix typo in comment. NFC.Chad Rosier2016-11-081-2/+2
| | | | llvm-svn: 286270
* Remove unused include. NFC.Chad Rosier2016-11-081-1/+0
| | | | llvm-svn: 286250
* Use the last 7 bits to represent the discriminator to fit it in 1 byte ↵Dehao Chen2016-11-081-5/+7
| | | | | | | | ULEB128 (NFC). From experiments, discriminator is rarely greater than 127. Here we enforce it to be no greater than 127 so that it will always fit in 1 byte. llvm-svn: 286245
* [JumpThreading] Unfold selects that depend on the same conditionPablo Barrio2016-11-081-38/+77
| | | | | | | | | | | | | | | | | | Summary: These are good candidates for jump threading. This enables later opts (such as InstCombine) to combine instructions from the selects with instructions out of the selects. SimplifyCFG will fold the select again if unfolding wasn't worth it. Patch by James Molloy and Pablo Barrio. Reviewers: rengolin, haicheng, sebpop Subscribers: jojo, jmolloy, llvm-commits Differential Revision: https://reviews.llvm.org/D26391 llvm-svn: 286236
* [TRE] Remove dead codeSanjoy Das2016-11-071-1/+0
| | | | | | Address review by Eli Friedman on rL286147. llvm-svn: 286165
* Reset debug loc to OldInduction in ↵Dehao Chen2016-11-071-1/+3
| | | | | | | | InnerLoopVectorizer::createInductionVariable. (NFC) This is to prevent SetInsertionPoint from setting debug loc to Latch->getTerminator(). llvm-svn: 286159
* Avoid tail recursion elimination across calls with operand bundlesSanjoy Das2016-11-071-1/+2
| | | | | | | | | | | | | | | | | | | | Summary: In some specific scenarios with well understood operand bundle types (like `"deopt"`) it may be possible to go ahead and convert recursion to iteration, but TailRecursionElimination does not have that logic today so avoid doing the right thing for now. I need some input on whether `"funclet"` operand bundles should also block tail recursion elimination. If not, I'll allow TRE across calls with `"funclet"` operand bundles and add a test case. Reviewers: rnk, majnemer, nlewycky, ahatanak Subscribers: mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D26270 llvm-svn: 286147
* Use -fsanitize-recover instead of -mllvm -msan-keep-going.Evgeniy Stepanov2016-11-071-9/+11
| | | | | | | | | | | | Summary: Use -fsanitize-recover instead of -mllvm -msan-keep-going. Reviewers: eugenis Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D26352 llvm-svn: 286145
* [tsan] Cast floating-point types correctly when instrumenting atomic ↵Kuba Brecka2016-11-071-17/+6
| | | | | | | | | | accesses, LLVM part Although rare, atomic accesses to floating-point types seem to be valid, i.e. `%a = load atomic float ...`. The TSan instrumentation pass however tries to emit inttoptr, which is incorrect, we should use a bitcast here. Anyway, IRBuilder already has a convenient helper function for this. Differential Revision: https://reviews.llvm.org/D26266 llvm-svn: 286135
* [MemCpyOpt] Don't emit IR in an unspecified orderBenjamin Kramer2016-11-071-4/+4
| | | | | | | | | Argument evaluation order is one of the edge cases where Clang differs from GCC, yielding different IR depending on which compiler LLVM was built with. Make the order deterministic and tune the test to actually verify the order instead of trying to hide it. llvm-svn: 286126
* Fix 80-column violations. NFC.Chad Rosier2016-11-071-3/+7
| | | | llvm-svn: 286117
* [InstCombine] allow splat vector folds in adjustMinMax() (retry r285732)Sanjay Patel2016-11-071-14/+12
| | | | | | | | This was reverted at r285866 because there was a crash handling a scalar select of vectors. I added a check for that pattern and a test case based on the example provided in the post-commit thread for r285732. llvm-svn: 286113
* [LoopStrengthReduce] Don't use a DenseSet<int64_t> when we might add any ↵Justin Lebar2016-11-051-1/+6
| | | | | | | | | | | | | | | | | | | | | | | | | valid int64_t to the set. Summary: SmallSetVector uses DenseSet, but that means we need to reserve some values for the empty and tombstone keys. It seems to me we should have a general way to let us store full-range ints inside of DenseSets, and furthermore that we probably shouldn't silently let you add ints into DenseSets without explicitly promising that they're in range. But that's a battle for another day; for now, just fix this code, since we currently do something Very Bad when compiling ffmpeg. Fixes PR30914. Reviewers: jeremyhu Subscribers: llvm-commits, mzolotukhin Differential Revision: https://reviews.llvm.org/D26323 llvm-svn: 286038
* Only log the visit of a return instruction if we in fact found a returnChandler Carruth2016-11-041-3/+2
| | | | | | | | | | | | | | | | | | instruction. This avoids dereferencing null in the debug logging if the instruction was not in fact a return instruction. This potential bug was found by PVS-Studio. This actually fixes the last of the "dereferenced a pointer before checking it for null" reports in the recent PVS-Studio run. However, there are quite a few reports of this nature that I did not do anything to fix because they are pretty glaring false positives. They usually took the form of quite clear correlated checks or a check made in a separate function. I've even added asserts anywhere this correlation wasn't pretty obvious and fundamental to the code. llvm-svn: 285988
* Fix typoXinliang David Li2016-11-041-1/+1
| | | | llvm-svn: 285978
* Fix a bug found by inspection by PVS-Studio.Chandler Carruth2016-11-031-1/+1
| | | | | | | | | | | This condition is trivially always true prior to the change. The comment at the call site makes it clear that we expect *all* of these to be '=', 'S', or 'I' so fix the code. We have a bug I will update to track the fact that Clang doesn't warn on this: http://llvm.org/PR13101 llvm-svn: 285930
* [ThinLTO] Handle distributed backend case when doing renamingTeresa Johnson2016-11-031-4/+19
| | | | | | | | | | | | | | | | | | | | | | | Summary: The recent change I made to consult the summary when deciding whether to rename (to handle inline asm) in r285513 broke the distributed build case. In a distributed backend we will only have a portion of the combined index, specifically for imported modules we only have the summaries for any imported definitions. When renaming on import we were asserting because no summary entry was found for a local reference being linked in (def wasn't imported). We only need to consult the summary for a renaming decision for the exporting module. For imports, we would have prevented importing any references to NoRename values already. Reviewers: mehdi_amini Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D26250 llvm-svn: 285871
* Revert "[InstCombine] allow splat vector folds in adjustMinMax()"Greg Bedwell2016-11-021-10/+14
| | | | | | | | | | | | | | | | | This reverts commit r285732. This change introduced a new assertion failure in the following testcase at -O2: typedef short __v8hi __attribute__((__vector_size__(16))); __v8hi foo(__v8hi &V1, __v8hi &V2, unsigned mask) { __v8hi Result = V1; if (mask & 0x80) Result[0] = V2[0]; return Result; } llvm-svn: 285866
* DCE math library calls with a constant operand.Eli Friedman2016-11-021-0/+4
| | | | | | | | | | | On platforms which use -fmath-errno, math libcalls without any uses require some extra checks to figure out if they are actually dead. Fixes https://llvm.org/bugs/show_bug.cgi?id=30464 . Differential Revision: https://reviews.llvm.org/D25970 llvm-svn: 285857
* [Reassociate] Skip analysis of dead code to avoid infinite loop.Bjorn Pettersson2016-11-021-4/+13
| | | | | | | | | | | | | | | | | | | | | | | | Summary: It was detected that the reassociate pass could enter an inifite loop when analysing dead code. Simply skipping to analyse basic blocks that are dead avoids such problems (and as a side effect we avoid spending time on optimising dead code). The solution is using the same Reverse Post Order ordering of the basic blocks when doing the optimisations, as when building the precalculated rank map. A nice side-effect of this solution is that we now know that we only try to do optimisations for blocks with ranked instructions. Fixes https://llvm.org/bugs/show_bug.cgi?id=30818 Reviewers: llvm-commits, davide, eli.friedman, mehdi_amini Subscribers: dberlin Differential Revision: https://reviews.llvm.org/D26154 llvm-svn: 285793
* [MemorySSA] Tighten up types to make our API prettier. NFC.George Burgess IV2016-11-012-17/+15
| | | | | | | | Patch by bryant. Differential Revision: https://reviews.llvm.org/D26126 llvm-svn: 285750
* [InstCombine] allow splat vector folds in adjustMinMax()Sanjay Patel2016-11-011-14/+10
| | | | llvm-svn: 285732
* [InstCombine] Fold nuw left-shifts in `ugt`/`ule` comparisons.Sanjay Patel2016-11-011-0/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This transforms %a = shl nuw %x, c1 %b = icmp {ugt|ule} %a, c0 into %b = icmp {ugt|ule} %x, (c0 >> c1) z3: (declare-const x (_ BitVec 64)) (declare-const c0 (_ BitVec 64)) (declare-const c1 (_ BitVec 64)) (push) (assert (= x (bvlshr (bvshl x c1) c1))) ; nuw (assert (not (= (bvugt (bvshl x c1) c0) (bvugt x (bvlshr c0 c1))))) (check-sat) (get-model) (pop) (push) (assert (= x (bvlshr (bvshl x c1) c1))) ; nuw (assert (not (= (bvule (bvshl x c1) c0) (bvule x (bvlshr c0 c1))))) (check-sat) (get-model) (pop) Patch by bryant! Differential Revision: https://reviews.llvm.org/D25913 llvm-svn: 285729
* [InstCombine] clean up adjustMinMax(); NFCISanjay Patel2016-11-011-92/+87
| | | | | | | | | 1. Change param names for readability 2. Change pointer param to ref 3. Early exit to reduce indent 4. Change switch to if/else llvm-svn: 285718
* [InstCombine] add helper function for adjustMinMax(); NFCISanjay Patel2016-11-011-6/+19
| | | | | | This is just a cut and paste; clean-up and enhancements to follow. llvm-svn: 285715
* [InstCombine] Folding of shifts by the sum of positive valuesSimon Pilgrim2016-11-011-1/+10
| | | | | | | | | | | | | | | This patch introduces the combine: (C1 shift (A add C2)) -> ((C1 shift C2) shift A) iff A and C2 are both positive If both A and C2 are know to be positive then we can safely split into 2 shifts, permitting the folding of the Inner shift. Fix for the spec benchmark case mentioned by @nadav on PR15141 (assuming we can prove that the inputs as positive). Differential Revision: https://reviews.llvm.org/D26000 llvm-svn: 285696
* Fix a typo.Evgeniy Stepanov2016-10-311-1/+1
| | | | | | Found with PVS-Studio here: http://www.viva64.com/en/b/0446/ llvm-svn: 285652
* [asan] Move instrumented null-terminated strings to a special section, LLVM partKuba Brecka2016-10-311-0/+8
| | | | | | | | On Darwin, simple C null-terminated constant strings normally end up in the __TEXT,__cstring section of the resulting Mach-O binary. When instrumented with ASan, these strings are transformed in a way that they cannot be in __cstring (the linker unifies the content of this section and strips extra NUL bytes, which would break instrumentation), and are put into a generic __const section. This breaks some of the tools that we have: Some tools need to scan all C null-terminated strings in Mach-O binaries, and scanning all the contents of __const has a large performance penalty. This patch instead introduces a special section, __asan_cstring which will now hold the instrumented null-terminated strings. Differential Revision: https://reviews.llvm.org/D25026 llvm-svn: 285619
* Second attempt at r285517.Dorit Nuzman2016-10-311-11/+65
| | | | llvm-svn: 285568
* Revert r285517 due to build failures.Dorit Nuzman2016-10-301-59/+1
| | | | llvm-svn: 285518
* [LoopVectorize] Make interleaved-accesses analysis less conservative aboutDorit Nuzman2016-10-301-1/+59
| | | | | | | | | | | | | | | | | | | | | possible pointer-wrap-around concerns, in some cases. Before this patch, collectConstStridedAccesses (part of interleaved-accesses analysis) called getPtrStride with [Assume=false, ShouldCheckWrap=true] when examining all candidate pointers. This is too conservative. Instead, this patch makes collectConstStridedAccesses use an optimistic approach, calling getPtrStride with [Assume=true, ShouldCheckWrap=false], and then, once the candidate interleave groups have been formed, revisits the pointer-wrapping analysis but only where it matters: namely, in groups that have gaps, and where the gaps are not at the very end of the group (in which case the loop is peeled). This second time getPtrStride is called with [Assume=false, ShouldCheckWrap=true], but this could further be improved to using Assume=true, once we also add the logic to track that we are not going to meet the scev runtime checks threshold. Differential Revision: https://reviews.llvm.org/D25276 llvm-svn: 285517
* [ThinLTO] Use per-summary flag to prevent exporting locals used in inline asmTeresa Johnson2016-10-301-8/+0
| | | | | | | | | | | | | | | | | | | | | Summary: Instead of using the workaround of suppressing the entire index for modules that call inline asm that may reference locals, use the NoRename flag on the summary for any locals in the llvm.used set, and add a reference edge from any functions containing inline asm. This avoids issues from having no summaries despite the module defining global values, which was preventing more aggressive index-based optimization. It will be followed by a subsequent patch to make a similar fix for local references in module level asm (to fix PR30610). Reviewers: mehdi_amini Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D26121 llvm-svn: 285513
* [ThinLTO] Rename doPromoteLocalToGlobal to shouldPromoteLocalToGlobal (NFC)Teresa Johnson2016-10-291-3/+3
| | | | | | Rename as suggested in code review for D26063. llvm-svn: 285508
* [ThinLTO] Use NoPromote flag in summary during promotionTeresa Johnson2016-10-291-13/+19
| | | | | | | | | | | | | | | | Summary: Replace the check of whether a GV has a section with the flag check in the summary. This is in preparation for using the NoPromote flag to convey other situations when we can't promote (e.g. locals used in inline asm). Reviewers: mehdi_amini Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D26063 llvm-svn: 285507
* [InstCombine] re-use bitcasted compare operands in selects (PR28001)Sanjay Patel2016-10-291-0/+50
| | | | | | | | | | | These mixed bitcast patterns show up with SSE/AVX intrinsics because we bitcast function parameters to <2 x i64>. The bitcasts obfuscate the expected min/max forms as shown in PR28001: https://llvm.org/bugs/show_bug.cgi?id=28001#c6 Differential Revision: https://reviews.llvm.org/D25943 llvm-svn: 285495
OpenPOWER on IntegriCloud