summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Transforms
Commit message (Collapse)AuthorAgeFilesLines
* Fix a typo.Evgeniy Stepanov2016-10-311-1/+1
| | | | | | Found with PVS-Studio here: http://www.viva64.com/en/b/0446/ llvm-svn: 285652
* [asan] Move instrumented null-terminated strings to a special section, LLVM partKuba Brecka2016-10-311-0/+8
| | | | | | | | On Darwin, simple C null-terminated constant strings normally end up in the __TEXT,__cstring section of the resulting Mach-O binary. When instrumented with ASan, these strings are transformed in a way that they cannot be in __cstring (the linker unifies the content of this section and strips extra NUL bytes, which would break instrumentation), and are put into a generic __const section. This breaks some of the tools that we have: Some tools need to scan all C null-terminated strings in Mach-O binaries, and scanning all the contents of __const has a large performance penalty. This patch instead introduces a special section, __asan_cstring which will now hold the instrumented null-terminated strings. Differential Revision: https://reviews.llvm.org/D25026 llvm-svn: 285619
* Second attempt at r285517.Dorit Nuzman2016-10-311-11/+65
| | | | llvm-svn: 285568
* Revert r285517 due to build failures.Dorit Nuzman2016-10-301-59/+1
| | | | llvm-svn: 285518
* [LoopVectorize] Make interleaved-accesses analysis less conservative aboutDorit Nuzman2016-10-301-1/+59
| | | | | | | | | | | | | | | | | | | | | possible pointer-wrap-around concerns, in some cases. Before this patch, collectConstStridedAccesses (part of interleaved-accesses analysis) called getPtrStride with [Assume=false, ShouldCheckWrap=true] when examining all candidate pointers. This is too conservative. Instead, this patch makes collectConstStridedAccesses use an optimistic approach, calling getPtrStride with [Assume=true, ShouldCheckWrap=false], and then, once the candidate interleave groups have been formed, revisits the pointer-wrapping analysis but only where it matters: namely, in groups that have gaps, and where the gaps are not at the very end of the group (in which case the loop is peeled). This second time getPtrStride is called with [Assume=false, ShouldCheckWrap=true], but this could further be improved to using Assume=true, once we also add the logic to track that we are not going to meet the scev runtime checks threshold. Differential Revision: https://reviews.llvm.org/D25276 llvm-svn: 285517
* [ThinLTO] Use per-summary flag to prevent exporting locals used in inline asmTeresa Johnson2016-10-301-8/+0
| | | | | | | | | | | | | | | | | | | | | Summary: Instead of using the workaround of suppressing the entire index for modules that call inline asm that may reference locals, use the NoRename flag on the summary for any locals in the llvm.used set, and add a reference edge from any functions containing inline asm. This avoids issues from having no summaries despite the module defining global values, which was preventing more aggressive index-based optimization. It will be followed by a subsequent patch to make a similar fix for local references in module level asm (to fix PR30610). Reviewers: mehdi_amini Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D26121 llvm-svn: 285513
* [ThinLTO] Rename doPromoteLocalToGlobal to shouldPromoteLocalToGlobal (NFC)Teresa Johnson2016-10-291-3/+3
| | | | | | Rename as suggested in code review for D26063. llvm-svn: 285508
* [ThinLTO] Use NoPromote flag in summary during promotionTeresa Johnson2016-10-291-13/+19
| | | | | | | | | | | | | | | | Summary: Replace the check of whether a GV has a section with the flag check in the summary. This is in preparation for using the NoPromote flag to convey other situations when we can't promote (e.g. locals used in inline asm). Reviewers: mehdi_amini Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D26063 llvm-svn: 285507
* [InstCombine] re-use bitcasted compare operands in selects (PR28001)Sanjay Patel2016-10-291-0/+50
| | | | | | | | | | | These mixed bitcast patterns show up with SSE/AVX intrinsics because we bitcast function parameters to <2 x i64>. The bitcasts obfuscate the expected min/max forms as shown in PR28001: https://llvm.org/bugs/show_bug.cgi?id=28001#c6 Differential Revision: https://reviews.llvm.org/D25943 llvm-svn: 285495
* Don't leave unused divs/rems sitting around in BypassSlowDivision.Justin Lebar2016-10-281-0/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This "pass" eagerly creates div and rem instructions even when only one is needed -- it relies on a later pass (machine DCE?) to clean them up. This is problematic not just from a cleanliness perspective (this pass is running during CodeGenPrepare, so should leave the IR in a better state), but it also creates a problem for instruction selection. If we always have a div+rem, isel will always select a divrem instruction (if possible), even when a single div or rem would do. Specifically, in NVPTX, we want to compute rem from the output of div, if available. But if a div is not available, we want to leave the rem alone. This transformation is overeager if div is always available. Because this code runs as part of CodeGenPrepare, it's nontrivial to write a test for this change. But this will effectively be tested by a later patch which adds the aforementioned change to NVPTX isel. Reviewers: tra Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D26088 llvm-svn: 285460
* Don't claim the udiv created in BypassSlowDivision is exact.Justin Lebar2016-10-281-2/+1
| | | | | | | | | | | | | | | | | | | Summary: In BypassSlowDivision's short-dividend path, we would create e.g. udiv exact i32 %a, %b "exact" here means that we are asserting that %a is a multiple of %b. But we have no reason to believe this must be true -- this is just a bug, as far as I can tell. Reviewers: tra Subscribers: jholewinski, llvm-commits Differential Revision: https://reviews.llvm.org/D26097 llvm-svn: 285459
* SpeculativeExecution: Allow speculating more inst typesMatt Arsenault2016-10-281-0/+18
| | | | | | | Partial step towards removing the whitelist and only using TTI's cost. llvm-svn: 285438
* [MemorySSA] Add const to getClobberingMemoryAccess.George Burgess IV2016-10-281-3/+3
| | | | | | | | Thanks to bryant for the patch! Differential Revision: https://reviews.llvm.org/D26086 llvm-svn: 285432
* [LCSSA] Perform LCSSA verification only for the current loop nest.Igor Laevsky2016-10-282-4/+30
| | | | | | | | | Now LPPassManager will run LCSSA verification only for the top-level loop which was processed on the current iteration. Differential Revision: https://reviews.llvm.org/D25873 llvm-svn: 285394
* [Reassociate] Removing instructions mutates the IR.Davide Italiano2016-10-281-1/+3
| | | | | | | | | Fixes PR 30784. Discussed with Justin, who pointed out that in the new PassManager infrastructure we can have more fine-grained control on which analyses we want to preserve, but this is the best we can do with the current infrastructure. llvm-svn: 285380
* [ThinLTO] Rename HasSection to NoRename (NFC)Teresa Johnson2016-10-281-2/+3
| | | | | | | | | | | | | | Summary: This is in preparation for a change to utilize this flag for symbols referenced/defined in either inline or module level assembly. Reviewers: mehdi_amini Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D26048 llvm-svn: 285376
* [InstCombine] fix foldSPFofSPF() to handle vector splatsSanjay Patel2016-10-271-22/+18
| | | | llvm-svn: 285345
* [LoopUnroll] Check partial unrolling is enabled before initialization. NFC.Haicheng Wu2016-10-271-2/+2
| | | | | | Differential Revision: https://reviews.llvm.org/D23891 llvm-svn: 285330
* [InstCombine] handle simple vector integer constants in IsFreeToInvertSanjay Patel2016-10-271-0/+18
| | | | llvm-svn: 285318
* Add Loop Sink pass to reverse the LICM based of basic block frequency.Dehao Chen2016-10-274-14/+337
| | | | | | | | | | | | Summary: LICM may hoist instructions to preheader speculatively. Before code generation, we need to sink down the hoisted instructions inside to loop if it's beneficial. This pass is a reverse of LICM: looking at instructions in preheader and sinks the instruction to basic blocks inside the loop body if basic block frequency is smaller than the preheader frequency. Reviewers: hfinkel, davidxl, chandlerc Subscribers: anna, modocache, mgorny, beanz, reames, dberlin, chandlerc, mcrosier, junbuml, sanjoy, mzolotukhin, llvm-commits Differential Revision: https://reviews.llvm.org/D22778 llvm-svn: 285308
* [SLP] Fix for PR30626: Compiler crash inside SLP Vectorizer.Alexey Bataev2016-10-271-4/+11
| | | | | | | | | | | | | | | | After successfull horizontal reduction vectorization attempt for PHI node vectorizer tries to update root binary op by combining vectorized tree and the ReductionPHI node. But during vectorization this ReductionPHI can be vectorized itself and replaced by the `undef` value, while the instruction itself is marked for deletion. This 'marked for deletion' PHI node then can be used in new binary operation, causing "Use still stuck around after Def is destroyed" crash upon PHI node deletion. Also the test is fixed to make it perform actual testing. Differential Revision: https://reviews.llvm.org/D25671 llvm-svn: 285286
* Introduce updateDiscriminator interface to DILocation to make it cleaner ↵Dehao Chen2016-10-261-30/+7
| | | | | | | | | | | | | | assigning discriminators. Summary: This patch introduces updateDiscriminator to DILocation so that it can be directly called by AddDiscriminator. It also makes it easier to update the discriminator later. Reviewers: dnovillo, dblaikie, aprantl, echristo Subscribers: mehdi_amini, llvm-commits Differential Revision: https://reviews.llvm.org/D25959 llvm-svn: 285207
* [InstCombine] clean up commonCastTransforms; NFCSanjay Patel2016-10-261-11/+9
| | | | | | | | 1. Use 'auto' with dyn_cast. 2. Variables start with a capital letter. 3. Use proper punctuation in comments. llvm-svn: 285200
* [IndVarSimplify][DebugLoc] When widening the exit loop condition, correctly ↵Andrea Di Biagio2016-10-261-0/+5
| | | | | | | | | | | | | | | reuse the debug location of the original comparison. When the loop exit condition is canonicalized as a != compaison, reuse the debug location of the original (non canonical) comparison. Before this patch, the debug location of the new icmp was obtained from the loop latch terminator. This patch fixes the issue by correctly setting the IRBuilder's "current debug location" to the location of the original compare. Differential Revision: https://reviews.llvm.org/D25953 llvm-svn: 285185
* Cloning: Also clone global variable attached metadata.Peter Collingbourne2016-10-261-0/+5
| | | | llvm-svn: 285161
* Utility functions for appending to llvm.used/llvm.compiler.used.Evgeniy Stepanov2016-10-253-48/+48
| | | | llvm-svn: 285143
* [PGO] Fix select instruction annotationRong Xu2016-10-251-4/+13
| | | | | | | | | | | | | | | | Summary: Select instruction annotation in IR PGO uses the edge count to infer the branch count. It's currently placed in setInstrumentedCounts() where no all the BB counts have been computed. This leads to wrong branch weights. Move the annotation after all BB counts are populated. Reviewers: davidxl Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D25961 llvm-svn: 285128
* [InstCombine] Resubmit the combine of A->B->A BitCast and fix for pr27996Guozhi Wei2016-10-252-0/+128
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The original patch of the A->B->A BitCast optimization was reverted by r274094 because it may cause infinite loop inside compiler https://llvm.org/bugs/show_bug.cgi?id=27996. The problem is with following code xB = load (type B); xA = load (type A); +yA = (A)xB; B -> A +zAn = PHI[yA, xA]; PHI +zBn = (B)zAn; // A -> B store zAn; store zBn; optimizeBitCastFromPhi generates +zBn = (B)zAn; // A -> B and expects it will be combined with the following store instruction to another store zAn Unfortunately before combineStoreToValueType is called on the store instruction, optimizeBitCastFromPhi is called on the new BitCast again, and this pattern repeats indefinitely. optimizeBitCastFromPhi only generates BitCast for load/store instructions, only the BitCast before store can cause the reexecution of optimizeBitCastFromPhi, and BitCast before store can easily be handled by InstCombineLoadStoreAlloca.cpp. So the solution to the problem is if all users of a CI are store instructions, we should not do optimizeBitCastFromPhi on it. Then optimizeBitCastFromPhi will not be called on the new BitCast instructions. Differential Revision: https://reviews.llvm.org/D23896 llvm-svn: 285116
* [InstCombine] Ensure that truncated int types are legal.Sanjay Patel2016-10-251-4/+2
| | | | | | | | | | Fixes the FIXMEs in D25952 and rL285075. Patch by bryant! Differential Revision: https://reviews.llvm.org/D25955 llvm-svn: 285108
* [LV] Sink scalar operands of predicated instructionsMatthew Simpson2016-10-251-13/+78
| | | | | | | | | | | | | When we predicate an instruction (div, rem, store) we place the instruction in its own basic block within the vectorized loop. If a predicated instruction has scalar operands, it's possible to recursively sink these scalar expressions into the predicated block so that they might avoid execution. This patch sinks as much scalar computation as possible into predicated blocks. We previously were able to sink such operands only if they were extractelement instructions. Differential Revision: https://reviews.llvm.org/D25632 llvm-svn: 285097
* Add -strip-nonlinetable-debuginfo capabilityMichael Ilseman2016-10-253-0/+44
| | | | | | | | | | | | | | | | | | | | | | | | This adds a new function to DebugInfo.cpp that takes an llvm::Module as input and removes all debug info metadata that is not directly needed for line tables, thus effectively stripping all type and variable information from the module. The primary motivation for this feature was the bitcode work flow (cf. http://lists.llvm.org/pipermail/llvm-dev/2016-June/100643.html for more background). This is not wired up yet, but will be in subsequent patches. For testing, the new functionality is exposed to opt with a -strip-nonlinetable-debuginfo option. The secondary use-case (and one that works right now!) is as a reduction pass in bugpoint. I added two new bugpoint options (-disable-strip-debuginfo and -disable-strip-debug-types) to control the new features. By default it will first attempt to remove all debug information, then only the type info, and then proceed to hack at any remaining MDNodes. Thanks to Adrian Prantl for stewarding this patch! llvm-svn: 285094
* Fix 80-char violations. NFC.Michael Kuperstein2016-10-251-5/+10
| | | | llvm-svn: 285092
* Move discriminator assignment to where it is used. (NFC)Dehao Chen2016-10-251-1/+1
| | | | llvm-svn: 285084
* [IndVarSimplify][Dwarf] When widening the IV increment, correctly set the ↵Andrea Di Biagio2016-10-251-0/+5
| | | | | | | | | | | | | | | debug loc. When indvars widened an induction variable, the debug location for the loop increment computation was incorrectly set equal to the debug loc of the loop latch terminator. This patch fixes the issue by propagating the correct location from the original loop increment instruction to the new widened increment. Differential Revision: https://reviews.llvm.org/D25872 llvm-svn: 285083
* [EarlyCSE] Make MemorySSA memory dependency check more aggressive.Geoff Berry2016-10-251-16/+6
| | | | | | | | | | Now that MemorySSA keeps track of whether MemoryUses are optimized, use getClobberingMemoryAccess() to check MemoryUse memory dependencies since it should no longer be so expensive. This is a follow-up change to https://reviews.llvm.org/D25881 llvm-svn: 285080
* fix formatting; NFCSanjay Patel2016-10-251-13/+13
| | | | llvm-svn: 285078
* [InstCombine] add test and code comment to show potentially misguided icmp ↵Sanjay Patel2016-10-251-0/+3
| | | | | | trunc transform llvm-svn: 285075
* GlobalDCE: Restore a statement accidentally removed in r285048.Peter Collingbourne2016-10-251-0/+1
| | | | llvm-svn: 285052
* GlobalDCE: Deduplicate code. NFCI.Peter Collingbourne2016-10-251-35/+18
| | | | llvm-svn: 285048
* Merge two if conditions into one. NFCI.Davide Italiano2016-10-241-3/+2
| | | | llvm-svn: 285008
* add-discriminators: Fix handling of lexical scopes.Adrian Prantl2016-10-241-9/+13
| | | | | | | | | | | | | | | This fixes a bug in the handling of lexical scopes, when more than one scope is defined on the same line or functions are inlined into call sites that are on the same line as the function definition. This situation can easily happen in macro expansions. The problem is solved by introducing a SmallDenseMap<DIScope *, DILexicalBlockFile *, 1> that keeps track of all the different lexical scopes that share a line/file location. Fixes PR30681. llvm-svn: 284998
* Check the number of Args in LibCallsShrinkWrap.Rong Xu2016-10-241-0/+2
| | | | | | Some library fucntions can have no argument. llvm-svn: 284989
* [EarlyCSE] Optimize MemoryPhis and reduce memory clobber queries w/ MemorySSAGeoff Berry2016-10-241-10/+44
| | | | | | | | | | | | | | | | | | | | | Summary: When using MemorySSA, re-optimize MemoryPhis when removing a store since this may create MemoryPhis with all identical arguments. Also, when using MemorySSA to check if two MemoryUses are reading from the same version of the heap, use the defining access instead of calling getClobberingAccess, since the latter can currently result in many more AA calls. Once the MemorySSA use optimization tracking changes are done, we can remove this limitation, which should result in more loads being CSE'd. Reviewers: dberlin Subscribers: mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D25881 llvm-svn: 284984
* Revert 284971.Nico Weber2016-10-241-73/+38
| | | | | | | | | It seems to break selfhost on some bots, see e.g. http://lab.llvm.org:8011/builders/clang-x86-windows-msvc2015/builds/21 http://lab.llvm.org:8011/builders/clang-ppc64be-linux-multistage/builds/20 http://lab.llvm.org:8011/builders/clang-ppc64be-linux-lnt/builds/22 llvm-svn: 284979
* [JumpThreading] Unfold selects that depend on the same conditionPablo Barrio2016-10-241-38/+73
| | | | | | | | | | | | | | | | | | Summary: These are good candidates for jump threading. This enables later opts (such as InstCombine) to combine instructions from the selects with instructions out of the selects. SimplifyCFG will fold the select again if unfolding wasn't worth it. Patch by James Molloy and Pablo Barrio. Reviewers: reames, bkramer, mcrosier, gberry, haicheng, jmolloy, sebpop Subscribers: jojo, rengolin, llvm-commits Differential Revision: https://reviews.llvm.org/D25477 llvm-svn: 284971
* Now that VS2013 is gone, make a memoryssa structure an anonymous union againDaniel Berlin2016-10-221-4/+4
| | | | llvm-svn: 284910
* [CtorUtils] Modernize. No functional changes intended.Davide Italiano2016-10-221-5/+5
| | | | llvm-svn: 284904
* Analysis: Move llvm::getConstantRangeFromMetadata to IR library.Peter Collingbourne2016-10-212-0/+2
| | | | | | | | We're about to start using it there. Differential Revision: https://reviews.llvm.org/D25877 llvm-svn: 284865
* [StripGCRelocates] New pass to remove gc.relocates added by RS4GCAnna Thomas2016-10-213-0/+82
| | | | | | | | | | | | | | | | | | | | | | | | Summary: Utility pass to remove gc.relocates created by rewrite statepoints for GC. With respect to safepoint verification, the IR generated would be incorrect, and cannot run as such. This would be a single transformation on the final optimized IR. The benefit of the pass is for easy analysis when the IRs are 'polluted' by too many gc.relocates. Added tests. test run: All RS4GC tests with -verify option. Local downstream tests on large IR files. This also works when the pointer being gc.relocated is another gc.relocate. Reviewers: sanjoy, reames Subscribers: beanz, mgorny, llvm-commits Differential Revision: https://reviews.llvm.org/D25096 llvm-svn: 284855
* [LoopUnroll] Keep the loop test only on the first iteration of max-or-zero loopsJohn Brawn2016-10-212-16/+25
| | | | | | | | | | | | | | | | When we have a loop with a known upper bound on the number of iterations, and furthermore know that either the number of iterations will be either exactly that upper bound or zero, then we can fully unroll up to that upper bound keeping only the first loop test to check for the zero iteration case. Most of the work here is in plumbing this 'max-or-zero' information from the part of scalar evolution where it's detected through to loop unrolling. I've also gone for the safe default of 'false' everywhere but howManyLessThans which could probably be improved. Differential Revision: https://reviews.llvm.org/D25682 llvm-svn: 284818
OpenPOWER on IntegriCloud