path: root/llvm/lib/Transforms
Commit message | Author | Age | Files | Lines
* [DAE] don't remove args of musttail target/caller | Reid Kleckner | 2018-03-01 | 1 | -3/+37
| | | | | | | | | | | | | `musttail` requires identical signatures of caller and callee. Removing arguments breaks `musttail` semantics. PR36441 Patch by Fedor Indutny Differential Revision: https://reviews.llvm.org/D43708 llvm-svn: 326394
* [InstCombine] simplify code for X * -1.0 --> -X; NFC | Sanjay Patel | 2018-02-28 | 1 | -7/+3
| | | | | | I've added random FMF to one of the tests to show those are propagated. llvm-svn: 326377
* [GlobalOpt] don't change CC of musttail calle(e|r) | Jonas Devlieghere | 2018-02-28 | 1 | -1/+24
| | | | | | | | | | | | | | | When a function contains a musttail call, its calling convention (cc) is fixed to match the cc of the musttail callee. In that case (and in the case of the musttail callee itself), GlobalOpt should not change the cc to fastcc, as doing so would break the invariant. This fixes PR36546 Patch by: Fedor Indutny (indutny) Differential revision: https://reviews.llvm.org/D43859 llvm-svn: 326376
* [InstCombine] Split the FP constant code out of lookThroughFPExtensions and use nullptr as a sentinel | Craig Topper | 2018-02-28 | 1 | -15/+20
| | | | | | | | | | | | | | Currently this code's control flow very much assumes that there are no meaningful checks after determining that it's a ConstantFP. So whenever it wants to stop it just does "return V". But V is also the variable name it uses when it wants to return a new value. So 'return V' appears multiple times with different meanings. This patch just moves all the code into a helper function and returns nullptr when it wants to stop. I've split this from D43774 while I try to figure out how to best handle the vector case there. But this change by itself at least seemed like a readability improvement. Differential Revision: https://reviews.llvm.org/D43833 llvm-svn: 326361
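The sentinel idea in this message can be sketched outside LLVM. The following is a minimal illustration only: the `Node` type and the `shrinkConstant` and `lookThrough` helpers are hypothetical names invented for this example, not LLVM APIs. The helper returns `nullptr` to mean "stop, nothing to simplify", so a non-null result always means a new value.

```cpp
#include <cassert>

// Hypothetical stand-in for an IR value; not an LLVM type.
struct Node {
  bool isConstant;
  double value;
};

// Returns a pointer to a narrowed constant written into `storage`,
// or nullptr when no narrowing applies. Using nullptr as the sentinel
// keeps "stop" distinct from "here is a new value".
const Node *shrinkConstant(const Node &in, Node &storage) {
  if (!in.isConstant)
    return nullptr; // not a constant: nothing to do
  float narrowed = static_cast<float>(in.value);
  if (static_cast<double>(narrowed) != in.value)
    return nullptr; // narrowing to float would lose precision
  storage = Node{true, static_cast<double>(narrowed)};
  return &storage;
}

// The caller falls back to the original value in exactly one place.
const Node *lookThrough(const Node &in, Node &storage) {
  if (const Node *shrunk = shrinkConstant(in, storage))
    return shrunk;
  return &in;
}
```

With this shape, every early exit in the helper reads the same way, and only the wrapper decides what "no change" means for its caller.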
* [InstrProfiling] Emit the runtime hook when no counters are lowered | Vedant Kumar | 2018-02-28 | 1 | -12/+13
| | | | | | | | | | | | | | | | | | | | | | The API verification tool tapi has difficulty processing frameworks which enable code coverage, but which have no code. The profile lowering pass does not emit the runtime hook in this case because no counters are lowered. While the hook is not needed for program correctness (the profile runtime doesn't have to be linked in), it's needed to allow tapi to validate the exported symbol set of instrumented binaries. It was not possible to add a workaround in tapi for empty binaries due to an architectural issue: tapi generates its expected symbol set before it inspects a binary. Changing that model has a higher cost than simply forcing llvm to always emit the runtime hook. rdar://36076904 Differential Revision: https://reviews.llvm.org/D43794 llvm-svn: 326350
* [InstCombine] move invariant call out of loop; NFC | Sanjay Patel | 2018-02-28 | 1 | -4/+4
| | | | | | We really shouldn't need a 2-loop here at all, but that's another cleanup. llvm-svn: 326330
* [InstCombine] move constant check into foldBinOpIntoSelectOrPhi; NFCI | Sanjay Patel | 2018-02-28 | 6 | -34/+29
| | | | | | | | Also, rename 'foldOpWithConstantIntoOperand' because that's annoyingly vague. The constant check is redundant in some cases, but it allows removing duplication for most of the calls. llvm-svn: 326329
* Fix typo. NFC | Xin Tong | 2018-02-28 | 1 | -1/+1
| | | | llvm-svn: 326319
* [MergeICmp] Fix a bug in MergeICmp that can lead to a block being processed more than once. | Xin Tong | 2018-02-28 | 1 | -0/+13
| | | | | | | | | | | | | | | | | | | | | Summary: Fix a bug in MergeICmp that can lead to a BCECmp block being processed more than once and eventually lead to a broken LLVM module. The problem is that if the non-constant value is not produced by the last block, the producer will be processed once when its parent block is processed and a second time when the last block is processed. We end up with two identical BCECmpBlocks in the merge queue, which eventually leads to a broken LLVM module. Reviewers: courbet, davide Reviewed By: courbet Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D43825 llvm-svn: 326318
* [Dominators] Remove verifyDomTree and add some verifying for Post Dom Trees | David Green | 2018-02-28 | 6 | -19/+10
| | | | | | | | | | | | Removes verifyDomTree, using assert(verify()) everywhere instead, and changes verify a little to always run IsSameAsFreshTree first in order to print good output when we find errors. Also adds verifyAnalysis for PostDomTrees, which will allow checking of PostDomTrees in the same way we check DomTrees and MachineDomTrees. Differential Revision: https://reviews.llvm.org/D41298 llvm-svn: 326315
* [NewGVN] Update phi-of-ops def block when updating existing ValuePHI. | Florian Hahn | 2018-02-27 | 1 | -0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | In case we update a ValuePHI node created earlier, we could update it based on a different OpPHI which could be in a different block. We need to update the TempToBlock mapping reflecting the new block, otherwise we would end up placing the new phi node in a wrong block. This problem is exposed by the test case in https://bugs.llvm.org/show_bug.cgi?id=36504. This patch fixes a slightly simpler problem than in the bug report. In the bug's reproducer, the additional problem is that we are re-using a ValuePHI node with too few incoming values for the new OpPHI. If this patch makes sense, I will follow it up with a patch that creates a new PHI node if the existing PHI node has a different number of incoming values. Reviewers: davide, dberlin Reviewed By: dberlin Differential Revision: https://reviews.llvm.org/D43770 llvm-svn: 326181
* [InstCombine] allow fdiv folds with less than fully 'fast' ops | Sanjay Patel | 2018-02-26 | 1 | -13/+3
| | | | | | | | | | | | | | | | | | Note: gcc appears to allow this fold with -freciprocal-math alone, but clang/llvm require more than that with this patch. The wording in the definitions seems fuzzy enough that it could go either way, but we'll err on the conservative side of FMF interpretation. This patch also changes the newly created fmul to have FMF propagated by the last fdiv rather than intersecting the FMF of the fdivs. This matches the behavior of other folds near here. The new fmul is only used to produce an intermediate op for the final fdiv result, so it shouldn't be any stricter than that result. The previous behavior could result in dropping FMF via other folds in instcombine or CSE. Differential Revision: https://reviews.llvm.org/D43398 llvm-svn: 326098
* [LV] Move isLegalMasked* functions from Legality to CostModel | Renato Golin | 2018-02-26 | 1 | -101/+124
| | | | | | | | | | | | | | | | | | | | | | | All SIMD architectures can emulate masked load/store/gather/scatter through element-wise condition check, scalar load/store, and insert/extract. Therefore, bailing out of vectorization as legality failure, when they return false, is incorrect. We should proceed to cost model and determine profitability. This patch is to address the vectorizer's architectural limitation described above. As such, I tried to keep the cost model and vectorize/don't-vectorize behavior nearly unchanged. Cost model tuning should be done separately. Please see http://lists.llvm.org/pipermail/llvm-dev/2018-January/120164.html for RFC and the discussions. Closes D43208. Patch by: Hideki Saito <hideki.saito@intel.com> llvm-svn: 326079
* [LoopInterchange] Loops with empty dependency matrix are safe. | Florian Hahn | 2018-02-26 | 1 | -3/+0
| | | | | | | | | | | | | | | | | | | | The dependency matrix is only empty if no conflicting load/store instructions have been found. In that case, it is safe to interchange. For the LLVM test-suite, after this change around 1900 loops are interchanged, whereas it is 15 before this change. On cortex-a57, this gives an improvement of -0.57% on the geomean execution time of SPEC2006, SPEC2000 and the test-suite. There are a few small perf regressions, but I think we can improve on those by making the cost model better. Reviewers: karthikthecool, mcrosier Reviewed by: karthikthecool Differential Revision: https://reviews.llvm.org/D43236 llvm-svn: 326077
* Revert "StructurizeCFG: Test for branch divergence correctly" | Adam Nemet | 2018-02-24 | 1 | -12/+3
| | | | | | | | This reverts commit r325881. Breaks many bots llvm-svn: 326037
* [InstCombine] simplify code for fabs(X) * fabs(X) -> X * X; NFC | Sanjay Patel | 2018-02-23 | 1 | -13/+4
| | | | llvm-svn: 325968
* [InstSimplify] sqrt(X) * sqrt(X) --> X | Sanjay Patel | 2018-02-23 | 1 | -4/+0
| | | | | | This was misplaced in InstCombine. We can loosen the FMF as a follow-up step. llvm-svn: 325965
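A small sketch of why a fold like sqrt(X) * sqrt(X) --> X depends on fast-math flags: under strict IEEE-754 double arithmetic the product is not always equal to X, and it is NaN for negative X. The helper name `sqrtSquared` is invented for this illustration, and the checks assume the code is compiled without -ffast-math.

```cpp
#include <cassert>
#include <cmath>

// Computes sqrt(x) * sqrt(x) with ordinary double arithmetic.
// Each step rounds, so the result can differ from x in the last bit.
double sqrtSquared(double x) {
  double root = std::sqrt(x);
  return root * root;
}
```

For a perfect square such as 4.0 the product is exact, for 2.0 it is off by one unit in the last place, and for -1.0 it is NaN, which is the kind of difference the flags give the optimizer permission to ignore.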
* [InstCombine] allow fmul-sqrt folds with less than full -ffast-math | Sanjay Patel | 2018-02-23 | 1 | -15/+8
| | | | | | Also, add a Builder method for intrinsics to reduce code duplication for clients. llvm-svn: 325960
* [Debug] Add dbg.value intrinsics for PHIs created during LCSSA. | Matt Davis | 2018-02-23 | 2 | -4/+12
| Summary: This patch is an enhancement to propagate dbg.value information when Phis are created on behalf of LCSSA. I noticed a case where a value carried across a loop was reported as <optimized out>. Specifically this case:
```
int bar(int x, int y) {
  return x + y;
}

int foo(int size) {
  int val = 0;
  for (int i = 0; i < size; ++i) {
    val = bar(val, i); // Both val and i are correct
  }
  return val; // <optimized out>
}
```
| In the above case, after all of the interesting computation completes our value is reported as "optimized out." This change will add a dbg.value to correct this.
| This patch also moves the dbg.value insertion routine from LoopRotation.cpp into Local.cpp, so that we can share it in both places (LoopRotation and LCSSA).
| Reviewers: mzolotukhin, aprantl, vsk, davide Reviewed By: aprantl, vsk Subscribers: dberlin, llvm-commits Differential Revision: https://reviews.llvm.org/D42551 llvm-svn: 325926
* [InstCombine] refactor fmul with negated op folds; NFCI | Sanjay Patel | 2018-02-23 | 1 | -24/+18
| | | | | | | | | | | | | | The existing code was inefficiently looking for 'nsz' variants. That's unnecessary because we canonicalize those to the expected form with -0.0. We may also want to adjust or remove the fold that sinks negation. We don't do that for fdiv (or integer ops?). That should be uniform? It may also lead to missed optimization as in PR21914: https://bugs.llvm.org/show_bug.cgi?id=21914 ...or we just have to fix other passes to avoid that problem. llvm-svn: 325924
* [InstCombine] use FMF-copying functions to reduce code; NFCI | Sanjay Patel | 2018-02-23 | 1 | -28/+12
| | | | llvm-svn: 325923
* StructurizeCFG: Test for branch divergence correctly | Nicolai Haehnle | 2018-02-23 | 1 | -3/+12
| | | | | | | | | | | | | | | | | | | | | | Summary: This fixes cases like the new test @nonuniform. In that test, %cc itself is a uniform value; however, when reading it after the end of the loop in basic block %if, its value is effectively non-uniform. This problem was encountered in https://bugs.freedesktop.org/show_bug.cgi?id=103743; however, this change in itself is not sufficient to fix that bug, as there is another issue in the AMDGPU backend. Change-Id: I32bbffece4a32f686fab54964dae1a5dd72949d4 Reviewers: arsenm, rampitec, jlebar Subscribers: wdng, tpr, llvm-commits Differential Revision: https://reviews.llvm.org/D40546 llvm-svn: 325881
* Mark MergedLoadStoreMotion as not preserving MemDep results | Bjorn Steinbrink | 2018-02-23 | 1 | -41/+8
| | | | | | | | | | | | | | | | | | Summary: MemDep caches results that signify that a dependence is non-local, and there is currently no way to invalidate such cache entries. Unfortunately, when MLSM sinks a store, that can result in a non-local dependence becoming a local one, and MemDep then gives wrong answers. The easiest way out here is to just say that MLSM does indeed not preserve MemDep results. Reviewers: davide, Gerolf Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D43177 llvm-svn: 325880
* Update comment for whether or not we can optimize an alias - we're checking the alias and not the aliasee. | Eric Christopher | 2018-02-22 | 1 | -1/+1
| | | | | If the alias can be interposed then we shouldn't do anything. llvm-svn: 325837
* Fix DataFlowSanitizer instrumentation pass to take parameter position changes into account for custom functions. | Peter Collingbourne | 2018-02-22 | 1 | -12/+89
| | | | | | | | | | | | | | When DataFlowSanitizer transforms a call to a custom function, the new call has extra parameters. The attributes on parameters must be updated to take the new position of each parameter into account. Patch by Sam Kerner! Differential Revision: https://reviews.llvm.org/D43132 llvm-svn: 325820
* [AlignmentFromAssumptions] Set source and dest alignments of memory intrinsics separately | Daniel Neilson | 2018-02-22 | 1 | -44/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This change is part of step five in the series of changes to remove alignment argument from memcpy/memmove/memset in favour of alignment attributes. In particular, this changes the AlignmentFromAssumptions pass to cease using the old getAlignment()/setAlignment() API of MemoryIntrinsic in favour of getting/setting source & dest specific alignments through the new API. This allows us to simplify some of the code in this pass and also be more aggressive about setting the source and destination alignments separately. Steps: Step 1) Remove alignment parameter and create alignment parameter attributes for memcpy/memmove/memset. ( rL322965, rC322964, rL322963 ) Step 2) Expand the IRBuilder API to allow creation of memcpy/memmove with differing source and dest alignments. ( rL323597 ) Step 3) Update Clang to use the new IRBuilder API. ( rC323617 ) Step 4) Update Polly to use the new IRBuilder API. ( rL323618 ) Step 5) Update LLVM passes that create memcpy/memmove calls to use the new IRBuilder API, and those that use MemIntrinsicInst::[get|set]Alignment() to use [get|set]DestAlignment() and [get|set]SourceAlignment() instead. ( rL323886, rL323891, rL324148, rL324273, rL324278, rL324384, rL324395, rL324402, rL324626, rL324642, rL324653, rL324654, rL324773, rL324774, rL324781, rL324784, rL324955, rL324960 ) Step 6) Remove the single-alignment IRBuilder API for memcpy/memmove, and the MemIntrinsicInst::[get|set]Alignment() methods. Reference http://lists.llvm.org/pipermail/llvm-dev/2015-August/089384.html http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151109/312083.html Reviewers: hfinkel, bollu, reames Reviewed By: reames Subscribers: reames, llvm-commits Differential Revision: https://reviews.llvm.org/D43081 llvm-svn: 325816
* [FunctionAttrs][ArgumentPromotion][GlobalOpt] Disable some optimisation passes for naked functions | Luke Cheeseman | 2018-02-22 | 3 | -2/+15
| | | | | | | | | | | | | | - Fix for bug 36078. - Prevent the functionattrs, function-attrs, globalopt and argpromotion passes from changing naked functions. - These passes can perform some alterations to the functions that should not be applied. An example is removing parameters that are seemingly not used because they are only referenced in the inline assembly. Another example is marking the function as fastcc. llvm-svn: 325788
* [SampleProf] NFC. Expose reusable functionality in SampleProfile. | Mircea Trofin | 2018-02-22 | 1 | -29/+9
| | | | | | | | | | | | | | | | | | Summary: Exposing getOffset and findFunctionSamples as members of SampleProfile. They are intimately tied to design choices of the sample profile format - using offsets instead of line numbers, and traversing inlined functions stack, respectively. Reviewers: davidxl Reviewed By: davidxl Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D43605 llvm-svn: 325747
* [Utils] Avoid a hash table lookup in salvageDI, NFC | Vedant Kumar | 2018-02-22 | 1 | -0/+5
| | | | | | | | | | | According to the current coverage report salvageDebugInfo() is called 5.12 million times during testing and almost always returns early. The early return depends on LocalAsMetadata::getIfExists returning null, which involves a DenseMap lookup in an LLVMContextImpl. We can probably speed this up by simply checking the IsUsedByMD bit in Value. llvm-svn: 325738
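The speed-up described here follows a common pattern: test a cheap per-object flag before paying for a hash-table lookup. The sketch below is a hypothetical model of that pattern; `Value`, `Context`, `getIfExists`, and `salvage` are simplified stand-ins written for this example and do not reproduce the real LLVM classes or signatures.

```cpp
#include <cassert>
#include <unordered_map>

// Hypothetical value with a one-bit "is referenced by metadata" flag.
struct Value {
  bool isUsedByMD = false;
};

// Hypothetical context that owns the value-to-metadata map and counts lookups.
struct Context {
  std::unordered_map<const Value *, int> metadataFor;
  int lookups = 0;

  // Models the comparatively expensive hashed lookup.
  const int *getIfExists(const Value *v) {
    ++lookups;
    auto it = metadataFor.find(v);
    return it == metadataFor.end() ? nullptr : &it->second;
  }
};

// Returns true when there is metadata to salvage for `v`.
bool salvage(Context &ctx, const Value &v) {
  if (!v.isUsedByMD)
    return false; // cheap bit test: skip the hash lookup entirely
  return ctx.getIfExists(&v) != nullptr;
}
```

When almost every call takes the early return, as the message reports, the map is rarely touched at all.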
* [InstCombine] add and use Create*FMF functions; NFC | Sanjay Patel | 2018-02-21 | 1 | -15/+7
| | | | llvm-svn: 325730
* [hwasan] Fix inline instrumentation. | Evgeniy Stepanov | 2018-02-21 | 1 | -5/+19
| | | | | | | | | | | | | | | This patch changes hwasan inline instrumentation: Fixes address untagging for shadow address calculation (use 0xFF instead of 0x00 for the top byte). Emits brk instruction instead of hlt for the kernel and user space. Use 0x900 instead of 0x100 for brk immediate (0x100 - 0x800 are unavailable in the kernel). Fixes and adds appropriate tests. Patch by Andrey Konovalov. Differential Revision: https://reviews.llvm.org/D43135 llvm-svn: 325711
* [BDCE] Salvage debug info from dying insts | Vedant Kumar | 2018-02-21 | 1 | -0/+2
| | | | | | | | This results in 15 additional unique source variables in a stage2 build of FileCheck (at '-Os -g'), with a negligible increase in the size of the .debug_loc section. llvm-svn: 325660
* [InstCombine] C / -X --> -C / X | Sanjay Patel | 2018-02-21 | 1 | -8/+17
| | | | | | | | | We already do this in DAGCombiner, but it should also be good to eliminate the fsub use in IR. This is similar to rL325648. llvm-svn: 325649
* [InstCombine] -X / C --> X / -C for FP | Sanjay Patel | 2018-02-20 | 1 | -5/+12
| | | | | | | We already do this in DAGCombiner, but it should also be good to eliminate the fsub use in IR. llvm-svn: 325648
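Both negation folds (-X / C --> X / -C here, and C / -X --> -C / X in r325649) rely on the fact that IEEE-754 negation is exact and that division rounds symmetrically in sign, so moving the negation between operands does not change the result for non-NaN inputs. A minimal check in plain C++ follows; the function names are invented for this illustration.

```cpp
#include <cassert>

// Compares (-x) / c against x / (-c) using ordinary double arithmetic.
bool negatedDividendFoldAgrees(double x, double c) {
  return (-x) / c == x / (-c);
}

// Compares c / (-x) against (-c) / x using ordinary double arithmetic.
bool negatedDivisorFoldAgrees(double c, double x) {
  return c / (-x) == (-c) / x;
}
```

The comparison uses `==`, so it reports false whenever a result is NaN; the sketch is only meant to show that moving the negation onto the constant does not change ordinary results.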
* [DSE] Don't DSE stores that subsequent memmove calls read from | Sanjoy Das | 2018-02-20 | 1 | -16/+27
| | | | | | | | | | | | | | | | | | | | | | Summary: We used to remove the first memmove in cases like this: memmove(p, p+2, 8); memmove(p, p+2, 8); which is incorrect. Fix this by changing isPossibleSelfRead to what was most likely the intended behavior. Historical note: the buggy code was added in https://reviews.llvm.org/rL120974 to address PR8728. Reviewers: rsmith Subscribers: mcrosier, llvm-commits, jlebar Differential Revision: https://reviews.llvm.org/D43425 llvm-svn: 325641
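To see why the first memmove in that pair is not dead, the two calls can be run on a small buffer. This is an illustrative sketch: the 12-byte buffer, its initial contents, and the helper name `shiftBuffer` are assumptions made for the example.

```cpp
#include <array>
#include <cassert>
#include <cstring>

// Fills a 12-byte buffer with 0..11, then applies memmove(p, p+2, 8)
// the requested number of times and returns the resulting bytes.
std::array<unsigned char, 12> shiftBuffer(int times) {
  std::array<unsigned char, 12> buf{};
  for (unsigned i = 0; i < buf.size(); ++i)
    buf[i] = static_cast<unsigned char>(i);
  unsigned char *p = buf.data();
  for (int t = 0; t < times; ++t)
    std::memmove(p, p + 2, 8); // source and destination overlap
  return buf;
}
```

Because the source range overlaps the destination, the second call reads bytes written by the first: one call leaves the buffer starting with 2, two calls leave it starting with 4, so dropping the first call changes the final contents.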
* [InstCombine] remove unneeded operand swap: NFCI | Sanjay Patel | 2018-02-20 | 1 | -3/+0
| | | | | | | FMul is commutative, so complexity-based canonicalization should always take care of the swap via SimplifyAssociativeOrCommutative(). llvm-svn: 325628
* [InstCombine] remove unneeded dyn_cast to prevent unused variable warning | Sanjay Patel | 2018-02-20 | 1 | -2/+1
| | | | llvm-svn: 325597
* [InstCombine] remove compound fdiv pattern folds | Sanjay Patel | 2018-02-20 | 1 | -27/+1
| | | | | | | | | | | | | | These are fdiv-with-constant-divisor, so they already become reciprocal multiplies. The last gap for vector ops should be closed with rL325590. It's possible that we're missing folds for some edge cases with denormal intermediate constants after deleting these, but there are no tests for those patterns, and it would be better to handle denormals more consistently (and less conservatively) as noted in TODO comments. llvm-svn: 325595
* [InstCombine] fold fdiv with non-splat divisor to fmul: X/C --> X * (1/C) | Sanjay Patel | 2018-02-20 | 1 | -21/+16
| | | | llvm-svn: 325590
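As a rough illustration of when X/C and X * (1/C) can be interchanged without changing results: the two agree exactly when 1/C is itself exactly representable, which for binary floating point means C is a power of two. The sketch below uses an illustrative criterion with invented helper names; it is an assumption for this example and not necessarily the check InstCombine performs.

```cpp
#include <cassert>
#include <cmath>

// Illustrative criterion: true when c is a finite, nonzero power of two
// whose reciprocal is a normal double, so 1/c carries no rounding error.
bool hasExactNormalReciprocal(double c) {
  if (c == 0.0 || !std::isfinite(c))
    return false;
  int exponent = 0;
  double mantissa = std::frexp(c, &exponent); // c == mantissa * 2^exponent
  return std::fabs(mantissa) == 0.5 && std::isnormal(1.0 / c);
}

double divide(double x, double c) { return x / c; }

double multiplyByReciprocal(double x, double c) { return x * (1.0 / c); }
```

For a divisor such as 3.0 the reciprocal is rounded, so the multiply form can differ from the division in the last bit, which is why a fold of this kind is typically tied to an exactness check or to relaxed floating-point flags.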
* [InstCombine] use CreateWithCopiedFlags to reduce code; NFCI | Sanjay Patel | 2018-02-19 | 1 | -7/+6
| | | | | | Also, move the folds with constants closer to make it easier to follow. llvm-svn: 325541
* Revert "[mem2reg] Use range loops (NFCI)" | Brian Gesiak | 2018-02-19 | 1 | -8/+9
| | | | | | This reverts commit r325532. llvm-svn: 325539
* [InstCombine] allow fdiv with constant dividend folds with less than full -ffast-math | Sanjay Patel | 2018-02-19 | 1 | -2/+3
| | | | | | | | | | | | It's possible that we could allow this with either 'arcp' or 'reassoc' alone, but this should be conservatively better than what we have right now. GCC allows this with only -freciprocal-math. The last test is changed to show a case that is expected to fold, but we need D43398. llvm-svn: 325533
* [mem2reg] Use range loops (NFCI) | Brian Gesiak | 2018-02-19 | 1 | -9/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Several for loops in PromoteMemoryToRegister.cpp leave their increment expression empty, instead incrementing the iterator within the for loop body. I believe this is because these loops were previously implemented as while loops; see https://reviews.llvm.org/rL188327. Incrementing the iterator within the body of the for loop instead of in its increment expression makes it seem like the iterator will be modified or conditionally incremented within the loop, but that is not the case in these loops. Instead, use range loops. Test Plan: `check-llvm` Reviewers: davide, bkramer Reviewed By: davide, bkramer Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D43473 llvm-svn: 325532
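The readability point in this summary can be shown with two equivalent loops. This is a generic sketch over a `std::vector<int>`, not the code from PromoteMemoryToRegister.cpp, and the function names are invented for the example.

```cpp
#include <cassert>
#include <vector>

// Old style: the for-header has an empty increment expression and the
// iterator is advanced inside the body, which suggests conditional stepping.
int sumWithManualIncrement(const std::vector<int> &values) {
  int total = 0;
  for (auto it = values.begin(), end = values.end(); it != end;) {
    int v = *it++; // increment hidden in the body
    total += v;
  }
  return total;
}

// Range-based style: makes it obvious that every element is visited once.
int sumWithRangeLoop(const std::vector<int> &values) {
  int total = 0;
  for (int v : values)
    total += v;
  return total;
}
```

The two forms are interchangeable only when the body never erases elements or steps the iterator conditionally, so that property is worth confirming before converting such a loop.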
* [InstCombine] refactor fdiv with constant dividend folds; NFC | Sanjay Patel | 2018-02-19 | 1 | -26/+27
| | | | | | | | | The last fold that used to be here was not necessary. That's a combination of 2 folds (and there's a regression test to show that). The transforms are guarded by isFast(), but that should be loosened. llvm-svn: 325531
* [Coroutines] Move debug statement before assert | Brian Gesiak | 2018-02-19 | 1 | -1/+2
| | | | | | | | | | Summary: Move a debug statement to above where an assertion is hit, so that the debug statement can be inspected before a stack trace. Test Plan: `check-llvm` llvm-svn: 325529
* [ThinLTO] Add GraphTraits for FunctionSummaries | Charles Saternos | 2018-02-19 | 1 | -1/+1
| | | | | | | | Add GraphTraits definitions to the FunctionSummary and ModuleSummaryIndex classes. These GraphTraits will be used to find SCCs in ThinLTO analysis passes. Third attempt - moved function from lambda to static function due to build failures. llvm-svn: 325506
* [Transforms] Propagate new-format TBAA tags on simplification of memory-transfer intrinsics | Ivan A. Kosarev | 2018-02-19 | 1 | -1/+3
| | | | | | | | | | | | | | With this patch in place, when a new-format TBAA tag is available for a memory-transfer intrinsic call, we prefer propagating that new-format tag. Otherwise, we fall back to the old approach where we try to construct a proper TBAA access tag from 'tbaa.struct' metadata. Differential Revision: https://reviews.llvm.org/D41543 llvm-svn: 325488
* Revert: [llvm] r325448 - [ThinLTO] Add GraphTraits for FunctionSummaries | Simon Pilgrim | 2018-02-18 | 1 | -1/+1
| | | | | | | | | | Add GraphTraits definitions to the FunctionSummary and ModuleSummaryIndex classes. These GraphTraits will be used to find SCCs in ThinLTO analysis passes. Second attempt, since last patch caused stage2 build to fail (now using function_ref rather than std::function). Reverted due to buildbot failures llvm-svn: 325454
* [ThinLTO] Add GraphTraits for FunctionSummaries | Charles Saternos | 2018-02-17 | 1 | -1/+1
| | | | | | | | Add GraphTraits definitions to the FunctionSummary and ModuleSummaryIndex classes. These GraphTraits will be used to find SCCs in ThinLTO analysis passes. Second attempt, since last patch caused stage2 build to fail (now using function_ref rather than std::function). llvm-svn: 325448
* [Constant] add floating-point helpers for normal/finite-nz; NFC | Sanjay Patel | 2018-02-16 | 1 | -42/+13
| | | | | | | | | ...and delete the equivalent local functions from InstCombine. These might be useful to other InstCombine files or other passes, and they make FP queries more similar to integer constant queries. llvm-svn: 325398