path: root/llvm/lib/Transforms
Commit message | Author | Age | Files | Lines
* [InstCombine] Do not exercise nested max/min pattern on abs | Anna Thomas | 2017-02-21 | 1 | -1/+3
  Summary:
  This is a fix for an assertion failure in `getInverseMinMaxSelectPattern` when ABS is passed in as a select pattern. We should not be invoking the simplification rule for ABS(MIN(~x, y)) or ABS(MAX(~x, y)) combinations.

  Added a test case which would cause an assertion failure without the patch.

  Reviewers: sanjoy, majnemer
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D30051
  llvm-svn: 295719
* The patch introduces a new way of narrowing complex (>UINT16 variants) solutions. | Evgeny Stupachenko | 2017-02-21 | 1 | -1/+159
  The new method is introduced under the "-lsr-exp-narrow" option (currently set to true).

  Summary:
  The method is based on the mathematical expectation of the number of registers and should generally be closer to the optimal solution. Please see details in the comments on the "LSRInstance::NarrowSearchSpaceByDeletingCostlyFormulas()" function (in lib/Transforms/Scalar/LoopStrengthReduce.cpp).

  Reviewers: qcolombet
  Differential Revision: http://reviews.llvm.org/D29862
  From: Evgeny Stupachenko <evstupac@gmail.com>
  llvm-svn: 295704
* Add a wrapper around copy_if in STLExtras; NFC | Sanjoy Das | 2017-02-21 | 3 | -34/+33
  I will add one more use for this in a later change.
  llvm-svn: 295685
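To illustrate what a range-based wrapper of this kind usually looks like, here is a minimal sketch that forwards a whole container to `std::copy_if`. The names `copy_if_range` and `evensOf` are hypothetical and chosen for this example; this is not the code from the commit.

```cpp
#include <algorithm>
#include <cassert> // used by the usage checks
#include <iterator>
#include <vector>

// Hypothetical range-based wrapper: callers pass the container itself
// instead of spelling out begin()/end() at every call site.
template <typename Range, typename OutputIt, typename UnaryPredicate>
OutputIt copy_if_range(Range &&R, OutputIt Out, UnaryPredicate Pred) {
  return std::copy_if(std::begin(R), std::end(R), Out, Pred);
}

// Example use: collect the even values of a vector, preserving order.
inline std::vector<int> evensOf(const std::vector<int> &Values) {
  std::vector<int> Result;
  copy_if_range(Values, std::back_inserter(Result),
                [](int V) { return V % 2 == 0; });
  return Result;
}
```

The benefit is purely at the call site: the iterator pair disappears, which is why such a change can be marked NFC.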
* [IndVars] Add an assert | Sanjoy Das | 2017-02-20 | 1 | -0/+3
  We've already checked that the loop is in simplify form before, but a little paranoia never hurt anyone.
  llvm-svn: 295680
* MemorySSA: Add support for renaming uses in the updater. | Daniel Berlin | 2017-02-20 | 2 | -25/+68
  Summary:
  This lets one add aliasing stores to the updater. (I'm next going to move the creation/etc functions to the updater.)

  Reviewers: george.burgess.iv
  Subscribers: llvm-commits, Prazek
  Differential Revision: https://reviews.llvm.org/D30154
  llvm-svn: 295677
* [SLP] nullptr'ize initial value in `findBuildAggregate()`, NFC. | Alexey Bataev | 2017-02-20 | 1 | -1/+1
  Initial value of V is set to nullptr, as it is not used.
  llvm-svn: 295642
* [SLP] Rework `findBuildAggregate()` from recursive form to iterative, NFC. | Alexey Bataev | 2017-02-20 | 1 | -9/+12
  Reviewers: mkuper
  Subscribers: llvm-commits, mzolotukhin
  Differential Revision: https://reviews.llvm.org/D30103
  llvm-svn: 295641
* Removed extra ';' | Simon Pilgrim | 2017-02-19 | 1 | -1/+1
  llvm-svn: 295603
* Add a DebugCounter for PredicateInfo renaming, and an associated test | Daniel Berlin | 2017-02-19 | 1 | -0/+8
  llvm-svn: 295594
* Fix unused variable warning when assertions are disabled. | Simon Pilgrim | 2017-02-19 | 1 | -4/+4
  llvm-svn: 295587
* NewGVN: Start making use of predicateinfo pass. | Daniel Berlin | 2017-02-18 | 1 | -18/+275
  Summary:
  This begins using the predicateinfo pass in NewGVN.

  Reviewers: davide
  Subscribers: llvm-commits, Prazek
  Differential Revision: https://reviews.llvm.org/D29682
  llvm-svn: 295583
* NewGVN: Make ranking prefer undef to constants. Fix direction of | Daniel Berlin | 2017-02-18 | 1 | -8/+9
  shouldSwapOperands to be correct.
  llvm-svn: 295582
* PredicateInfo: Clean up predicate info a little, using insertion | Daniel Berlin | 2017-02-18 | 1 | -67/+93
  helpers, and fixing support for renaming the comparison.
  llvm-svn: 295581
* [InstCombine] add nsw/nuw X, signbit --> or X, signbit | Sanjay Patel | 2017-02-18 | 1 | -2/+9
  Changing to 'or' (rather than 'xor' when no wrapping flags are set) allows icmp simplifications to happen as expected.

  Differential Revision: https://reviews.llvm.org/D29729
  llvm-svn: 295574
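The arithmetic behind this fold can be checked exhaustively on a small bit width. The sketch below models it for 8-bit values (sign bit `0x80`); all function names are made up for illustration, and it models the arithmetic only, not InstCombine itself. Without wrap flags, adding the sign bit equals xor-ing it. If the add is known not to wrap (unsigned or signed), the sign bit of X must be clear, so the add also equals an or.

```cpp
#include <cassert> // used by the usage checks
#include <cstdint>

constexpr std::uint8_t SignBit = 0x80;

// 8-bit wrapping add, xor and or with the sign-bit constant.
inline std::uint8_t addSignBit(std::uint8_t X) {
  return static_cast<std::uint8_t>(X + SignBit);
}
inline std::uint8_t xorSignBit(std::uint8_t X) {
  return static_cast<std::uint8_t>(X ^ SignBit);
}
inline std::uint8_t orSignBit(std::uint8_t X) {
  return static_cast<std::uint8_t>(X | SignBit);
}

// 'nuw' condition: the unsigned sum X + 0x80 fits in 8 bits.
inline bool addIsNUW(std::uint8_t X) { return X + SignBit <= 0xFF; }

// 'nsw' condition: the signed sum X + (-128) stays within [-128, 127].
inline bool addIsNSW(std::uint8_t X) {
  int Signed = X < 128 ? X : X - 256; // two's complement value of X
  int Sum = Signed - 128;
  return Sum >= -128 && Sum <= 127;
}

// Checks both identities for every 8-bit value.
inline bool signBitFoldsHold() {
  for (int I = 0; I < 256; ++I) {
    std::uint8_t X = static_cast<std::uint8_t>(I);
    if (addSignBit(X) != xorSignBit(X))
      return false; // add == xor must hold unconditionally
    if ((addIsNUW(X) || addIsNSW(X)) && addSignBit(X) != orSignBit(X))
      return false; // add == or must hold whenever the add cannot wrap
  }
  return true;
}
```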
* [MemorySSA] NFC small fixes | Piotr Padlewski | 2017-02-18 | 1 | -9/+6
  Summary:
  2 small fixes extracted from https://reviews.llvm.org/D29064

  Reviewers: kuhar, davide, dberlin, george.burgess.iv
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D30109
  llvm-svn: 295566
* Increases full-unroll threshold. | Dehao Chen | 2017-02-18 | 2 | -29/+32
  Summary:
  The default threshold for full unrolling is too conservative. This patch doubles the full-unroll threshold.

  This change will affect the following speccpu2006 benchmarks (performance numbers were collected on Intel Sandybridge):

  Performance:
    403  0.11%
    433  0.51%
    445  0.48%
    447  3.50%
    453  1.49%
    464  0.75%

  Code size:
    403  0.56%
    433  0.96%
    445  2.16%
    447  2.96%
    453  0.94%
    464  8.02%

  The compile-time overhead is similar to the code size increase.

  Reviewers: davidxl, mkuper, mzolotukhin, hfinkel, chandlerc
  Reviewed By: hfinkel, chandlerc
  Subscribers: mehdi_amini, zzheng, efriedma, haicheng, hfinkel, llvm-commits
  Differential Revision: https://reviews.llvm.org/D28368
  llvm-svn: 295538
* OptDiag: Allow constructing DiagnosticLocation from DISubprograms | Justin Bogner | 2017-02-18 | 1 | -2/+1
  This avoids creating a DILocation just to represent a line number, since creating Metadata is expensive. Creating a DiagnosticLocation directly is much cheaper.
  llvm-svn: 295531
* [NewGVN] isOnlyReachableViaThisEdge() is dead now. NFCI. | Davide Italiano | 2017-02-17 | 1 | -18/+0
  llvm-svn: 295503
* [NewGVN] createVariableOrConstant is not required anymore. NFCI. | Davide Italiano | 2017-02-17 | 1 | -8/+0
  llvm-svn: 295500
* WholeProgramDevirt: For VCP use a 32-bit ConstantInt for the byte offset. | Peter Collingbourne | 2017-02-17 | 1 | -1/+1
  A future change will cause this byte offset to be inttoptr'd and then exported via an absolute symbol. On the importing end we will expect the symbol to be in range [0,2^32) so that it will fit into a 32-bit relocation. The problem is that on 64-bit architectures if the offset is negative it will not be in the correct range once we inttoptr it.

  This change causes us to use a 32-bit integer so that it can be inttoptr'd (which zero extends) into the correct range.

  Differential Revision: https://reviews.llvm.org/D30016
  llvm-svn: 295487
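The range argument can be reproduced with plain integer casts. The sketch below assumes a 64-bit target and models inttoptr as an integer conversion: a 64-bit value is used as-is, while a 32-bit value is zero-extended. The helper names are hypothetical.

```cpp
#include <cassert> // used by the usage checks
#include <cstdint>

// Models inttoptr of a 64-bit integer on a 64-bit target: bits unchanged.
inline std::uint64_t addressFromI64(std::int64_t Offset) {
  return static_cast<std::uint64_t>(Offset);
}

// Models inttoptr of a 32-bit integer on a 64-bit target: zero-extended.
inline std::uint64_t addressFromI32(std::int32_t Offset) {
  return static_cast<std::uint64_t>(static_cast<std::uint32_t>(Offset));
}

// True if the address lies in [0, 2^32), i.e. fits a 32-bit relocation.
inline bool fitsIn32BitRelocation(std::uint64_t Address) {
  return Address < (std::uint64_t{1} << 32);
}
```

For an offset of -8, the 64-bit path yields `0xFFFFFFFFFFFFFFF8`, which is out of range, while the 32-bit path yields `0xFFFFFFF8`, which is in range.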
* WholeProgramDevirt: Examine the function body when deciding whether functions are readnone. | Peter Collingbourne | 2017-02-17 | 1 | -12/+41
  The goal is to get an analysis result even for de-refineable functions.

  Differential Revision: https://reviews.llvm.org/D29803
  llvm-svn: 295472
* [LV] Remove constant restriction for vector phi creation | Matthew Simpson | 2017-02-17 | 1 | -44/+37
  We previously only created a vector phi node for an induction variable if its step had a constant integer type. However, the step actually only needs to be loop-invariant. We only handle inductions having loop-invariant steps, so this patch should enable vector phi node creation for all integer induction variables that will be vectorized.

  Differential Revision: https://reviews.llvm.org/D29956
  llvm-svn: 295456
* InstCombine: fix extraction when performing vector/array punning | Eugene Leviant | 2017-02-17 | 1 | -1/+1
  Differential revision: https://reviews.llvm.org/D29491
  llvm-svn: 295429
* [JumpThreading] Re-enable JumpThreading for guards | Sanjoy Das | 2017-02-17 | 2 | -15/+195
  Summary:
  The JumpThreading for guards feature was reverted at https://reviews.llvm.org/rL295200 due to the following problem: the feature used the following algorithm for detection of diamond patterns:

  1. Find a block with 2 predecessors;
  2. Check that these blocks have a common single parent;
  3. Check that the parent's terminator is a branch instruction.

  The problem is that these checks are insufficient. They may pass for a non-diamond construction if those two predecessors are actually the same block. This may happen if the parent's terminator is a br (either conditional or unconditional) to a block that ends with a "switch" instruction with exactly two branches going to one block.

  This patch re-enables JumpThreading for guards and fixes this issue by adding a check that the predecessors found are actually different blocks. This guarantees that the parent's terminator is a conditional branch with exactly 2 different successors, which is now ensured by assertions. It also adds two more tests for this situation (with the parent's terminator being a conditional and an unconditional branch).

  Patch by Max Kazantsev!

  Reviewers: anna, sanjoy, reames
  Reviewed By: sanjoy
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D30036
  llvm-svn: 295410
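The three checks and the added distinct-predecessor check can be sketched on a toy control-flow graph. Everything below (the `Block` struct, `isDiamondMerge`, the two example functions) is a simplified model invented for illustration. It does not use LLVM's `BasicBlock` API and is not the code from the patch.

```cpp
#include <cassert>
#include <vector>

// Toy CFG node. Preds holds one entry per incoming edge, so a block reached
// twice from the same predecessor lists that predecessor twice.
struct Block {
  std::vector<Block *> Preds;
  std::vector<Block *> Succs;
  bool EndsWithBranch = true; // false models e.g. a switch terminator
};

inline void addEdge(Block &From, Block &To) {
  From.Succs.push_back(&To);
  To.Preds.push_back(&From);
}

// Returns the unique predecessor of B, or nullptr if B does not have exactly
// one incoming edge.
inline Block *singlePredecessor(const Block &B) {
  return B.Preds.size() == 1 ? B.Preds.front() : nullptr;
}

inline bool isDiamondMerge(const Block &Merge) {
  if (Merge.Preds.size() != 2) // 1. a block with two predecessors
    return false;
  Block *Pred1 = Merge.Preds[0];
  Block *Pred2 = Merge.Preds[1];
  if (Pred1 == Pred2) // the added check: predecessors must differ
    return false;
  Block *Parent = singlePredecessor(*Pred1);
  if (!Parent || Parent != singlePredecessor(*Pred2)) // 2. common single parent
    return false;
  if (!Parent->EndsWithBranch) // 3. parent ends in a branch
    return false;
  // In this model a branch has at most two successors, so with two distinct
  // children the parent must have exactly two different successors.
  assert(Parent->Succs.size() == 2 && Parent->Succs[0] != Parent->Succs[1]);
  return true;
}

// A real diamond: Parent -> {Left, Right} -> Merge.
inline bool realDiamondIsAccepted() {
  Block Parent, Left, Right, Merge;
  addEdge(Parent, Left);
  addEdge(Parent, Right);
  addEdge(Left, Merge);
  addEdge(Right, Merge);
  return isDiamondMerge(Merge);
}

// The problematic shape: Parent -> Switch, and both switch edges go to Merge.
inline bool switchLookalikeIsRejected() {
  Block Parent, Switch, Merge;
  Switch.EndsWithBranch = false;
  addEdge(Parent, Switch); // unconditional branch
  addEdge(Switch, Merge);
  addEdge(Switch, Merge);
  return !isDiamondMerge(Merge);
}
```

Without the `Pred1 == Pred2` test, the second shape would pass all three original checks, because both "predecessors" share the same single parent.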
* Bug 31948: Fix assertion when bitcasting constantexpr pointers | Matt Arsenault | 2017-02-17 | 1 | -0/+6
  llvm-svn: 295387
* [LSR] Prevent formula with SCEVAddRecExpr type of Reg from Sibling loops | Wei Mi | 2017-02-16 | 1 | -0/+7
  In rL294814, we allowed formulas with a SCEVAddRecExpr type of Reg from loops other than the current loop. This is good for the case when the induction variable of an outer loop is used in an expression in an inner loop. But it is very bad to allow such a Reg from a sibling loop, because we may need to add lsr.iv in other sibling loops when SCEV-expanding those SCEVAddRecExpr type expressions. For the test case below, a bunch of lsr.iv can be inserted into one loop because of LSR for other loops.

      // The induction variable j from a loop in the middle will have initial
      // value generated from previous sibling loop and exit value used by its
      // next sibling loop.
      void goo(long i, long j);
      long cond;

      void foo(long N) {
        long i = 0;
        long j = 0;
        i = 0; do { goo(i, j); i++; j++; } while (cond);
        i = 0; do { goo(i, j); i++; j++; } while (cond);
        i = 0; do { goo(i, j); i++; j++; } while (cond);
        i = 0; do { goo(i, j); i++; j++; } while (cond);
        i = 0; do { goo(i, j); i++; j++; } while (cond);
        i = 0; do { goo(i, j); i++; j++; } while (cond);
      }

  The fix is to only allow formulas with a SCEVAddRecExpr type of Reg from the current loop or its parents.

  Differential Revision: https://reviews.llvm.org/D30021
  llvm-svn: 295378
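The "current loop or its parents" rule amounts to walking the parent chain of the current loop. Below is a minimal sketch on a toy loop-nest structure. The `Loop` struct and the function names are assumptions for illustration and do not reflect LLVM's `Loop` class or the actual LSR code.

```cpp
#include <cassert>

// Toy loop-nest node: each loop only knows its enclosing loop.
struct Loop {
  const Loop *Parent; // nullptr for a top-level loop
};

// True if Candidate is Current itself or one of the loops enclosing Current.
// A sibling of Current is never on this chain, so it is rejected.
inline bool isCurrentLoopOrParent(const Loop *Candidate, const Loop *Current) {
  assert(Candidate && Current && "both loops must be non-null");
  for (const Loop *L = Current; L; L = L->Parent)
    if (L == Candidate)
      return true;
  return false;
}

// One outer loop containing two sibling inner loops.
inline bool siblingIsRejected() {
  Loop Outer{nullptr};
  Loop First{&Outer};
  Loop Second{&Outer};
  return isCurrentLoopOrParent(&Second, &Second) && // the loop itself
         isCurrentLoopOrParent(&Outer, &Second) &&  // its parent
         !isCurrentLoopOrParent(&First, &Second);   // its sibling
}
```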
* InstCombine: Canonicalize fast fmuladd to fmul + fadd | Matt Arsenault | 2017-02-16 | 1 | -1/+14
  llvm-svn: 295353
* [AVX-512][InstCombine] Teach InstCombine to optimize 512-bit packss/packus intrinsics like it does 128/256-bit. | Craig Topper | 2017-02-16 | 2 | -4/+9
  llvm-svn: 295294
* PMB: Add an importing WPD pass to the start of the ThinLTO backend pipeline. | Peter Collingbourne | 2017-02-15 | 1 | -1/+15
  Differential Revision: https://reviews.llvm.org/D30008
  llvm-svn: 295260
* Re-apply r295110 and r295144 with a fix for the ASan issue. | Peter Collingbourne | 2017-02-15 | 1 | -98/+156
  llvm-svn: 295241
* [InstCombine] improve formatting; NFC | Sanjay Patel | 2017-02-15 | 1 | -6/+3
  llvm-svn: 295237
* AddressSanitizer: don't track swifterror memory addresses | Arnold Schwaighofer | 2017-02-15 | 1 | -3/+12
  They are register promoted by ISel and so it makes no sense to treat them as memory.

  Inserting calls to the thread sanitizer would also generate invalid IR.

  You would hit:
  "swifterror value can only be loaded and stored from, or as a swifterror argument!"

  llvm-svn: 295230
* ThreadSanitizer: don't track swifterror memory addresses | Arnold Schwaighofer | 2017-02-15 | 1 | -0/+7
  They are register promoted by ISel and so it makes no sense to treat them as memory.

  Inserting calls to the thread sanitizer would also generate invalid IR.

  You would hit:
  "swifterror value can only be loaded and stored from, or as a swifterror argument!"

  llvm-svn: 295215
* Revert "[JumpThreading] Thread through guards" | Anna Thomas | 2017-02-15 | 2 | -189/+15
  This reverts commit r294617.

  We fail on an assert while trying to get a condition from an unconditional branch.

  llvm-svn: 295200
* [InlineFunction] use getFunction(); NFC | Sanjay Patel | 2017-02-15 | 1 | -3/+3
  llvm-svn: 295185
* [InlineFunction] use getCaller(); NFCI | Sanjay Patel | 2017-02-15 | 1 | -3/+2
  llvm-svn: 295181
* [InlineFunction] use range-for loop; NFCI | Sanjay Patel | 2017-02-15 | 1 | -10/+8
  llvm-svn: 295179
* Revert r295110 and r295144. | Daniel Jasper | 2017-02-15 | 1 | -156/+98
  This fails under ASAN:
  http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-bootstrap/builds/798/steps/check-llvm%20asan/logs/stdio
  llvm-svn: 295162
* SimplifyCFG: Register cloned assume intrinsics with assumption cache when creating critical edge. | Peter Collingbourne | 2017-02-15 | 1 | -3/+10
  Differential Revision: https://reviews.llvm.org/D29976
  llvm-svn: 295145
* WholeProgramDevirt: Separate the code that applies optzns from the code that decides whether to apply them. NFCI. | Peter Collingbourne | 2017-02-15 | 1 | -48/+86
  The idea is that the apply* functions will also be called when importing devirt optimizations.

  Differential Revision: https://reviews.llvm.org/D29745
  llvm-svn: 295144
* Fix a bug in caller's BFI update code after inlining. | Easwaran Raman | 2017-02-14 | 1 | -3/+10
  Multiple blocks in the callee can be mapped to a single cloned block since we prune the callee as we clone it. The existing code iterates over the value map and clones the block frequency (and eventually scales the frequencies of the cloned blocks). Value map's iteration is not deterministic and so the cloned block might get the frequency of any of the original blocks. The fix is to set the max of the original frequencies to the cloned block. The first block in the sequence must have this max frequency and, in the call context, subsequent blocks must have its frequency.

  Differential Revision: https://reviews.llvm.org/D29696
  llvm-svn: 295115
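The effect of taking the maximum can be shown with ordinary containers. The sketch below is a simplified model under stated assumptions: block ids are plain integers, the mapping is a list of (original, clone) pairs, and the later scaling step mentioned in the message is left out. None of the names come from LLVM.

```cpp
#include <algorithm>
#include <cassert> // used by the usage checks
#include <cstdint>
#include <map>
#include <utility>
#include <vector>

using BlockId = int;
using Mapping = std::vector<std::pair<BlockId, BlockId>>; // original -> clone

// Assigns each cloned block the maximum frequency of all original blocks that
// map to it. Because max is commutative and associative, the result does not
// depend on the order in which the mapping is visited.
inline std::map<BlockId, std::uint64_t>
cloneFrequencies(const Mapping &OrigToClone,
                 const std::map<BlockId, std::uint64_t> &OrigFreq) {
  std::map<BlockId, std::uint64_t> CloneFreq;
  for (const auto &Entry : OrigToClone) {
    std::uint64_t Freq = OrigFreq.at(Entry.first);
    std::uint64_t &Slot = CloneFreq[Entry.second]; // starts at 0 if new
    Slot = std::max(Slot, Freq);
  }
  return CloneFreq;
}

// Original blocks 1 and 2 collapse into clone 10; block 3 maps to clone 11.
inline bool maxIsOrderIndependent() {
  const std::map<BlockId, std::uint64_t> OrigFreq = {{1, 100}, {2, 40}, {3, 10}};
  const Mapping Forward = {{1, 10}, {2, 10}, {3, 11}};
  const Mapping Reversed = {{3, 11}, {2, 10}, {1, 10}};
  const auto A = cloneFrequencies(Forward, OrigFreq);
  const auto B = cloneFrequencies(Reversed, OrigFreq);
  return A == B && A.at(10) == 100 && A.at(11) == 10;
}
```

A "last writer wins" assignment would give clone 10 a frequency of 40 or 100 depending on visiting order, which is the nondeterminism the commit describes.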
* [LV] Rename Induction to PrimaryInduction. NFC. | Michael Kuperstein | 2017-02-14 | 1 | -12/+12
  llvm-svn: 295111
* WholeProgramDevirt: Change internal vcall data structures to match summary. | Peter Collingbourne | 2017-02-14 | 1 | -74/+94
  Group calls into constant and non-constant arguments up front, and use uint64_t instead of ConstantInt to represent constant arguments. The goal is to allow the information from the summary to fit naturally into this data structure in a future change (specifically, it will be added to CallSiteInfo).

  This has two side effects:
  - We disallow VCP for constant integer arguments of width >64 bits.
  - We remove the restriction that the bitwidth of a vcall's argument and return types must match those of the vfunc definitions.

  I don't expect either of these to matter in practice. The first case is uncommon, and the second one will lead to UB (so we can do anything we like).

  Differential Revision: https://reviews.llvm.org/D29744
  llvm-svn: 295110
* [BasicBlockUtils] Use getFirstNonPHIOrDbg to set debugloc for instructions created in SplitBlockPredecessors | Taewook Oh | 2017-02-14 | 1 | -1/+1
  Summary:
  When setting the debugloc for instructions created in SplitBlockPredecessors, the current implementation copies the debugloc from the first non-phi instruction of the original basic block. However, if the first non-phi instruction is a call to @llvm.dbg.value, the debugloc of that instruction may point to a location outside of the block itself. For the example code

  ```
   1 typedef struct _node_t {
   2   struct _node_t *next;
   3 } node_t;
   4
   5 extern node_t *root;
   6
   7 int foo() {
   8   node_t *node, *tmp;
   9   int ret = 0;
  10
  11   node = tmp = root->next;
  12   while (node != root) {
  13     while (node) {
  14       tmp = node;
  15       node = node->next;
  16       ret++;
  17     }
  18   }
  19
  20   return ret;
  21 }
  ```

  below is the basic block corresponding to line 12 after the Reassociate expressions pass:

  ```
  while.cond:                                       ; preds = %while.cond2, %entry
    %node.0 = phi %struct._node_t* [ %1, %entry ], [ null, %while.cond2 ]
    %ret.0 = phi i32 [ 0, %entry ], [ %ret.1, %while.cond2 ]
    tail call void @llvm.dbg.value(metadata i32 %ret.0, i64 0, metadata !19, metadata !20), !dbg !21
    tail call void @llvm.dbg.value(metadata %struct._node_t* %node.0, i64 0, metadata !11, metadata !20), !dbg !31
    %cmp = icmp eq %struct._node_t* %node.0, %0, !dbg !33
    br i1 %cmp, label %while.end5, label %while.cond2, !dbg !35
  ```

  As you can see, the first non-phi instruction is a call to @llvm.dbg.value, and its debugloc is

  ```
  !21 = !DILocation(line: 9, column: 7, scope: !6)
  ```

  which is the definition of the 'ret' variable and outside the scope of the basic block itself. However, the current implementation picks up this debugloc for the instructions created in SplitBlockPredecessors. This patch addresses the problem by picking up the debugloc from the first non-phi, non-dbg instruction.

  Reviewers: dblaikie, samsonov, eugenis
  Reviewed By: eugenis
  Subscribers: llvm-commits
  Differential Revision: https://reviews.llvm.org/D29867
  llvm-svn: 295106
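The difference between the two selection rules can be sketched on a toy instruction list that mirrors the shape of the `while.cond` block. The `Inst` struct and helper names are hypothetical, and this is not LLVM's `BasicBlock` API. Only the line 9 location of the first `llvm.dbg.value` call comes from the commit message; the other line numbers (12) are assumptions made for the example.

```cpp
#include <cassert> // used by the usage checks
#include <vector>

enum class Kind { Phi, DbgValue, Other };

struct Inst {
  Kind K;
  int DebugLine; // source line of the instruction's debug location, 0 if none
};

// Debug line of the first instruction that is not a phi, or -1 if none.
inline int debugLineOfFirstNonPhi(const std::vector<Inst> &Block) {
  for (const Inst &I : Block)
    if (I.K != Kind::Phi)
      return I.DebugLine;
  return -1;
}

// Debug line of the first instruction that is neither a phi nor a debug
// intrinsic, or -1 if none.
inline int debugLineOfFirstNonPhiOrDbg(const std::vector<Inst> &Block) {
  for (const Inst &I : Block)
    if (I.K != Kind::Phi && I.K != Kind::DbgValue)
      return I.DebugLine;
  return -1;
}

// Same shape as while.cond: two phis, two dbg.value calls, icmp, br.
inline std::vector<Inst> whileCondBlock() {
  return {
      {Kind::Phi, 0},       {Kind::Phi, 0},
      {Kind::DbgValue, 9},  // !21: line 9, from the commit message
      {Kind::DbgValue, 12}, // assumed line
      {Kind::Other, 12},    // icmp, assumed to carry line 12
      {Kind::Other, 12},    // br, assumed to carry line 12
  };
}
```

With the first rule the new instructions would inherit line 9, the declaration of `ret`; with the second they inherit the location of the `icmp`, which belongs to the block.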
* Re-apply "[profiling] Remove dead profile name vars after emitting name data" | Vedant Kumar | 2017-02-14 | 1 | -0/+5
  This reverts 295092 (re-applies 295084), with a fix for dangling references from the array of coverage names passed down from frontends.

  I missed this in my initial testing because I only checked test/Profile, and not test/CoverageMapping as well.

  Original commit message:

  The profile name variables passed to counter increment intrinsics are dead after we emit the finalized name data in __llvm_prf_nm. However, we neglect to erase these name variables. This causes huge size increases in the __TEXT,__const section as well as slowdowns when linker dead stripping is disabled. Some affected projects are so massive that they fail to link on Darwin, because only the small code model is supported.

  Fix the issue by throwing away the name constants as soon as we're done with them.

  Differential Revision: https://reviews.llvm.org/D29921
  llvm-svn: 295099
* Revert "[profiling] Remove dead profile name vars after emitting name data" | Vedant Kumar | 2017-02-14 | 1 | -3/+0
  This reverts commit r295084. There is a test failure on:
  http://lab.llvm.org:8011/builders/clang-atom-d525-fedora-rel/builds/2620/
  llvm-svn: 295092
* [profiling] Remove dead profile name vars after emitting name data | Vedant Kumar | 2017-02-14 | 1 | -0/+3
  The profile name variables passed to counter increment intrinsics are dead after we emit the finalized name data in __llvm_prf_nm. However, we neglect to erase these name variables. This causes huge size increases in the __TEXT,__const section as well as slowdowns when linker dead stripping is disabled. Some affected projects are so massive that they fail to link on Darwin, because only the small code model is supported.

  Fix the issue by throwing away the name constants as soon as we're done with them.

  Differential Revision: https://reviews.llvm.org/D29921
  llvm-svn: 295084
* Do not apply redundant LastCallToStaticBonus | Taewook Oh | 2017-02-14 | 1 | -1/+1
  Summary:
  As written in the comments above, LastCallToStaticBonus is already applied to the cost if Caller has only one user, so it is redundant to reapply the bonus here.

  If the only user is not a caller, TotalSecondaryCost will not be adjusted anyway because callerWillBeRemoved is false. If there's no caller at all, we don't need to care about TotalSecondaryCost because inliningPreventsSomeOuterInline is false.

  Reviewers: chandlerc, eraman
  Reviewed By: eraman
  Subscribers: haicheng, davidxl, davide, llvm-commits, mehdi_amini
  Differential Revision: https://reviews.llvm.org/D29169
  llvm-svn: 295075
* Correct a typo, s/hosting/hoisting/ | Brian Cain | 2017-02-14 | 1 | -1/+1
  llvm-svn: 295066
* Reapply "[LV] Extend trunc optimization to all IVs with constant integer steps" | Matthew Simpson | 2017-02-14 | 1 | -10/+47
  This reapplies commit r294967 with a fix for the execution time regressions caught by the clang-cmake-aarch64-quick bot. We now extend the truncate optimization to non-primary induction variables only if the truncate isn't already free.

  Differential Revision: https://reviews.llvm.org/D29847
  llvm-svn: 295063