path: root/llvm/lib/Transforms
Commit message | Author | Age | Files | Lines
* Add missing includes | David Blaikie | 2018-02-02 | 1 | -0/+3
    llvm-svn: 324040
* [InstCombine] allow multi-use values in canEvaluate* if all uses are in 1 inst | Sanjay Patel | 2018-02-01 | 1 | -5/+13
    This is the enhancement suggested in D42536 to fix a shortcoming in regular InstCombine's canEvaluate* functionality.

    When we have multiple uses of a value, but they're all in one instruction, we can allow that expression to be narrowed or widened for the same cost as a single-use value.

    AFAICT, this can only matter for multiply: sub/and/or/xor/select would be simplified away if the operands are the same value; add becomes shl; shifts with a variable shift amount aren't handled.

    Differential Revision: https://reviews.llvm.org/D42739
    llvm-svn: 324014
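The "all uses are in one instruction" condition described above can be illustrated with a minimal sketch. The types `ToyInst` and `ToyValue` and the helper `allUsesInOneInst` are hypothetical stand-ins invented for this example; they are not LLVM classes and this is not the code from the patch.

```cpp
#include <cassert>
#include <vector>

// Toy stand-in for an IR instruction (hypothetical, not an LLVM class).
struct ToyInst {
  int id;
};

// Toy stand-in for an IR value: one entry per use, naming the using instruction.
struct ToyValue {
  std::vector<const ToyInst *> users;
};

// Returns true when the value has at least one use and every use
// belongs to the same instruction (for example, both operands of x * x).
bool allUsesInOneInst(const ToyValue &v) {
  if (v.users.empty())
    return false;
  for (const ToyInst *u : v.users)
    if (u != v.users.front())
      return false;
  return true;
}
```

A value used twice by one multiply passes the check, while a value shared between two different instructions does not, which matches the cost argument in the commit message.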
* Revert commit rL323951 | David Green | 2018-02-01 | 1 | -6/+2
    Looks like it's causing timeouts on at least the ppc64le buildbots.

    llvm-svn: 323959
* [InstCombine] Allow common type conversions to i8/i16/i32 | David Green | 2018-02-01 | 1 | -2/+6
    This, in instcombine, allows conversions to i8/i16/i32 (very common cases) even if the resulting type is not legal according to the data layout. This can often open up extra combine opportunities.

    Differential Revision: https://reviews.llvm.org/D42424
    llvm-svn: 323951
* [LSR] Don't force bases of foldable formulae to the final type. | Mikael Holmen | 2018-02-01 | 1 | -1/+1
    Summary:
    Before emitting code for scaled registers, we prevent SCEVExpander from hoisting any scaled addressing mode by emitting all the bases first. However, these bases are being forced to the final type, resulting in some odd code.

    For example, if the type of the base is an integer and the final type is a pointer, we will emit an inttoptr for the base, a ptrtoint for the scale, and then a 'reverse' GEP where the GEP pointer is actually the base integer and the index is the pointer. It's more intuitive to use the pointer as a pointer and the integer as index.

    Patch by: Bevin Hansson
    Reviewers: atrick, qcolombet, sanjoy
    Reviewed By: qcolombet
    Subscribers: llvm-commits
    Differential Revision: https://reviews.llvm.org/D42103
    llvm-svn: 323946
* [GlobalOpt] Improve common case efficiency of static global initializer evaluation | Amara Emerson | 2018-01-31 | 1 | -2/+126
    For very, very large global initializers which can be statically evaluated, the code would create vectors of temporary Constants, modifying them in place, before committing the resulting Constant aggregate to the global's initializer value. This had effectively O(n^2) complexity in the size of the global initializer and would cause memory and non-termination issues compiling some workloads.

    This change performs the static initializer evaluation and creation in batches, once for each global in the evaluated IR memory. The existing code is maintained as a last resort when the initializers are more complex than simple values in a large aggregate. This should theoretically be NFC, no test as the example case is massive. The existing test cases pass with this, as well as the llvm test suite.

    To give an example, consider the following C++ code adapted from the clang regression tests:

        struct S {
          int n = 10;
          int m = 2 * n;
          S(int a) : n(a) {}
        };

        template<typename T>
        struct U {
          T *r = &q;
          T q = 42;
          U *p = this;
        };

        U<S> e;

    The global static constructor for 'e' will need to initialize 'r' and 'p' of the outer struct, while also initializing the inner 'q' struct's 'n' and 'm' members. This batch algorithm will simply use the general CommitValueTo() method to handle the complex nested S struct initialization of 'q', before processing the outermost members in a single batch. Using CommitValueTo() to handle members in the outer struct is inefficient when the struct/array is very large, as we end up creating and destroying constant arrays for each initialization.

    For the above case, we expect the following IR to be generated:

        %struct.U = type { %struct.S*, %struct.S, %struct.U* }
        %struct.S = type { i32, i32 }
        @e = global %struct.U { %struct.S* gep inbounds (%struct.U, %struct.U* @e, i64 0, i32 1), %struct.S { i32 42, i32 84 }, %struct.U* @e }

    The %struct.S { i32 42, i32 84 } inner initializer is treated as a complex constant expression, while the other two elements of @e are "simple".

    Differential Revision: https://reviews.llvm.org/D42612
    llvm-svn: 323933
* Utils: Fix DomTree update for entry block | Matt Arsenault | 2018-01-31 | 1 | -5/+14
    If SplitBlockPredecessors was used on a function entry block, it wouldn't update the dominator tree.

    llvm-svn: 323928
* [AggressiveInstCombine] Fixed TruncCombine class to handle TruncInst leaf node correctly. | Amjad Aboud | 2018-01-31 | 1 | -4/+12
    This covers the case where TruncInst leaf node is a constant expression. See PR36121 for more details.

    Differential Revision: https://reviews.llvm.org/D42622
    llvm-svn: 323926
* [GlobalOpt] Fix exponential compile-time with selects. | Eli Friedman | 2018-01-31 | 1 | -17/+16
    If you have a long chain of select instructions created from something like `int* p = &g; if (foo()) p += 4; if (foo2()) p += 4;` etc., a naive recursive visitor will recursively visit each select twice, which is O(2^N) in the number of select instructions. Use the visited set to cut off recursion in this case.

    (No testcase because this doesn't actually change the behavior, just the time.)

    Differential Revision: https://reviews.llvm.org/D42451
    llvm-svn: 323910
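The effect of the visited set described above can be seen in a small, self-contained sketch. It models the select chain as a toy graph in which each node reaches the previous one through both operands; `Node`, `makeSelectChain`, `walkNaive`, and `walkVisited` are hypothetical names for this illustration and are not taken from GlobalOpt.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_set>
#include <vector>

// Toy node: each "select" has two operands that both lead to the previous one.
struct Node {
  std::vector<int> operands; // indices of operand nodes
};

// Builds a chain of n nodes; node i > 0 uses node i-1 through both operands.
std::vector<Node> makeSelectChain(int n) {
  std::vector<Node> g(n);
  for (int i = 1; i < n; ++i)
    g[i].operands = {i - 1, i - 1};
  return g;
}

// Naive recursive walk: counts every visit, including repeated ones.
std::uint64_t walkNaive(const std::vector<Node> &g, int i) {
  std::uint64_t visits = 1;
  for (int op : g[i].operands)
    visits += walkNaive(g, op);
  return visits;
}

// Same walk, but a visited set cuts off recursion into already-seen nodes.
std::uint64_t walkVisited(const std::vector<Node> &g, int i,
                          std::unordered_set<int> &visited) {
  if (!visited.insert(i).second)
    return 0; // already processed, do not recurse again
  std::uint64_t visits = 1;
  for (int op : g[i].operands)
    visits += walkVisited(g, op, visited);
  return visits;
}
```

For a chain of 20 nodes the naive walk performs 2^20 - 1 visits, while the walk with a visited set touches each node once.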
* AMDGPU: Add intrinsics llvm.amdgcn.cvt.{pknorm.i16, pknorm.u16, pk.i16, pk.u16} | Marek Olsak | 2018-01-31 | 1 | -0/+12
    Reviewers: arsenm, nhaehnle
    Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye
    Differential Revision: https://reviews.llvm.org/D41663
    llvm-svn: 323908
* [SeparateConstOffsetFromGEP] Preserve metadata when splitting GEPs | Marek Olsak | 2018-01-31 | 1 | -0/+2
    Summary:
    !amdgpu.uniform needs to be preserved for AMDGPU, otherwise bad things happen.

    Reviewers: arsenm, nhaehnle, jingyue, broune, majnemer, bjarke.roune, dblaikie
    Subscribers: wdng, tpr, llvm-commits
    Differential Revision: https://reviews.llvm.org/D42744
    llvm-svn: 323907
* [InstCombine] reduce code duplication for canEvaluate* functions; NFCI | Sanjay Patel | 2018-01-31 | 1 | -47/+43
    We'd have to make the change suggested in D42536 3x otherwise.

    llvm-svn: 323877
* [AggressiveInstCombine] Make TruncCombine class ignore unreachable basic blocks. | Amjad Aboud | 2018-01-31 | 3 | -5/+19
    Because dead code may contain non-standard IR that causes infinite looping or crashes in underlying analysis. See PR36134 for more details.

    Differential Revision: https://reviews.llvm.org/D42683
    llvm-svn: 323862
* LTO: Drop comdats when converting definitions to declarations. | Peter Collingbourne | 2018-01-31 | 1 | -0/+2
    Differential Revision: https://reviews.llvm.org/D42715
    llvm-svn: 323844
* Teach ValueMapper to use ODR uniqued types when available | Teresa Johnson | 2018-01-30 | 1 | -4/+15
    Summary:
    This is exposed during ThinLTO compilation, when we import an alias by creating a clone of the aliasee. Without this fix the debug type is unnecessarily cloned and we get a duplicate, undoing the uniquing.

    Fixes PR36089.

    Reviewers: mehdi_amini, pcc
    Subscribers: eraman, JDevlieghere, llvm-commits
    Differential Revision: https://reviews.llvm.org/D41669
    llvm-svn: 323813
* Re-commit : [PowerPC] Add handling for ColdCC calling convention and a pass to mark candidates with coldcc attribute. | Zaara Syeda | 2018-01-30 | 1 | -6/+158
    This recommits r322721 reverted due to sanitizer memory leak build bot failures.

    Original commit message:
    This patch adds support for the coldcc calling convention for Power. This changes the set of non-volatile registers. It includes a pass to stress test the implementation by marking all static directly called functions with the coldcc attribute through the option -enable-coldcc-stress-test. It also includes an option, -ppc-enable-coldcc, to add the coldcc attribute to functions which are cold at all call sites based on BlockFrequencyInfo when the containing function does not call any non cold functions.

    Differential Revision: https://reviews.llvm.org/D38413
    llvm-svn: 323778
* [RS4GC] Handle call/invoke instructions as base defining values of vectors | Daniel Neilson | 2018-01-30 | 1 | -0/+6
    Summary:
    There's an asymmetry in the definitions of findBaseDefiningValueOfVector() and findBaseDefiningValue() of RS4GC. The latter handles call and invoke instructions, and the former does not. This appears to be a simple oversight.

    This patch remedies the oversight by adding the call and invoke cases to findBaseDefiningValueOfVector().

    Reviewers: DaniilSuchkov, anna
    Reviewed By: anna
    Subscribers: llvm-commits
    Differential Revision: https://reviews.llvm.org/D42653
    llvm-svn: 323764
* [DSE] make sure memory is not modified before partial store merging (PR36129) | Sanjay Patel | 2018-01-30 | 1 | -1/+2
    We missed a critical check in D30703. We must make sure that no intermediate store is sitting between the stores that we want to merge.

    This should fix: https://bugs.llvm.org/show_bug.cgi?id=36129

    Differential Revision: https://reviews.llvm.org/D42663
    llvm-svn: 323759
* [JumpThreading][NFC] Rename LoadInst variables | Brian M. Rzycki | 2018-01-29 | 1 | -43/+46
    Summary:
    The JumpThreading pass has several locations where the variable name LI refers to a LoadInst type. This is confusing and inhibits the ability to use LI for LoopInfo as a member of the JumpThreading class. Minor formatting and comments were also altered to reflect this change.

    Reviewers: dberlin, kuba, spop, sebpop
    Reviewed by: sebpop
    Subscribers: sebpop, hiraditya, llvm-commits
    Differential Revision: https://reviews.llvm.org/D42601
    llvm-svn: 323695
* [SLP] Fix for PR32086: Count InsertElementInstr of the same elements as shuffle. | Alexey Bataev | 2018-01-29 | 1 | -133/+382
    Summary:
    If the same value is going to be vectorized several times in the same tree entry, this entry is considered to be a gather entry and the cost of this gather is counted as the cost of InsertElementInstrs for each gathered value. But we can consider these elements as ShuffleInstr with SK_PermuteSingle shuffle kind.

    Reviewers: spatel, RKSimon, mkuper, hfinkel
    Subscribers: llvm-commits
    Differential Revision: https://reviews.llvm.org/D38697
    llvm-svn: 323662
* [ThinLTO] - Stop internalizing and drop non-prevailing symbols. | George Rimar | 2018-01-29 | 1 | -20/+30
    Implementation marks non-prevailing symbols as not live in the summary. They are then dropped in the backends.

    Fixes https://bugs.llvm.org/show_bug.cgi?id=35938

    Differential revision: https://reviews.llvm.org/D42107
    llvm-svn: 323633
* [CVP] Don't Replace incoming values from unreachable blocks with undef. | Davide Italiano | 2018-01-29 | 1 | -24/+4
    This pretty much reverts r322006, except that we keep the test, because we work around the issue exposed in a different way (a recursion limit in value tracking). There's still probably some sequence that exposes this problem, and the proper way to fix that for somebody who has time is outlined in the code review.

    llvm-svn: 323630
* [NFC] fix trivial typos in comments and documents | Hiroshi Inoue | 2018-01-29 | 2 | -2/+2
    "to to" -> "to"

    llvm-svn: 323628
* Revert "[SLP] Fix for PR32086: Count InsertElementInstr of the same elements as shuffle." | Alexey Bataev | 2018-01-27 | 1 | -369/+131
    This reverts commit r323530 to fix possible problems in users' code.

    llvm-svn: 323581
* Revert "[SLP] Removed the warning about unused variable, NFC." | Alexey Bataev | 2018-01-27 | 1 | -1/+1
    This reverts commit r323533 to fix possible problems in users' code.

    llvm-svn: 323580
* [InstrProfiling] Don't exit early when an unused intrinsic is found | Vedant Kumar | 2018-01-27 | 1 | -3/+6
    This fixes a think-o in r323574.

    llvm-svn: 323576
* [InstrProfiling] Improve compile time when there is no work | Vedant Kumar | 2018-01-26 | 1 | -2/+21
    When there are no uses of profiling intrinsics in a module, and there's no coverage data to lower, InstrProfiling has no work to do.

    llvm-svn: 323574
* [InstCombine] Preserve debug values for eliminable casts | Vedant Kumar | 2018-01-26 | 1 | -1/+15
    A cast from A to B is eliminable if its result is cast to C, and if the pair of casts could just be expressed as a single cast. E.g., here, %c1 is eliminable:

        %c1 = zext i16 %A to i32
        %c2 = sext i32 %c1 to i64

    InstCombine optimizes away eliminable casts. This patch teaches it to insert a dbg.value intrinsic pointing to the final result, so that local variables pointing to the eliminable result are preserved.

    Differential Revision: https://reviews.llvm.org/D42566
    llvm-svn: 323570
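As a rough illustration of why the pair of casts above collapses into one, the following sketch compares the two-step form (zext to 32 bits, then sext to 64 bits) with a single zero-extension to 64 bits. The function names are hypothetical and chosen for this example only; the sketch models the arithmetic, not InstCombine itself.

```cpp
#include <cassert>
#include <cstdint>

// Two-step form: zext i16 -> i32, then sext i32 -> i64.
std::int64_t castTwice(std::uint16_t a) {
  std::uint32_t c1 = a; // zext i16 to i32
  // The value is below 2^16, so it is representable as a non-negative int32.
  std::int32_t asSigned = static_cast<std::int32_t>(c1);
  return static_cast<std::int64_t>(asSigned); // sext i32 to i64
}

// Single-cast form: zext i16 -> i64.
std::int64_t castOnce(std::uint16_t a) {
  return static_cast<std::int64_t>(a);
}

// Exhaustively compares both forms over all 16-bit inputs.
bool castsAgree() {
  for (std::uint32_t v = 0; v <= 0xFFFF; ++v) {
    std::uint16_t a = static_cast<std::uint16_t>(v);
    if (castTwice(a) != castOnce(a))
      return false;
  }
  return true;
}
```

Because the zero-extended intermediate never has its sign bit set, the sign extension adds only zero bits, so the intermediate value can disappear while the final result stays the same.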
* [SLP] Removed the warning about unused variable, NFC. | Alexey Bataev | 2018-01-26 | 1 | -1/+1
    llvm-svn: 323533
* [SLP] Fix for PR32086: Count InsertElementInstr of the same elements as shuffle. | Alexey Bataev | 2018-01-26 | 1 | -131/+369
    Summary:
    If the same value is going to be vectorized several times in the same tree entry, this entry is considered to be a gather entry and the cost of this gather is counted as the cost of InsertElementInstrs for each gathered value. But we can consider these elements as ShuffleInstr with SK_PermuteSingle shuffle kind.

    Reviewers: spatel, RKSimon, mkuper, hfinkel
    Subscribers: llvm-commits
    Differential Revision: https://reviews.llvm.org/D38697
    llvm-svn: 323530
* [AMDGPU] fix LDS f32 intrinsics | Daniil Fukalov | 2018-01-26 | 1 | -6/+6
    - using qualified pointer addrspace in intrinsics class to avoid .f32 mangling
    - changed too common atomic mangling to ds
    - added missing intrinsics to AMDGPUTTIImpl::getTgtMemIntrinsic

    Reviewed by: b-sumner
    Differential Revision: https://reviews.llvm.org/D42383
    llvm-svn: 323516
* [CallSiteSplitting] Fix infinite loop when recording conditions. | Florian Hahn | 2018-01-26 | 1 | -1/+2
    Fix infinite loop when recording conditions by correctly marking basic blocks as visited.

    Fixes https://bugs.llvm.org/show_bug.cgi?id=36105

    llvm-svn: 323515
* [NFC] fix trivial typos in comments and documents | Hiroshi Inoue | 2018-01-26 | 3 | -3/+3
    "in in" -> "in", "on on" -> "on" etc.

    llvm-svn: 323508
* [Debug] LCSSA: Insert dbg.value at the first available insertion point | Vedant Kumar | 2018-01-25 | 1 | -1/+3
    Inserting a dbg.value instruction at the start of a basic block with a landingpad instruction triggers a verifier failure. We should be OK if we insert the instruction a bit later.

    Speculative fix for the bot failure described here: https://reviews.llvm.org/D42551

    llvm-svn: 323482
* [SyntheticCounts] Rewrite the code using only graph traits. | Easwaran Raman | 2018-01-25 | 1 | -6/+17
    Summary:
    The intent of this is to allow the code to be used with ThinLTO. In Thinlink phase, a traditional Callgraph can not be computed even though all the necessary information (nodes and edges of a call graph) is available. This is due to the fact that CallGraph class is closely tied to the IR. This patch first extends GraphTraits to add a CallGraphTraits graph. This is then used to implement a version of counts propagation on a generic callgraph.

    Reviewers: davidxl
    Subscribers: mehdi_amini, tejohnson, llvm-commits
    Differential Revision: https://reviews.llvm.org/D42311
    llvm-svn: 323475
* [Debug] Add dbg.value intrinsics for PHIs created during LCSSA. | Vedant Kumar | 2018-01-25 | 1 | -2/+7
    This patch is an enhancement to propagate dbg.value information when Phis are created on behalf of LCSSA. I noticed a case where a value carried across a loop was reported as <optimized out>.

    Specifically this case:

        int bar(int x, int y) {
          return x + y;
        }

        int foo(int size) {
          int val = 0;
          for (int i = 0; i < size; ++i) {
            val = bar(val, i); // Both val and i are correct
          }
          return val; // <optimized out>
        }

    In the above case, after all of the interesting computation completes our value is reported as "optimized out." This change will add a dbg.value to correct this.

    This patch also moves the dbg.value insertion routine from LoopRotation.cpp into Local.cpp, so that we can share it in both places (LoopRotation and LCSSA).

    Patch by Matt Davis!

    Differential Revision: https://reviews.llvm.org/D42551
    llvm-svn: 323472
* [Debug] Add a utility to propagate dbg.value to new PHIs, NFC | Vedant Kumar | 2018-01-25 | 2 | -33/+39
    This simply moves an existing utility to Utils for reuse.

    Split out of: https://reviews.llvm.org/D42551

    Patch by Matt Davis!

    llvm-svn: 323471
* [asan] Fix kernel callback naming in instrumentation module. | Evgeniy Stepanov | 2018-01-25 | 1 | -3/+1
    Right now clang uses "_n" suffix for some user space callbacks and "N" for the matching kernel ones. There's no need for this and it actually breaks kernel build with inline instrumentation. Use the same callback names for user space and the kernel (and also make them consistent with the names GCC uses).

    Patch by Andrey Konovalov.

    Differential Revision: https://reviews.llvm.org/D42423
    llvm-svn: 323470
* Re-land "[ThinLTO] Add call edges' relative block frequency to per-module summary." | Easwaran Raman | 2018-01-25 | 1 | -2/+3
    It was reverted after buildbot regressions.

    Original commit message:
    This allows relative block frequency of call edges to be passed to the thinlink stage where it will be used to compute synthetic entry counts of functions.

    llvm-svn: 323460
* Revert "[SLP] Fix for PR32086: Count InsertElementInstr of the same elements as shuffle." | Alexey Bataev | 2018-01-25 | 1 | -350/+129
    This reverts commit r323441 to fix buildbots.

    llvm-svn: 323447
* [SLP] Fix for PR32086: Count InsertElementInstr of the same elements as shuffle. | Alexey Bataev | 2018-01-25 | 1 | -129/+350
    Summary:
    If the same value is going to be vectorized several times in the same tree entry, this entry is considered to be a gather entry and the cost of this gather is counted as the cost of InsertElementInstrs for each gathered value. But we can consider these elements as ShuffleInstr with SK_PermuteSingle shuffle kind.

    Reviewers: spatel, RKSimon, mkuper, hfinkel
    Subscribers: llvm-commits
    Differential Revision: https://reviews.llvm.org/D38697
    llvm-svn: 323441
* [InstCombine] narrow masked zexted binops (PR35792) | Sanjay Patel | 2018-01-25 | 2 | -0/+71
    This is guarded by shouldChangeType(), so the tests show that we don't do the fold if the narrower type is not legal. Note that there is a proposal (D42424) that would change the results for the specific cases shown in these tests. That difference is also discussed in PR35792: https://bugs.llvm.org/show_bug.cgi?id=35792

    Alive proofs for the cases handled here as well as the bitwise logic binops that we should already do better on:
    https://rise4fun.com/Alive/c97
    https://rise4fun.com/Alive/Lc5E
    https://rise4fun.com/Alive/kdf

    llvm-svn: 323437
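The general idea behind narrowing a masked, zero-extended binop can be checked with a small sketch. It assumes an 8-bit source type, an add with a constant, and a mask that fits in 8 bits; these choices and the function names are assumptions made for illustration, and the exact patterns the patch handles are those covered by the Alive proofs linked above.

```cpp
#include <cassert>
#include <cstdint>

// Wide form: zero-extend to 32 bits, add a constant, then mask.
std::uint32_t wideAddMask(std::uint8_t x, std::uint8_t c, std::uint8_t mask) {
  return (static_cast<std::uint32_t>(x) + c) & mask;
}

// Narrow form: add and mask in 8 bits, then zero-extend the result.
std::uint32_t narrowAddMask(std::uint8_t x, std::uint8_t c, std::uint8_t mask) {
  std::uint8_t sum = static_cast<std::uint8_t>(x + c); // wraps modulo 2^8
  return static_cast<std::uint32_t>(sum & mask);
}

// Exhaustively compares both forms over every 8-bit x for one (c, mask) pair.
bool formsAgree(std::uint8_t c, std::uint8_t mask) {
  for (unsigned v = 0; v <= 0xFF; ++v) {
    std::uint8_t x = static_cast<std::uint8_t>(v);
    if (wideAddMask(x, c, mask) != narrowAddMask(x, c, mask))
      return false;
  }
  return true;
}
```

The two forms agree because the low 8 bits of a sum depend only on the low 8 bits of its operands, and the mask discards everything above them.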
* Revert "[SLP] Fix for PR32086: Count InsertElementInstr of the same elements as shuffle." | Alexey Bataev | 2018-01-25 | 1 | -352/+130
    This reverts commit r323430 to fix buildbots.

    llvm-svn: 323432
* [SLP] Fix for PR32086: Count InsertElementInstr of the same elements as shuffle. | Alexey Bataev | 2018-01-25 | 1 | -130/+352
    Summary:
    If the same value is going to be vectorized several times in the same tree entry, this entry is considered to be a gather entry and the cost of this gather is counted as the cost of InsertElementInstrs for each gathered value. But we can consider these elements as ShuffleInstr with SK_PermuteSingle shuffle kind.

    Reviewers: spatel, RKSimon, mkuper, hfinkel
    Subscribers: llvm-commits
    Differential Revision: https://reviews.llvm.org/D38697
    llvm-svn: 323430
* Another try to commit 323321 (aggressive instruction combine). | Amjad Aboud | 2018-01-25 | 10 | -3/+674
    llvm-svn: 323416
* [GlobalOpt] Emit fragments using field offsets from struct layout | Mikael Holmen | 2018-01-25 | 1 | -4/+2
    Summary:
    When creating the debug fragments for a SRA'd struct, use the fields' offsets, taken from the struct layout, as the offsets for the resulting fragments.

    This fixes an issue where GlobalOpt would emit fragments with incorrect offsets for padded fields.

    This should solve PR36016.

    Patch by David Stenberg.

    Reviewers: aprantl
    Reviewed By: aprantl
    Subscribers: llvm-commits
    Differential Revision: https://reviews.llvm.org/D42489
    llvm-svn: 323411
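Why padded fields need offsets from the layout, rather than a running sum of field sizes, can be shown with a plain C++ struct. This is only an analogy under the assumption of a typical ABI where int is more strictly aligned than char; `Padded` and the two helper functions are hypothetical names, and the sketch does not use LLVM's StructLayout.

```cpp
#include <cassert>
#include <cstddef>

struct Padded {
  char tag;  // 1 byte
  int value; // aligned to alignof(int), so padding may precede it
};

// Offset obtained by summing the sizes of the preceding fields, ignoring padding.
constexpr std::size_t naiveOffsetOfValue() { return sizeof(char); }

// Offset according to the actual struct layout chosen by the compiler.
constexpr std::size_t layoutOffsetOfValue() { return offsetof(Padded, value); }
```

On common platforms `layoutOffsetOfValue()` is 4 while `naiveOffsetOfValue()` is 1, which is the kind of mismatch that yields wrong fragment offsets when padding is ignored.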
* Revert "[SLP] Fix for PR32086: Count InsertElementInstr of the same elements as shuffle." | Alexey Bataev | 2018-01-24 | 1 | -346/+122
    This reverts commit r323348 because of the broken buildbots.

    llvm-svn: 323359
* Revert r321751, "StructurizeCFG: Fix broken backedge detection" | Nicolai Haehnle | 2018-01-24 | 1 | -28/+82
    It causes regressions in various OpenGL test suites.

    Keep the test cases introduced by r321751 as XFAIL, and add a test case for the regression.

    Change-Id: I90b4cc354f68cebe5fcef1f2422dc8fe1c6d3514
    Bugzilla: https://bugs.llvm.org/show_bug.cgi?id=36015
    llvm-svn: 323355
* [SLP] Fix for PR32086: Count InsertElementInstr of the same elements as shuffle. | Alexey Bataev | 2018-01-24 | 1 | -122/+346
    Summary:
    If the same value is going to be vectorized several times in the same tree entry, this entry is considered to be a gather entry and the cost of this gather is counted as the cost of InsertElementInstrs for each gathered value. But we can consider these elements as ShuffleInstr with SK_PermuteSingle shuffle kind.

    Reviewers: spatel, RKSimon, mkuper, hfinkel
    Subscribers: llvm-commits
    Differential Revision: https://reviews.llvm.org/D38697
    llvm-svn: 323348
* Reverted 323321. | Amjad Aboud | 2018-01-24 | 10 | -671/+3
    llvm-svn: 323326