summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Transforms
Commit message (Collapse)AuthorAgeFilesLines
...
* [LoopVectorize] Don't preserve nsw/nuw flags on shrunken ops.George Burgess IV2017-06-091-1/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If we're shrinking a binary operation, it may be the case that the new operations wraps where the old didn't. If this happens, the behavior should be well-defined. So, we can't always carry wrapping flags with us when we shrink operations. If we do, we get incorrect optimizations in cases like: void foo(const unsigned char *from, unsigned char *to, int n) { for (int i = 0; i < n; i++) to[i] = from[i] - 128; } which gets optimized to: void foo(const unsigned char *from, unsigned char *to, int n) { for (int i = 0; i < n; i++) to[i] = from[i] | 128; } Because: - InstCombine turned `sub i32 %from.i, 128` into `add nuw nsw i32 %from.i, 128`. - LoopVectorize vectorized the add to be `add nuw nsw <16 x i8>` with a vector full of `i8 128`s - InstCombine took advantage of the fact that the newly-shrunken add "couldn't wrap", and changed the `add` to an `or`. InstCombine seems happy to figure out whether we can add nuw/nsw on its own, so I just decided to drop the flags. There are already a number of places in LoopVectorize where we rely on InstCombine to clean up. llvm-svn: 305053
* Inliner: Don't touch indirect callsDavid Blaikie2017-06-091-4/+4
| | | | | | | | | Other comments/implications are that this isn't intended behavior (nor perserved/reimplemented in the new inliner) & complicates fixing the 'inlining' of trivially dead calls without consulting the cost function first. llvm-svn: 305052
* [InstCombine] Pass a proper context instruction to all of the calls into ↵Craig Topper2017-06-099-45/+66
| | | | | | | | | | | | | | | | InstSimplify Summary: This matches the behavior we already had for compares and makes us consistent everywhere. Reviewers: dberlin, hfinkel, spatel Reviewed By: dberlin Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D33604 llvm-svn: 305049
* [CFI] Remove LinkerSubsectionsViaSymbols.Evgeniy Stepanov2017-06-081-23/+12
| | | | | | | | | | | | Since D17854 LinkerSubsectionsViaSymbols is unnecessary. It is interfering with ThinLTO implementation of CFI-ICall, where the aliases used on the !LinkerSubsectionsViaSymbols branch are needed to export jump tables to ThinLTO backends. This is the second attempt to land this change after fixing PR33316. llvm-svn: 305031
* [ExtractGV] Fix the doxygen comment on the constructor and the class to ↵Craig Topper2017-06-081-6/+6
| | | | | | refer to global values instead of functions. While there fix an 80 column violation. NFC llvm-svn: 305030
* Write summaries for merged modules when splitting modules for ThinLTO.Peter Collingbourne2017-06-081-2/+10
| | | | | | | | | This is to prepare to allow for dead stripping of globals in the merged modules. Differential Revision: https://reviews.llvm.org/D33921 llvm-svn: 305027
* [sanitizer-coverage] one more flavor of coverage: ↵Kostya Serebryany2017-06-081-9/+43
| | | | | | -fsanitize-coverage=inline-8bit-counters. Experimental so far, not documenting yet. Reapplying revisions 304630, 304631, 304632, 304673, see PR33308 llvm-svn: 305026
* Do not early-inline recursive calls in sample profile loader.Dehao Chen2017-06-081-0/+7
| | | | | | | | | | | | | | Summary: Early-inlining of recursive call makes the code size bloat exponentially. We should not disable it. Reviewers: davidxl, dnovillo, iteratee Reviewed By: iteratee Subscribers: iteratee, llvm-commits, sanjoy Differential Revision: https://reviews.llvm.org/D34017 llvm-svn: 305009
* Changed a comparison operator for std::stable_sort to implement strict weak ↵Galina Kistanova2017-06-081-3/+3
| | | | | | | | | ordering. This is a temporarily fix which needs additional work, as it triggers a test3 failure. test3 is commented out till then. llvm-svn: 304993
* InferAddressSpaces: Avoid assertion failure with replacing identicalNirav Dave2017-06-081-0/+7
| | | | | | | | | | | | | | | cloned constexpr Have cloneConstantExprWithNewAddressSpaces return nullptr when returning initial ConstantExpr. Reviewers: arsenm Subscribers: jholewinski, wdng, llvm-commits Differential Revision: https://reviews.llvm.org/D33995 llvm-svn: 304975
* [BPI] Don't assume that strcmp returning >0 is more likely than <0John Brawn2017-06-081-2/+2
| | | | | | | | | | | | | | | | | | | | The zero heuristic assumes that integers are more likely positive than negative, but this also has the effect of assuming that strcmp return values are more likely positive than negative. Given that for nonzero strcmp return values it's the ordering of arguments that determines the sign of the result there's no reason to assume that's true. Fix this by inspecting the LHS of the compare and using TargetLibraryInfo to decide if it's strcmp-like, and if so only assume that nonzero is more likely than zero i.e. strings are more often different than the same. This causes a slight code generation change in the spec2006 benchmark 403.gcc, but with no noticeable performance impact. The intent of this patch is to allow better optimisation of dhrystone on Cortex-M cpus, but currently it won't as there are also some changes that need to be made to if-conversion. Differential Revision: https://reviews.llvm.org/D33934 llvm-svn: 304970
* [InstCombine] fold lshr (sext X), C1 --> zext (lshr X, C2)Sanjay Patel2017-06-071-0/+19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | This was discussed in D33338. We have larger pattern-matching ending in a truncate that we can reduce or remove by handling these smaller patterns first. Further motivation is that narrower shift ops are easier for value tracking and zext is better than sext. http://rise4fun.com/Alive/rhh Name: boolshift %sext = sext i1 %x to i8 %r = lshr i8 %sext, 7 => %r = zext i1 %x to i8 Name: noboolshift %sext = sext i3 %x to i8 %r = lshr i8 %sext, 7 => %sh = lshr i3 %x, 2 %r = zext i3 %sh to i8 Differential Revision: https://reviews.llvm.org/D33879 llvm-svn: 304939
* Fix builin_expect lowering bugXinliang David Li2017-06-071-1/+3
| | | | | | | | PR33346 Skip cases when expected value is not constant int. llvm-svn: 304933
* LowerTypeTests: Generate simpler IR for br(llvm.type.test, then, else).Peter Collingbourne2017-06-071-2/+19
| | | | | | | | | | | | | This makes it so that the code quality for CFI checks when compiling with -O2 and linking with --lto-O0 is similar to that of the rest of the code. Reduces the size of a chrome binary built with -O2/--lto-O0 by about 750KB. Differential Revision: https://reviews.llvm.org/D33925 llvm-svn: 304921
* [InstCombine][InstSimplify] Use APInt::isNullValue/isOneValue to reduce ↵Craig Topper2017-06-078-57/+63
| | | | | | | | compiled code for comparing APInts with 0 and 1. NFC These methods are specifically optimized to only counting leading zeros without an additional uint64_t compare. llvm-svn: 304876
* [InstCombine] Fix two asserts that were accidentally checking that an APInt ↵Craig Topper2017-06-071-2/+2
| | | | | | | | pointer is non-zero instead of checking that the APInt self is non-zero. I believe this code used to use APInt references which would have worked. But then they were changed to pointers to allow m_APInt to be used. llvm-svn: 304875
* Move Object format code to lib/BinaryFormat.Zachary Turner2017-06-071-1/+1
| | | | | | | | | | | | This creates a new library called BinaryFormat that has all of the headers from llvm/Support containing structure and layout definitions for various types of binary formats like dwarf, coff, elf, etc as well as the code for identifying a file from its magic. Differential Revision: https://reviews.llvm.org/D33843 llvm-svn: 304864
* Fix PR23384 (part 3 of 3)Evgeny Stupachenko2017-06-061-1/+1
| | | | | | | | | | | | | Summary: The patch makes instruction count the highest priority for LSR solution for X86 (previously registers had highest priority). Reviewers: qcolombet Differential Revision: http://reviews.llvm.org/D30562 From: Evgeny Stupachenko <evstupac@gmail.com> llvm-svn: 304824
* NewGVN: Fix PR/33187. This is a bug caused by two things:Daniel Berlin2017-06-061-32/+48
| | | | | | | | | | | | | 1. When there is no perfect iteration order, we can't let phi nodes put themselves in terms of things that come later in the iteration order, or we will endlessly cycle (the normal RPO algorithm clears the hashtable to avoid this issue). 2. We are sometimes erasing the wrong expression (causing pessimism) because our equality says loads and stores are the same. We introduce an exact equality function and use it when erasing to make sure we erase only identical expressions, not equivalent ones. llvm-svn: 304807
* [Atomics][LoopIdiom] Recognize unordered atomic memcpyAnna Thomas2017-06-061-15/+62
| | | | | | | | | | | | | | | | | | | | | | Summary: Expanding the loop idiom test for memcpy to also recognize unordered atomic memcpy. The only difference for recognizing an unordered atomic memcpy and instead of a normal memcpy is that the loads and/or stores involved are unordered atomic operations. Background: http://lists.llvm.org/pipermail/llvm-dev/2017-May/112779.html Patch by Daniel Neilson! Reviewers: reames, anna, skatkov Reviewed By: reames, anna Subscribers: llvm-commits, mzolotukhin Differential Revision: https://reviews.llvm.org/D33243 llvm-svn: 304806
* [IRCE] Canonicalize pre/post loops after the blocks are added into parent loopAnna Thomas2017-06-061-13/+20
| | | | | | | | | | | | | | | | | | | | | Summary: We were canonizalizing the pre loop (into loop-simplify form) before the post loop blocks were added into parent loop. This is incorrect when IRCE is done on a subloop. The post-loop blocks are created, but not yet added to the parent loop. So, loop-simplification on the pre-loop incorrectly updates LoopInfo. This patch corrects the ordering so that pre and post loop blocks are added to parent loop (if any), and then the loops are canonicalized to LCSSA and LoopSimplifyForm. Reviewers: reames, sanjoy, apilipenko Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D33846 llvm-svn: 304800
* Sort the remaining #include lines in include/... and lib/....Chandler Carruth2017-06-0678-124/+121
| | | | | | | | | | | | | | | | | | | | | | | | | I did this a long time ago with a janky python script, but now clang-format has built-in support for this. I fed clang-format every line with a #include and let it re-sort things according to the precise LLVM rules for include ordering baked into clang-format these days. I've reverted a number of files where the results of sorting includes isn't healthy. Either places where we have legacy code relying on particular include ordering (where possible, I'll fix these separately) or where we have particular formatting around #include lines that I didn't want to disturb in this patch. This patch is *entirely* mechanical. If you get merge conflicts or anything, just ignore the changes in this patch and run clang-format over your #include lines in the files. Sorry for any noise here, but it is important to keep these things stable. I was seeing an increasing number of patches with irrelevant re-ordering of #include lines because clang-format was used. This patch at least isolates that churn, makes it easy to skip when resolving conflicts, and gets us to a clean baseline (again). llvm-svn: 304787
* Add a dominanance check interface that uses caching for instructions within ↵Xin Tong2017-06-062-0/+34
| | | | | | | | | | | | | | | | | | | | | | | | same basic block. Summary: This problem stems from the fact that instructions are allocated using new in LLVM, i.e. there is no relationship that can be derived by just looking at the pointer value. This interface dispatches to appropriate dominance check given 2 instructions, i.e. in case the instructions are in the same basic block, ordered basicblock (with instruction numbering and caching) are used. Otherwise, dominator tree is used. This is a preparation patch for https://reviews.llvm.org/D32720 Reviewers: dberlin, hfinkel, davide Subscribers: davide, mgorny, llvm-commits Differential Revision: https://reviews.llvm.org/D33380 llvm-svn: 304764
* Fix PR23384 (part 2 of 3) NFCEvgeny Stupachenko2017-06-051-72/+68
| | | | | | | | | | | | Summary: The patch moves LSR cost comparison to target part. Reviewers: qcolombet Differential Revision: http://reviews.llvm.org/D30561 From: Evgeny Stupachenko <evstupac@gmail.com> llvm-svn: 304750
* LSR: Calculate instruction cost only if InsnsCost is set to true (NFC)Evgeny Stupachenko2017-06-051-14/+21
| | | | | | | | | | | | | | Summary: The patch guard all instruction cost calculations with InsnCosts (-lsr-insns-cost) option. Currently even if the option set to false we calculate and print (in debug mode) instruction costs. Reviewers: qcolombet Differential Revision: http://reviews.llvm.org/D33914 From: Evgeny Stupachenko <evstupac@gmail.com> llvm-svn: 304746
* [InstCombine] Fix extractelement use before defSven van Haastregt2017-06-051-1/+1
| | | | | | | | | | | | This fixes a bug that can cause extractelements with operands that haven't been defined yet to be inserted at a wrong point when optimising insertelements. Patch by Karl Hylen. Differential Revision: https://reviews.llvm.org/D33449 llvm-svn: 304701
* Revert "[sanitizer-coverage] one more flavor of coverage: ↵Renato Golin2017-06-051-43/+9
| | | | | | | | -fsanitize-coverage=inline-8bit-counters. Experimental so far, not documenting yet." This reverts commit r304630, as it broke ARM/AArch64 bots for 2 days. llvm-svn: 304698
* [LV] Make scalarizeInstruction() non-virtual. NFC.Ayal Zaks2017-06-041-2/+1
| | | | | | | | | | Following the request made in https://reviews.llvm.org/D32871, scalarizeInstruction() which is no longer overridden by InnerLoopUnroller is hereby made non-virtual in InnerLoopVectorizer. Should have been part of r297580 originally. llvm-svn: 304685
* [InstCombine] Add support for simplifying ctlz/cttz intrinsics based on ↵Craig Topper2017-06-031-5/+1
| | | | | | known bits. llvm-svn: 304669
* Added LLVM_FALLTHROUGH to address warning: this statement may fall through. NFC.Galina Kistanova2017-06-031-0/+1
| | | | llvm-svn: 304638
* Added LLVM_FALLTHROUGH to address warning: this statement may fall through. NFC.Galina Kistanova2017-06-031-0/+1
| | | | llvm-svn: 304637
* Added LLVM_FALLTHROUGH to address warning: this statement may fall through. NFC.Galina Kistanova2017-06-032-0/+3
| | | | llvm-svn: 304636
* [sanitizer-coverage] one more flavor of coverage: ↵Kostya Serebryany2017-06-031-9/+43
| | | | | | -fsanitize-coverage=inline-8bit-counters. Experimental so far, not documenting yet. llvm-svn: 304630
* Revert "[CFI] Remove LinkerSubsectionsViaSymbols."Evgeniy Stepanov2017-06-031-12/+23
| | | | | | This reverts commit r304582: breaks cfi-devirt :: anon-namespace.cpp on Darwin. llvm-svn: 304626
* [SLP] Improve comments and naming of functions/variables/members, NFC.Alexey Bataev2017-06-031-91/+59
| | | | | | | | | Fixed some comments, added an additional description of the algorithms, improved readability of the code. Differential revision: https://reviews.llvm.org/D33320 llvm-svn: 304616
* [sanitizer-coverage] refactor the code to make it easier to add more ↵Kostya Serebryany2017-06-021-55/+74
| | | | | | sections in future. NFC llvm-svn: 304610
* Revert "[SLP] Improve comments and naming of functions/variables/members, NFC."Alexey Bataev2017-06-021-59/+91
| | | | | | This reverts commit 6e311de8b907aa20da9a1a13ab07c3ce2ef4068a. llvm-svn: 304609
* [Statepoint] Be consistent about using deopt naming [NFCI]Philip Reames2017-06-021-3/+3
| | | | | | We'd called this "vm state" in the early days, but have long since standardized on calling it "deopt" in line with the operand bundle tag. Fix a few cases we'd missed. llvm-svn: 304607
* Fix debug build test failureXinliang David Li2017-06-021-2/+3
| | | | llvm-svn: 304600
* [PartialInlining] Minor cost anaysis tuningXinliang David Li2017-06-021-9/+56
| | | | | | Also added a test option and 2 cost analysis related tests. llvm-svn: 304599
* FunctionAttrs: Skip it if the effective SCC (ignoring optnone functions) is ↵David Blaikie2017-06-021-0/+4
| | | | | | | | | empty Minor optimization but mostly simplifies my debugging so I'm not dealing with empty SCCNodeSets while investigating issues in this optimization. llvm-svn: 304597
* [SLP] Improve comments and naming of functions/variables/members, NFC.Alexey Bataev2017-06-021-91/+59
| | | | | | | | | | | | | | Summary: Fixed some comments, added an additional description of the algorithms, improved readability of the code. Reviewers: anemet Subscribers: llvm-commits, mzolotukhin Differential Revision: https://reviews.llvm.org/D33320 llvm-svn: 304593
* [SROA] Fix crash due to bad bitcastKeno Fischer2017-06-021-3/+4
| | | | | | | | | | | | | | Summary: As shown in the test case, SROA was crashing when trying to split stores (to the alloca) of loads (from anywhere), because it assumed the pointer operand to the loads and stores had to have the same address space. This isn't the case. Make sure to use the correct pointer type for both the load and the store. Reviewed By: yaxunl Differential Revision: https://reviews.llvm.org/D32593 llvm-svn: 304585
* [CFI] Remove LinkerSubsectionsViaSymbols.Evgeniy Stepanov2017-06-021-23/+12
| | | | | | | | | | Since D17854 LinkerSubsectionsViaSymbols is unnecessary. It is interfering with ThinLTO implementation of CFI-ICall, where the aliases used on the !LinkerSubsectionsViaSymbols branch are needed to export jump tables to ThinLTO backends. llvm-svn: 304582
* Skip CFI for dead functions.Evgeniy Stepanov2017-06-021-2/+1
| | | | | | Differential Revision: https://reviews.llvm.org/D33805 llvm-svn: 304578
* [InstCombine] fix icmp with not op and constant to work with splat vector ↵Sanjay Patel2017-06-021-3/+3
| | | | | | constant llvm-svn: 304562
* [InstCombine] improve perf by not creating a known non-canonical instructionSanjay Patel2017-06-021-3/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Op1 (RHS) is a constant, so putting it on the LHS makes us churn through visitICmp an extra time to canonicalize it: INSTCOMBINE ITERATION #1 on cmpnot IC: ADDING: 3 instrs to worklist IC: Visiting: %notx = xor i8 %x, -1 IC: Visiting: %cmp = icmp sgt i8 %notx, 42 IC: Old = %cmp = icmp sgt i8 %notx, 42 New = <badref> = icmp sgt i8 -43, %x IC: ADD: %cmp = icmp sgt i8 -43, %x IC: ERASE %1 = icmp sgt i8 %notx, 42 IC: ADD: %notx = xor i8 %x, -1 IC: DCE: %notx = xor i8 %x, -1 IC: ERASE %notx = xor i8 %x, -1 IC: Visiting: %cmp = icmp sgt i8 -43, %x IC: Mod = %cmp = icmp sgt i8 -43, %x New = %cmp = icmp slt i8 %x, -43 IC: ADD: %cmp = icmp slt i8 %x, -43 IC: Visiting: %cmp = icmp slt i8 %x, -43 IC: Visiting: ret i1 %cmp If we create the swapped ICmp directly, we go faster: INSTCOMBINE ITERATION #1 on cmpnot IC: ADDING: 3 instrs to worklist IC: Visiting: %notx = xor i8 %x, -1 IC: Visiting: %cmp = icmp sgt i8 %notx, 42 IC: Old = %cmp = icmp sgt i8 %notx, 42 New = <badref> = icmp slt i8 %x, -43 IC: ADD: %cmp = icmp slt i8 %x, -43 IC: ERASE %1 = icmp sgt i8 %notx, 42 IC: ADD: %notx = xor i8 %x, -1 IC: DCE: %notx = xor i8 %x, -1 IC: ERASE %notx = xor i8 %x, -1 IC: Visiting: %cmp = icmp slt i8 %x, -43 IC: Visiting: ret i1 %cmp llvm-svn: 304558
* [coroutines] PR33271: Remove stray coro.save intrinsics during CoroSplitGor Nishanov2017-06-021-0/+12
| | | | | | | | | | | | | | | | | | | | | | | Summary: Optimization passes may remove llvm.coro.suspend intrinsic while leaving matching llvm.coro.save intrinsic orphaned. Make sure we clean up orphaned coro.saves. The bug manifested with a crash similar to this: ``` llvm_unreachable("Unknown type!"); llvm::MVT::getVT (Ty=0x489518, HandleUnknown=false) llvm::EVT::getEVT llvm::TargetLoweringBase::getValueType llvm::ComputeValueVTs llvm::SelectionDAGBuilder::visitTargetIntrinsic ``` Reviewers: GorNishanov Subscribers: EricWF, llvm-commits Differential Revision: https://reviews.llvm.org/D33817 llvm-svn: 304518
* [Profile] Enhance expect lowering to handle correlated branchesXinliang David Li2017-06-021-0/+148
| | | | | | | | | builtin_expect applied on && or || expressions were not handled properly before. With this patch, the problem is fixed. Differential Revision: http://reviews.llvm.org/D33164 llvm-svn: 304517
* [RS4GC] Comment clarificationPhilip Reames2017-06-021-2/+2
| | | | llvm-svn: 304514
OpenPOWER on IntegriCloud