path: root/llvm/test/Transforms
Commit message (Author, Age, Files, Lines)
* rename instcombine test file. NFC (Anna Thomas, 2017-03-28, 1 file, -0/+0)
    llvm-svn: 298904
* [LV] Transform truncations of non-primary induction variables (Matthew Simpson, 2017-03-27, 1 file, -0/+45)
    The vectorizer tries to replace truncations of induction variables with new
    induction variables having the smaller type. After r295063, this optimization
    was applied to all integer induction variables, including non-primary ones.
    When optimizing the truncation of a non-primary induction variable, we still
    need to transform the new induction so that it has the correct start value.
    This should fix PR32419.

    Reference: https://bugs.llvm.org/show_bug.cgi?id=32419

    llvm-svn: 298882
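    A minimal IR sketch of the situation (illustrative only; names, types, and
    constants are invented): %i is the primary induction controlling the loop,
    while %j is a non-primary induction whose start value must carry over to
    the new, narrower induction created for the truncation.

        loop:
          %i = phi i64 [ 0, %entry ], [ %i.next, %loop ]  ; primary induction
          %j = phi i64 [ 5, %entry ], [ %j.next, %loop ]  ; non-primary induction
          %t = trunc i64 %j to i32   ; replaced by a new i32 induction that must
                                     ; start at 5, which is what this fix ensures
          %i.next = add nuw nsw i64 %i, 1
          %j.next = add nuw nsw i64 %j, 1
          %cmp = icmp ult i64 %i.next, %n
          br i1 %cmp, label %loop, label %exit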
* [InstCombine] Avoid incorrect folding of select into phi nodes when incoming element is a vector type (Anna Thomas, 2017-03-27, 1 file, -0/+38)
    Summary:
    We are incorrectly folding selects into phi nodes when the incoming value of
    a phi node is a constant vector. This optimization is done in `FoldOpIntoPhi`
    when the select condition is a phi node with constant incoming values.
    Without the fix, we are miscompiling (i.e. incorrectly folding the select
    into the phi node) when the vector contains non-zero elements.
    This patch fixes the miscompile and we will correctly fold based on the
    select vector operand (see added test cases).

    Reviewers: majnemer, sanjoy, spatel

    Subscribers: llvm-commits

    Differential Revision: https://reviews.llvm.org/D31189

    llvm-svn: 298845
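    A hedged sketch of the problem pattern (operands invented): the select
    condition is a phi whose incoming values are constant vectors, and treating
    such a vector like a scalar condition picks one whole operand instead of
    blending per element.

        %c = phi <2 x i1> [ <i1 true, i1 false>, %a ], [ <i1 false, i1 true>, %b ]
        %s = select <2 x i1> %c, <2 x i8> <i8 1, i8 1>, <2 x i8> <i8 2, i8 2>
        ; element-wise semantics give <1, 2> on the edge from %a; a fold that
        ; treats the non-zero constant vector as a scalar "true" would wrongly
        ; produce <1, 1>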
* [LoopUnroll] Remap references in peeled iteration (Serge Pavlov, 2017-03-26, 1 file, -0/+61)
    References in cloned blocks must be remapped prior to dominator calculation.

    Differential Revision: https://reviews.llvm.org/D31281

    llvm-svn: 298811
* Split the SimplifyCFG pass into two variants. (Joerg Sonnenberger, 2017-03-26, 6 files, -12/+13)
    The first variant contains all current transformations except transforming
    switches into lookup tables. The second variant contains all current
    transformations.

    The switch-to-lookup-table conversion results in code that is more difficult
    to analyze and optimize by other passes. Most importantly, it can inhibit
    Dead Code Elimination. As such it is often beneficial to only apply this
    transformation very late. A common example is inlining, which can often
    result in range restrictions for the switch expression.

    Changes in execution time according to LNT:
    SingleSource/Benchmarks/Misc/fp-convert                  +3.03%
    MultiSource/Benchmarks/ASC_Sequoia/CrystalMk/CrystalMk  -11.20%
    MultiSource/Benchmarks/Olden/perimeter/perimeter        -10.43%
    and a couple of smaller changes. For perimeter it also results in a 2.6%
    smaller binary.

    Differential Revision: https://reviews.llvm.org/D30333

    llvm-svn: 298799
* [IR] Make SwitchInst::CaseIt almost a normal iterator. (Chandler Carruth, 2017-03-26, 1 file, -2/+2)
    This moves it to the iterator facade utilities, giving it full random access
    semantics, etc. It can also now be used with standard algorithms like
    std::all_of and std::any_of and range adaptors like llvm::reverse.

    Also make the semantics of iterating match what every other iterator uses
    and forbid decrementing past the begin iterator. This was used as a hacky
    way to work around iterator invalidation. However, every instance trying to
    do this failed to actually avoid touching invalid iterators despite the
    clear documentation that the removed and all subsequent iterators become
    invalid, including the end iterator. So I've added a return of the next
    iterator to removeCase and rewritten the loops that were doing this to
    correctly follow the iterator pattern of either incrementing or removing
    and assigning fresh values to the iterator and the end.

    In one case we were trying to go backwards to make this cleaner but it
    doesn't actually work. I've made that code match the code we use everywhere
    else to remove cases as we iterate.

    This changes the order of cases in one test output and I moved that test to
    CHECK-DAG so it wouldn't care -- the order isn't semantically meaningful
    anyway.

    llvm-svn: 298791
* Change the default attributes for llvm.prefetch to inaccessiblemem_or_argmemonly (Eric Christopher, 2017-03-25, 1 file, -0/+34)
    so that we can perform some optimizations across it. Fixes PR32365.

    llvm-svn: 298781
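    For reference, a sketch of a declaration carrying the new attribute
    (the attribute-group numbering is invented; the four arguments are the
    address, rw, locality, and cache-type operands):

        declare void @llvm.prefetch(i8*, i32, i32, i32) #0
        attributes #0 = { inaccessiblemem_or_argmemonly nounwind }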
* Revert r298620: [LV] Vectorize GEPs (Ivan Krasin, 2017-03-24, 3 files, -104/+107)
    Reason: breaks linking Chromium with LLD + ThinLTO (a pass crashes)
    LLVM bug: https://bugs.llvm.org//show_bug.cgi?id=32413

    Original change description:

    [LV] Vectorize GEPs

    This patch adds support for vectorizing GEPs. Previously, we only generated
    vector GEPs on-demand when creating gather or scatter operations. All GEPs
    from the original loop were scalarized by default, and if a pointer was to
    be stored to memory, we would have to build up the pointer vector with
    insertelement instructions.

    With this patch, we will vectorize all GEPs that haven't already been marked
    for scalarization.

    The patch refines collectLoopScalars to more exactly identify the scalar
    GEPs. The function now more closely resembles collectLoopUniforms. And the
    patch moves vector GEP creation out of vectorizeMemoryInstruction and into
    the main vectorization loop. The vector GEPs needed for gather and scatter
    operations will have already been generated before vectorizing the memory
    accesses.

    Original Differential Revision: https://reviews.llvm.org/D30710

    llvm-svn: 298735
* AMDGPU: Fold rcp/rsq of undef to undef (Matt Arsenault, 2017-03-24, 1 file, -0/+18)
    llvm-svn: 298725
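    The fold in miniature (assuming the amdgcn reciprocal intrinsics are the
    ones involved):

        %r = call float @llvm.amdgcn.rcp.f32(float undef)  ; folds to: float undef
        %s = call float @llvm.amdgcn.rsq.f32(float undef)  ; folds to: float undef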
* [ThinLTO] Correct counting of functions in inliner stats (Teresa Johnson, 2017-03-24, 1 file, -0/+3)
    Summary: Declarations need to be filtered out when counting functions.

    Reviewers: eraman

    Subscribers: Prazek, llvm-commits

    Differential Revision: https://reviews.llvm.org/D31336

    llvm-svn: 298720
* NewGVN: Fix PR32403 - Handling of undef in phis was not quite correct (Daniel Berlin, 2017-03-24, 1 file, -0/+65)
    due to LLVM's view of phi nodes. It would cause NewGVN not to fixpoint in
    some interesting edge cases.

    llvm-svn: 298687
* Set the prof weight correctly for call instructions in DeadArgumentElimination. (Dehao Chen, 2017-03-23, 1 file, -0/+22)
    Summary: In DeadArgumentElimination, the call instructions will be replaced.
    We also need to set the prof weights so that function inlining can find the
    correct profile.

    Reviewers: eraman

    Reviewed By: eraman

    Subscribers: llvm-commits

    Differential Revision: https://reviews.llvm.org/D31143

    llvm-svn: 298660
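    A hedged sketch of the carry-over (function names and counts invented):
    when the call is rewritten to target the argument-pruned function, the
    !prof execution count must move to the new instruction.

        %r = call i32 @f(i32 %x, i32 %dead), !prof !0
        ; after DeadArgumentElimination rewrites the call:
        %r2 = call i32 @f.pruned(i32 %x), !prof !0
        !0 = !{!"branch_weights", i32 1000}  ; call count preserved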
* [MetaRenamer] Don't rename library functions. (Bryant Wong, 2017-03-23, 1 file, -0/+15)
    Library functions can have specific semantics that affect the behavior of
    certain passes. DSE, for instance, gives special treatment to malloc-ed
    pointers but not to pointers returned from an equivalently typed (but
    differently named) function. MetaRenamer ought not to alter program
    semantics, so library functions must remain untouched.

    Reviewers: mehdi_amini, majnemer, chandlerc, davide

    Reviewed By: davide

    Subscribers: davide, llvm-commits

    Differential Revision: https://reviews.llvm.org/D31304

    llvm-svn: 298659
* Use isFunctionHotInCallGraph to set the function section prefix. (Dehao Chen, 2017-03-23, 1 file, -0/+22)
    Summary: The current prefix-based function layout algorithm only looks at a
    function's entry count, which is not sufficient. A function should be
    grouped with the hot functions if its entry count or any of its call edge
    counts is hot.

    Reviewers: davidxl, eraman

    Reviewed By: eraman

    Subscribers: llvm-commits

    Differential Revision: https://reviews.llvm.org/D31225

    llvm-svn: 298656
* [LV] Add regression test for r297610 (Gil Rapaport, 2017-03-23, 1 file, -0/+36)
    The new test asserts that scalarized memory operations get memcheck metadata
    added even if the loop is only unrolled.

    Differential Revision: https://reviews.llvm.org/D30972

    llvm-svn: 298641
* [ThinLTO] Add support for emitting minimized bitcode for thin link (Teresa Johnson, 2017-03-23, 3 files, -11/+61)
    Summary:
    The cumulative size of the bitcode files for a very large application can be
    huge, particularly with -g. In a distributed build environment, all of these
    files must be sent to the remote build node that performs the thin link
    step, and this can exceed size limits.

    The thin link actually only needs the summary along with a bitcode symbol
    table. Until we have a proper bitcode symbol table, simply stripping the
    debug metadata results in significant size reduction.

    Add support for an option to additionally emit minimized bitcode modules,
    just for use in the thin link step, which for now just strips all debug
    metadata. I plan to add a cc1 option so this can be invoked easily during
    the compile step.

    However, care must be taken to ensure that these minimized thin link bitcode
    files produce the same index as with the original bitcode files, as these
    original bitcode files will be used in the backends.

    Specifically:
    1) The module hash used for caching is typically produced by hashing the
       written bitcode, and we want to include the hash that would correspond
       to the original bitcode file. This is because we want to ensure that
       changes in the stripped portions affect caching. Added plumbing to emit
       the same module hash in the minimized thin link bitcode file.
    2) The module paths in the index are constructed from the module ID of each
       thin linked bitcode, and typically are automatically generated from the
       input file path. This is the path used for finding the modules to import
       from, and obviously we need this to point to the original bitcode files.
       Added gold-plugin support to take a suffix replacement during the thin
       link that is used to override the identifier on the MemoryBufferRef
       constructed from the loaded thin link bitcode file. The assumption is
       that the build system can specify that the minimized bitcode file has a
       name that is similar but uses a different suffix (e.g. out.thinlink.bc
       instead of out.o).

    Added various tests to ensure that we get identical index files out of the
    thin link step.

    Reviewers: mehdi_amini, pcc

    Subscribers: Prazek, llvm-commits

    Differential Revision: https://reviews.llvm.org/D31027

    llvm-svn: 298638
* [LV] Vectorize GEPs (Matthew Simpson, 2017-03-23, 3 files, -107/+104)
    This patch adds support for vectorizing GEPs. Previously, we only generated
    vector GEPs on-demand when creating gather or scatter operations. All GEPs
    from the original loop were scalarized by default, and if a pointer was to
    be stored to memory, we would have to build up the pointer vector with
    insertelement instructions.

    With this patch, we will vectorize all GEPs that haven't already been marked
    for scalarization.

    The patch refines collectLoopScalars to more exactly identify the scalar
    GEPs. The function now more closely resembles collectLoopUniforms. And the
    patch moves vector GEP creation out of vectorizeMemoryInstruction and into
    the main vectorization loop. The vector GEPs needed for gather and scatter
    operations will have already been generated before vectorizing the memory
    accesses.

    Differential Revision: https://reviews.llvm.org/D30710

    llvm-svn: 298620
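    A small illustration of the difference at VF=2 (names invented): one vector
    GEP replaces the lane-by-lane construction of a pointer vector.

        ; before: scalar GEPs assembled with insertelement
        %g0 = getelementptr inbounds i32, i32* %base, i64 %i0
        %g1 = getelementptr inbounds i32, i32* %base, i64 %i1
        %v0 = insertelement <2 x i32*> undef, i32* %g0, i32 0
        %v1 = insertelement <2 x i32*> %v0, i32* %g1, i32 1
        ; after: a single vector GEP over a vector of indices
        %vg = getelementptr inbounds i32, i32* %base, <2 x i64> %iv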
* [LV] Delete unneeded scalar GEP creation code (Matthew Simpson, 2017-03-23, 2 files, -3/+2)
    The code for generating scalar base pointers in vectorizeMemoryInstruction
    is not needed. We currently scalarize all GEPs and maintain the scalarized
    values in VectorLoopValueMap. The GEP cloning in this unneeded code is the
    same as that in scalarizeInstruction. The test cases that changed as a
    result of this patch changed because we were able to reuse the scalarized
    GEP that we previously generated instead of cloning a new one.

    Differential Revision: https://reviews.llvm.org/D30587

    llvm-svn: 298615
* Do not set branch weight if the branch weight annotation is present. (Dehao Chen, 2017-03-23, 1 file, -1/+4)
    Summary: ThinLTO will annotate the CFG twice. If the branch weight is set by
    the first annotation, we should not set the branch weight again in the
    second annotation because the first annotation is more accurate, as there
    is less optimization that could affect debug info accuracy.

    Reviewers: tejohnson, davidxl

    Reviewed By: tejohnson

    Subscribers: mehdi_amini, aprantl, llvm-commits

    Differential Revision: https://reviews.llvm.org/D31228

    llvm-svn: 298602
* Preserve nonnull metadata on Loads through SROA & mem2reg. (Luqman Aden, 2017-03-22, 2 files, -0/+115)
    Summary:
    https://llvm.org/bugs/show_bug.cgi?id=31142 : SROA was dropping the nonnull
    metadata on loads from allocas that got optimized out. This patch simply
    preserves nonnull metadata on loads through SROA and mem2reg.

    Reviewers: chandlerc, efriedma

    Reviewed By: efriedma

    Subscribers: hfinkel, spatel, efriedma, arielb1, davide, llvm-commits

    Differential Revision: https://reviews.llvm.org/D27114

    llvm-svn: 298540
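    A hedged sketch of the preserved fact (names invented; my reading of the
    patch is that once mem2reg deletes the load, the nonnull information
    survives as an assume on the promoted value):

        %a = alloca i8*
        store i8* %p, i8** %a
        %v = load i8*, i8** %a, !nonnull !0   ; !0 = !{}
        ; after promotion, %v becomes %p and the fact is kept, e.g. as:
        %c = icmp ne i8* %p, null
        call void @llvm.assume(i1 %c)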
* [InstCombine] canonicalize insertelement of scalar constant ahead of insertelement of variable (Sanjay Patel, 2017-03-22, 4 files, -22/+16)
    insertelement (insertelement X, Y, IdxC1), ScalarC, IdxC2 -->
    insertelement (insertelement X, ScalarC, IdxC2), Y, IdxC1

    As noted in the code comment and seen in the test changes, the motivation is
    that by pulling constant insertion up, we may be able to constant fold some
    insertelement instructions.

    Differential Revision: https://reviews.llvm.org/D31196

    llvm-svn: 298520
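    An instance of the rule above (types and indices invented):

        ; before
        %t = insertelement <4 x i32> %x, i32 %y, i32 1
        %r = insertelement <4 x i32> %t, i32 7, i32 2
        ; after: the constant insertion comes first, so it can constant fold
        ; whenever %x is itself a constant
        %t2 = insertelement <4 x i32> %x, i32 7, i32 2
        %r2 = insertelement <4 x i32> %t2, i32 %y, i32 1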
* [ValueTracking] Make sure we keep range metadata information when calculating known bits for calls to the bitreverse intrinsic. (Craig Topper, 2017-03-22, 1 file, -0/+17)
    llvm-svn: 298488
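    A hedged example of why the range matters (values invented):

        %x = call i32 @get(), !range !0   ; !0 = !{i32 0, i32 16}: %x is in [0, 16)
        %r = call i32 @llvm.bitreverse.i32(i32 %x)
        ; bits 4..31 of %x are known zero, so after reversal bits 0..27 of %r
        ; are known zero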
* [InstCombine] Teach SimplifyDemandedUseBits to shrink Constants on the left side of subtracts (Craig Topper, 2017-03-22, 1 file, -0/+43)
    Summary: Subtracts can have constants on the left side, but we don't shrink
    them based on demanded bits. This patch fixes that to match the right-hand
    side.

    Reviewers: davide, majnemer, spatel, sanjoy, hfinkel

    Reviewed By: spatel

    Subscribers: llvm-commits

    Differential Revision: https://reviews.llvm.org/D31119

    llvm-svn: 298478
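    A hedged example of the shrink (constants invented): the low byte of a sub
    depends only on the low bytes of its operands, so a wide left-hand constant
    can be narrowed when only low bits are demanded.

        %s = sub i32 65536, %x   ; 65536 = 0x10000; its low 8 bits are 0
        %r = and i32 %s, 255     ; only the low byte of %s is demanded
        ; the constant can shrink to 0 without changing %r:
        %s2 = sub i32 0, %x
        %r2 = and i32 %s2, 255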
* AMDGPU: Mark all unspecified CC functions in tests as amdgpu_kernel (Matt Arsenault, 2017-03-21, 35 files, -214/+214)
    Currently the default C calling convention functions are treated the same as
    compute kernels. Make this explicit so the default calling convention can be
    changed to a non-kernel.

    Converted with perl -pi -e 's/define void/define amdgpu_kernel void/' on the
    relevant test directories (and undoing it in one place that actually wanted
    a non-kernel).

    llvm-svn: 298444
* [InstCombine] regenerate checks; NFC (Sanjay Patel, 2017-03-21, 1 file, -55/+54)
    llvm-svn: 298432
* Let llvm.objectsize be conservative with null pointers (George Burgess IV, 2017-03-21, 6 files, -41/+141)
    This adds a parameter to @llvm.objectsize that makes it return conservative
    values if it's given null. This fixes PR23277.

    Differential Revision: https://reviews.llvm.org/D28494

    llvm-svn: 298430
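    A sketch of the new form (reading the added parameter as a trailing i1 is
    my interpretation; it is not spelled out in this log):

        declare i64 @llvm.objectsize.i64.p0i8(i8*, i1, i1)
        ; with the new flag true, a null pointer is treated as having unknown
        ; size, so this folds to the conservative -1 (min = false) rather than 0
        %size = call i64 @llvm.objectsize.i64.p0i8(i8* null, i1 false, i1 true)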
* [InstCombine] auto-generate better checks; NFC (Sanjay Patel, 2017-03-21, 2 files, -80/+122)
    llvm-svn: 298377
* InstCombine: Check source value precision when reducing cast intrinsic (Matt Arsenault, 2017-03-20, 1 file, -36/+405)
    Missed this check when porting from the libcall version.

    llvm-svn: 298312
* Add missing updated test from VN coercion changes. Instructions were renamed. NFC (Daniel Berlin, 2017-03-20, 1 file, -3/+3)
    llvm-svn: 298280
* Updates branch_weights annotation for call instructions during inlining. (Dehao Chen, 2017-03-20, 1 file, -0/+39)
    Summary: The inliner should update the branch_weights annotation to scale it
    to the proper value.

    Reviewers: davidxl, eraman

    Reviewed By: eraman

    Subscribers: zzheng, llvm-commits

    Differential Revision: https://reviews.llvm.org/D30767

    llvm-svn: 298270
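    A hedged sketch of the scaling (counts invented; the proportional rule is
    my reading of "scale it to the proper value"): if a callee with entry count
    200 is inlined at a call site executed 100 times, weights cloned from the
    callee body are scaled by 100/200.

        ; inside @callee (entry count 200):
        call void @g(), !prof !0   ; !0 = !{!"branch_weights", i32 80}
        ; the clone inlined at a call site executed 100 times:
        call void @g(), !prof !1   ; !1 = !{!"branch_weights", i32 40}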
* [InstCombine] Use update_test_checks.py to regenerate a test. NFC (Craig Topper, 2017-03-19, 1 file, -78/+78)
    llvm-svn: 298227
* Remove unused arguments. NFCI (Xin Tong, 2017-03-19, 1 file, -1/+1)
    llvm-svn: 298218
* [JumpThreading] Perform phi-translation in SimplifyPartiallyRedundantLoad. (Xin Tong, 2017-03-19, 1 file, -0/+35)
    Summary: When the load in SimplifyPartiallyRedundantLoad is from a phi
    pointer, try to phi-translate it into the incoming values in the
    predecessors before we search for available loads.

    This needs https://reviews.llvm.org/D30524

    Reviewers: davide, sanjoy, efriedma, dberlin, rengolin

    Reviewed By: dberlin

    Subscribers: junbuml, llvm-commits

    Differential Revision: https://reviews.llvm.org/D30543

    llvm-svn: 298217
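    The shape of the opportunity (names invented): when the loaded pointer is
    itself a phi, translating the load into each predecessor lets the search
    find loads of %p1 and %p2 that are already available there.

        %p = phi i32* [ %p1, %bb1 ], [ %p2, %bb2 ]
        %v = load i32, i32* %p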
* [Analysis] bitreverse(undef) returns undef (Brian Gesiak, 2017-03-19, 1 file, -0/+14)
    Summary: The reverse of an arbitrary bitpattern is also an arbitrary
    bitpattern.

    Reviewers: trentxintong, arsenm, majnemer

    Reviewed By: majnemer

    Subscribers: majnemer, wdng, llvm-commits

    Differential Revision: https://reviews.llvm.org/D31118

    llvm-svn: 298201
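    The fold in one line:

        %r = call i32 @llvm.bitreverse.i32(i32 undef)   ; simplifies to undef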
* NewGVN: Fix PHI evaluation bug exposed by new verifier. We were checking whether the incoming block was reachable instead of whether the specific edge was reachable (Daniel Berlin, 2017-03-18, 1 file, -0/+60)
    llvm-svn: 298187
* [PGO] Add omitted test cases. (Rong Xu, 2017-03-17, 1 file, -0/+35)
    llvm-svn: 298115
* [PGO] Value profile for size of memory intrinsic calls (Rong Xu, 2017-03-17, 2 files, -0/+86)
    This patch annotates the value profile of the size parameter onto memory
    intrinsic calls.

    Differential Revision: http://reviews.llvm.org/D31002

    llvm-svn: 298110
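    A hedged sketch of the annotation (sizes and counts invented; reading
    kind 1 as the memop-size value kind is my interpretation of the "VP"
    !prof format):

        call void @llvm.memcpy.p0i8.p0i8.i64(i8* %dst, i8* %src, i64 %n, i32 1, i1 false), !prof !0
        ; 1000 profiled executions: %n was 16 in 700 of them and 64 in 300
        !0 = !{!"VP", i32 1, i64 1000, i64 16, i64 700, i64 64, i64 300}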
* Only unswitch loops with uniform conditions (Stanislav Mekhanoshin, 2017-03-17, 2 files, -0/+87)
    Loop unswitching can be extremely harmful for a SIMT target. If the hoisted
    condition is not uniform, a SIMT machine will execute both clones of the
    loop sequentially. Therefore LoopUnswitch checks if the condition is
    non-divergent.

    Since DivergenceAnalysis adds an expensive PostDominatorTree analysis not
    needed for non-SIMT targets, a new option is added to avoid unneeded
    analysis initialization. The method getAnalysisUsage is called when
    TargetTransformInfo is not yet available and we cannot use it here. For
    that reason a new field DivergentTarget is added to PassManagerBuilder to
    control the behavior, and this field is set from a target.

    Differential Revision: https://reviews.llvm.org/D30796

    llvm-svn: 298104
* [RSForGC] Handle vector GEPs (Sanjoy Das, 2017-03-17, 1 file, -0/+15)
    We were not handling getelementptr instructions of vector type before.
    Since getelementptr instructions for vector types follow the same rule as
    getelementptr instructions for non-vector types, we can just handle them
    in the same way.

    llvm-svn: 298028
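    For reference, a minimal vector-typed GEP of the kind now handled
    (operands invented):

        %g = getelementptr i64, <2 x i64*> %bases, <2 x i64> <i64 0, i64 1>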
* Resubmit r297897: [PGO] Value profile for size of memory intrinsic calls (Rong Xu, 2017-03-16, 1 file, -2/+2)
    r297897 inadvertently enabled annotation for memop profiling. This new
    patch fixes it.

    llvm-svn: 297996
* Salvage debug info from instructions about to be deleted (Adrian Prantl, 2017-03-16, 1 file, -0/+106)
    [Reapplies r297971 and punting on finding a better API for findDbgValues()]

    This patch improves debug info quality in InstCombine by looking at values
    that are about to be deleted, checking whether there are any dbg.value
    intrinsics referring to them, and potentially encoding the semantics of the
    deleted instruction into the dbg.value's DIExpression.

    In the example in the testcase (which was extracted from XNU) there is a
    sequence of

      %4 = load %struct.entry*, %struct.entry** %next2, align 8, !dbg !41
      %5 = bitcast %struct.entry* %4 to i8*, !dbg !42
      %add.ptr4 = getelementptr inbounds i8, i8* %5, i64 -8, !dbg !43
      %6 = bitcast i8* %add.ptr4 to %struct.entry*, !dbg !44
      call void @llvm.dbg.value(metadata %struct.entry* %6, i64 0, metadata !20, metadata !21), !dbg !34

    When these instructions are eliminated by instcombine one after another, we
    can still salvage the otherwise dead debug info:

    - Bitcasts have no effect, so have the dbg.value point to operand(0)
    - Loads can be expressed via a DW_OP_deref
    - Constant gep instructions can be replaced by DWARF expression arithmetic

    The API introduced by this patch is not specific to instcombine and can be
    useful in other places, too.

    rdar://problem/30725338

    Differential Revision: https://reviews.llvm.org/D30919

    llvm-svn: 297994
* [LoopUnroll] Don't peel loops where the latch isn't the exiting block (Michael Kuperstein, 2017-03-16, 1 file, -0/+36)
    Peeling assumed this doesn't happen, but didn't check it. This fixes PR32178.

    Differential Revision: https://reviews.llvm.org/D30757

    llvm-svn: 297993
* [InstCombine] avoid breaking up bitcasted vector min/max patterns (PR32306) (Sanjay Patel, 2017-03-16, 1 file, -3/+4)
    As the related tests show, we're not canonicalizing to this form for scalars
    or vectors yet, but this solves the immediate problem in:
    https://bugs.llvm.org/show_bug.cgi?id=32306

    llvm-svn: 297989
* [InstCombine] add tests for PR32306 and missed min/max canonicalization; NFC (Sanjay Patel, 2017-03-16, 1 file, -6/+78)
    llvm-svn: 297986
* Revert commit r297971 because of issues reported by msan. (Adrian Prantl, 2017-03-16, 1 file, -106/+0)
    llvm-svn: 297982
* Salvage debug info from instructions about to be deleted (Adrian Prantl, 2017-03-16, 1 file, -0/+106)
    This patch improves debug info quality in InstCombine by looking at values
    that are about to be deleted, checking whether there are any dbg.value
    intrinsics referring to them, and potentially encoding the semantics of the
    deleted instruction into the dbg.value's DIExpression.

    In the example in the testcase (which was extracted from XNU) there is a
    sequence of

      %4 = load %struct.entry*, %struct.entry** %next2, align 8, !dbg !41
      %5 = bitcast %struct.entry* %4 to i8*, !dbg !42
      %add.ptr4 = getelementptr inbounds i8, i8* %5, i64 -8, !dbg !43
      %6 = bitcast i8* %add.ptr4 to %struct.entry*, !dbg !44
      call void @llvm.dbg.value(metadata %struct.entry* %6, i64 0, metadata !20, metadata !21), !dbg !34

    When these instructions are eliminated by instcombine one after another, we
    can still salvage the otherwise dead debug info:

    - Bitcasts have no effect, so have the dbg.value point to operand(0)
    - Loads can be expressed via a DW_OP_deref
    - Constant gep instructions can be replaced by DWARF expression arithmetic

    The API introduced by this patch is not specific to instcombine and can be
    useful in other places, too.

    rdar://problem/30725338

    Differential Revision: https://reviews.llvm.org/D30919

    llvm-svn: 297971
* Revert "[PGO] Value profile for size of memory intrinsic calls"Eric Liu2017-03-161-2/+2
| | | | | | This commit reverts r297897 and r297909. llvm-svn: 297951
* [PM/Inliner] Fix a bug in r297374 where we would leave stale calls in (Chandler Carruth, 2017-03-16, 1 file, -0/+31)
    the work queue and crash when trying to visit them after deleting the
    function containing those calls.

    llvm-svn: 297940
* [PM/Inliner] Add a test case that encapsulates the core issue addressed (Chandler Carruth, 2017-03-16, 1 file, -0/+460)
    in r297374.

    I've extracted a small version of this from the C++ metaprogram Richard
    came up with to exercise these kinds of issues and written comments to
    describe both how to reproduce a fresh version of the test case and what
    the likely failure modes are.

    The test case is still a bit brittle as it depends on the particular inline
    cost modeling and SCC visitation order, but it definitely would have caught
    the bug right away when developing things, so it seems a really valuable
    test case to have.

    llvm-svn: 297935
* [PGO] Value profile for size of memory intrinsic calls (Rong Xu, 2017-03-15, 1 file, -2/+2)
    This patch adds value profile support for the size parameter of memory
    intrinsic calls: memcpy, memcmp, and memmove.

    Differential Revision: http://reviews.llvm.org/D28965

    llvm-svn: 297897