summaryrefslogtreecommitdiffstats
path: root/llvm/test/Transforms
Commit message (Collapse)AuthorAgeFilesLines
* [LV] Remove triples from target-independent vectorizer tests. NFC.Michael Kuperstein2016-10-0668-68/+0
| | | | | | | | | Vectorizer tests in the target-independent directory should not have a target triple. If a test really needs to query a specific backend, it belongs in the right target subdirectory (which "REQUIRES" the right backend). Otherwise, it should not specify a triple. llvm-svn: 283512
* [PGO] Create weak alias for the renamed Comdat functionRong Xu2016-10-061-0/+6
| | | | | | | | | | Add a weak alias to the renamed Comdat function in IR level instrumentation, using it's original name. This ensures the same behavior w/ and w/o IR instrumentation, even for non standard conforming code. Differential Revision: http://reviews.llvm.org/D25339 llvm-svn: 283490
* Revert "Add -strip-nonlinetable-debuginfo capability"Michael Ilseman2016-10-064-163/+0
| | | | | | | | This reverts commit r283473. Reverted until review is completed. llvm-svn: 283478
* Add -strip-nonlinetable-debuginfo capabilityMichael Ilseman2016-10-064-0/+163
| | | | | | | | | | | | | | | | | | | | | | This adds a new function to DebugInfo.cpp that takes an llvm::Module as input and removes all debug info metadata that is not directly needed for line tables, thus effectively stripping all type and variable information from the module. The primary motivation for this feature was the bitcode work flow (cf. http://lists.llvm.org/pipermail/llvm-dev/2016-June/100643.html for more background). This is not wired up yet, but will be in subsequent patches. For testing, the new functionality is exposed to opt with a -strip-nonlinetable-debuginfo option. The secondary use-case (and one that works right now!) is as a reduction pass in bugpoint. I added two new bugpoint options (-disable-strip-debuginfo and -disable-strip-debug-types) to control the new features. By default it will first attempt to remove all debug information, then only the type info, and then proceed to hack at any remaining MDNodes. llvm-svn: 283473
* Don't filter diagnostics written as YAML to the output fileHal Finkel2016-10-041-0/+2
| | | | | | | | | | | | | | | The purpose of the YAML diagnostic output file is to collect information on optimizations performed, or not performed, for later processing by tools that help users (and compiler developers) understand how code was optimized. As such, the diagnostics that appear in the file should not be coupled to what a user might want to see summarized for them as the compiler runs, and in fact, because the user likely does not know what optimization diagnostics their tools might want to use, the user cannot provide a useful filter regardless. As such, we shouldn't filter the diagnostics going to the output file. Differential Revision: https://reviews.llvm.org/D25224 llvm-svn: 283236
* Serialize remark argument as a mapping to get proper quotation for the value.Adam Nemet2016-10-042-11/+11
| | | | llvm-svn: 283231
* [SLPVectorizer] Add a test with non-vectorizable IR.Alexey Bataev2016-10-041-0/+290
| | | | llvm-svn: 283225
* [RS4GC] Handle ShuffleVector instruction in findBasePointerAnna Thomas2016-10-041-0/+79
| | | | | | | | | | | | | | | Summary: This patch modifies the findBasePointer to handle the shufflevector instruction. Tests run: RS4GC tests, local downstream tests. Reviewers: reames, sanjoy Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D25197 llvm-svn: 283219
* [PruneEH] Be correct in the face IPOSanjoy Das2016-10-031-0/+43
| | | | | | | This fixes one spot I had missed in r265762. Credit goes to Philip Reames for spotting this one! llvm-svn: 283137
* Jump threading: avoid trying to split edge into landingpad block (PR27840)Hans Wennborg2016-10-031-0/+33
| | | | | | | | | Splitting the edge is nontrivial because of the landing pad, and we would currently assert trying to do it. Differential Revision: https://reviews.llvm.org/D24680 llvm-svn: 283129
* [x86, SSE/AVX] allow 128/256-bit lowering for copysign vector intrinsics ↵Sanjay Patel2016-10-031-176/+100
| | | | | | | | | | | | | | | | (PR30433) This should fix: https://llvm.org/bugs/show_bug.cgi?id=30433 There are a couple of open questions about the codegen: 1. Should we let scalar ops be scalars and avoid vector constant loads/splats? 2. Should we have a pass to combine constants such as the inverted pair that we have here? Differential Revision: https://reviews.llvm.org/D25165 llvm-svn: 283119
* [SLPVectorizer][X86] Added fptosi/fptoui testsSimon Pilgrim2016-10-012-0/+1146
| | | | llvm-svn: 283048
* [SLPVectorizer][X86] Added fcopysign testsSimon Pilgrim2016-10-011-0/+404
| | | | llvm-svn: 283046
* [SLPVectorizer][X86] Added fabs testsSimon Pilgrim2016-10-011-0/+274
| | | | llvm-svn: 283045
* [InstCombine] allow non-splat folds of select cond (ext X), CSanjay Patel2016-09-301-4/+4
| | | | llvm-svn: 282906
* Revert test change in r282894 as it's broken in some platforms.Dehao Chen2016-09-301-40/+2
| | | | llvm-svn: 282903
* [Coroutines] Part15c: Fix coro-split to correctly handle definitions between ↵Gor Nishanov2016-09-301-0/+54
| | | | | | | | | | | | | | | | | | | | | | | | | | | | coro.save and coro.suspend Summary: In the case below, %Result.i19 is defined between coro.save and coro.suspend and used after coro.suspend. We need to correctly place such a value into the coroutine frame. ``` %save = call token @llvm.coro.save(i8* null) %Result.i19 = getelementptr inbounds %"struct.lean_future<int>::Awaiter", %"struct.lean_future<int>::Awaiter"* %ref.tmp7, i64 0, i32 0 %suspend = call i8 @llvm.coro.suspend(token %save, i1 false) switch i8 %suspend, label %exit [ i8 0, label %await.ready i8 1, label %exit ] await.ready: %val = load i32, i32* %Result.i19 ``` Reviewers: majnemer Subscribers: llvm-commits, mehdi_amini Differential Revision: https://reviews.llvm.org/D24418 llvm-svn: 282902
* [InstCombine] add tests for non-splat select(ext)Sanjay Patel2016-09-301-0/+22
| | | | llvm-svn: 282901
* [Coroutines] Part15b: Fix dbg information handling in coro-split.Gor Nishanov2016-09-301-0/+119
| | | | | | | | | | | | | | | Summary: Without the fix, if there was a function inlined into the coroutine with debug information, CloneFunctionInto(NewF, &F, VMap, /*ModuleLevelChanges=*/true, Returns); would duplicate all of the debug information including the DICompileUnit. We know use VMap to indicate that debug metadata for a File, Unit and FunctionType should not be duplicated when we creating clones that will become f.resume, f.destroy and f.cleanup. Reviewers: majnemer Subscribers: mehdi_amini, llvm-commits Differential Revision: https://reviews.llvm.org/D24417 llvm-svn: 282899
* [Coroutines] Part 15a: Lower coro.subfn.addr in CoroCleanupGor Nishanov2016-09-301-0/+18
| | | | | | | | | | | | Summary: Not all coro.subfn.addr intrinsics can be eliminated in CoroElide through devirtualization. Those that remain need to be lowered in CoroCleanup. Reviewers: majnemer Subscribers: llvm-commits, mehdi_amini Differential Revision: https://reviews.llvm.org/D24412 llvm-svn: 282897
* clean up tests and auto-generate checksSanjay Patel2016-09-301-56/+124
| | | | llvm-svn: 282896
* Update loop unroller cost model to make sure debug info does not affect ↵Dehao Chen2016-09-301-2/+40
| | | | | | | | | | | | | | optimization decisions. Summary: Debug info should *not* affect optimization decisions. This patch updates loop unroller cost model to make it not affected by debug info. Reviewers: davidxl, mzolotukhin Subscribers: haicheng, llvm-commits, mzolotukhin Differential Revision: https://reviews.llvm.org/D25098 llvm-svn: 282894
* [InstCombine] add tests for select X, (ext X), CSanjay Patel2016-09-301-0/+91
| | | | llvm-svn: 282891
* [LV] Build all scalar steps for non-uniform induction variablesMatthew Simpson2016-09-301-0/+97
| | | | | | | | | | | | | | | | When building the steps for scalar induction variables, we previously attempted to determine if all the scalar users of the induction variable were uniform. If they were, we would only emit the step corresponding to vector lane zero. This optimization was too aggressive. We generally don't know the entire set of induction variable users that will be scalar. We have isScalarAfterVectorization, but this is only a conservative estimate of the instructions that will be scalarized. Thus, an induction variable may have scalar users that aren't already known to be scalar. To avoid emitting unused steps, we can only check that the induction variable is uniform. This should fix PR30542. Reference: https://llvm.org/bugs/show_bug.cgi?id=30542 llvm-svn: 282863
* [thinlto] Don't decay threshold for hot callsitesPiotr Padlewski2016-09-302-20/+79
| | | | | | | | | | | | | | Summary: We don't want to decay hot callsites to import chains of hot callsites. The same mechanism is used in LIPO. Reviewers: tejohnson, eraman, mehdi_amini Subscribers: llvm-commits, mehdi_amini Differential Revision: https://reviews.llvm.org/D24976 llvm-svn: 282833
* [thinlto] Add cold-callsite import heuristicPiotr Padlewski2016-09-291-4/+14
| | | | | | | | | | | | | | Summary: Not tunned up heuristic, but with this small heuristic there is about +0.10% improvement on SPEC 2006 Reviewers: tejohnson, mehdi_amini, eraman Subscribers: mehdi_amini, llvm-commits Differential Revision: https://reviews.llvm.org/D24940 llvm-svn: 282733
* Wisely choose sext or zext when widening IV.Evgeny Stupachenko2016-09-281-0/+270
| | | | | | | | | | | | Summary: The patch fixes regression caused by two earlier patches D18777 and D18867. Reviewers: reames, sanjoy Differential Revision: http://reviews.llvm.org/D24280 From: Li Huang llvm-svn: 282650
* [InstCombine] update to use FileCheckSanjay Patel2016-09-281-263/+410
| | | | | | | Also, remove unnecessary function attributes, parameters, and comments. It looks like at least some of these tests are not minimal though... llvm-svn: 282620
* Don't look through addrspacecast in GetPointerBaseWithConstantOffsetArtur Pilipenko2016-09-283-13/+35
| | | | | | | | | | | | Pointers in different addrspaces can have different sizes, so it's not valid to look through addrspace cast calculating base and offset for a value. This is similar to D13008. Reviewed By: reames Differential Revision: https://reviews.llvm.org/D24729 llvm-svn: 282612
* [InstSimplify] allow or-of-icmps folds with vector splat constantsSanjay Patel2016-09-281-32/+6
| | | | llvm-svn: 282592
* [InstSimplify] add vector splat tests for or-of-icmpsSanjay Patel2016-09-281-0/+92
| | | | llvm-svn: 282591
* [InstSimplify] allow and-of-icmps folds with vector splat constantsSanjay Patel2016-09-281-0/+66
| | | | llvm-svn: 282590
* [Inliner] Port all opt remarks to new streaming APIAdam Nemet2016-09-271-0/+83
| | | | llvm-svn: 282559
* Pass -S to opt in this test to avoid printing binary on mismatchAdam Nemet2016-09-271-1/+1
| | | | | | The purpose of the test is to verify diagnostics. llvm-svn: 282558
* [Inliner] Fold the analysis remark into the missed remarkAdam Nemet2016-09-272-8/+8
| | | | | | | | | | | | | | | There is really no reason for these to be separate. The vectorizer started this pretty bad tradition that the text of the missed remarks is pretty meaningless, i.e. vectorization failed. There, you have to query analysis to get the full picture. I think we should just explain the reason for missing the optimization in the missed remark when possible. Analysis remarks should provide information that the pass gathers regardless whether the optimization is passing or not. llvm-svn: 282542
* [LoopSimplify] When simplifying phis in loop-simplify, do it only if it ↵Michael Zolotukhin2016-09-271-0/+37
| | | | | | preserves LCSSA form. llvm-svn: 282541
* Output optimization remarks in YAMLAdam Nemet2016-09-271-0/+76
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | (Re-committed after moving the template specialization under the yaml namespace. GCC was complaining about this.) This allows various presentation of this data using an external tool. This was first recommended here[1]. As an example, consider this module: 1 int foo(); 2 int bar(); 3 4 int baz() { 5 return foo() + bar(); 6 } The inliner generates these missed-optimization remarks today (the hotness information is pulled from PGO): remark: /tmp/s.c:5:10: foo will not be inlined into baz (hotness: 30) remark: /tmp/s.c:5:18: bar will not be inlined into baz (hotness: 30) Now with -pass-remarks-output=<yaml-file>, we generate this YAML file: --- !Missed Pass: inline Name: NotInlined DebugLoc: { File: /tmp/s.c, Line: 5, Column: 10 } Function: baz Hotness: 30 Args: - Callee: foo - String: will not be inlined into - Caller: baz ... --- !Missed Pass: inline Name: NotInlined DebugLoc: { File: /tmp/s.c, Line: 5, Column: 18 } Function: baz Hotness: 30 Args: - Callee: bar - String: will not be inlined into - Caller: baz ... This is a summary of the high-level decisions: * There is a new streaming interface to emit optimization remarks. E.g. for the inliner remark above: ORE.emit(DiagnosticInfoOptimizationRemarkMissed( DEBUG_TYPE, "NotInlined", &I) << NV("Callee", Callee) << " will not be inlined into " << NV("Caller", CS.getCaller()) << setIsVerbose()); NV stands for named value and allows the YAML client to process a remark using its name (NotInlined) and the named arguments (Callee and Caller) without parsing the text of the message. Subsequent patches will update ORE users to use the new streaming API. * I am using YAML I/O for writing the YAML file. YAML I/O requires you to specify reading and writing at once but reading is highly non-trivial for some of the more complex LLVM types. Since it's not clear that we (ever) want to use LLVM to parse this YAML file, the code supports and asserts that we're writing only. On the other hand, I did experiment that the class hierarchy starting at DiagnosticInfoOptimizationBase can be mapped back from YAML generated here (see D24479). * The YAML stream is stored in the LLVM context. * In the example, we can probably further specify the IR value used, i.e. print "Function" rather than "Value". * As before hotness is computed in the analysis pass instead of DiganosticInfo. This avoids the layering problem since BFI is in Analysis while DiagnosticInfo is in IR. [1] https://reviews.llvm.org/D19678#419445 Differential Revision: https://reviews.llvm.org/D24587 llvm-svn: 282539
* Revert "Output optimization remarks in YAML"Adam Nemet2016-09-271-76/+0
| | | | | | | | This reverts commit r282499. The GCC bots are failing llvm-svn: 282503
* Output optimization remarks in YAMLAdam Nemet2016-09-271-0/+76
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This allows various presentation of this data using an external tool. This was first recommended here[1]. As an example, consider this module: 1 int foo(); 2 int bar(); 3 4 int baz() { 5 return foo() + bar(); 6 } The inliner generates these missed-optimization remarks today (the hotness information is pulled from PGO): remark: /tmp/s.c:5:10: foo will not be inlined into baz (hotness: 30) remark: /tmp/s.c:5:18: bar will not be inlined into baz (hotness: 30) Now with -pass-remarks-output=<yaml-file>, we generate this YAML file: --- !Missed Pass: inline Name: NotInlined DebugLoc: { File: /tmp/s.c, Line: 5, Column: 10 } Function: baz Hotness: 30 Args: - Callee: foo - String: will not be inlined into - Caller: baz ... --- !Missed Pass: inline Name: NotInlined DebugLoc: { File: /tmp/s.c, Line: 5, Column: 18 } Function: baz Hotness: 30 Args: - Callee: bar - String: will not be inlined into - Caller: baz ... This is a summary of the high-level decisions: * There is a new streaming interface to emit optimization remarks. E.g. for the inliner remark above: ORE.emit(DiagnosticInfoOptimizationRemarkMissed( DEBUG_TYPE, "NotInlined", &I) << NV("Callee", Callee) << " will not be inlined into " << NV("Caller", CS.getCaller()) << setIsVerbose()); NV stands for named value and allows the YAML client to process a remark using its name (NotInlined) and the named arguments (Callee and Caller) without parsing the text of the message. Subsequent patches will update ORE users to use the new streaming API. * I am using YAML I/O for writing the YAML file. YAML I/O requires you to specify reading and writing at once but reading is highly non-trivial for some of the more complex LLVM types. Since it's not clear that we (ever) want to use LLVM to parse this YAML file, the code supports and asserts that we're writing only. On the other hand, I did experiment that the class hierarchy starting at DiagnosticInfoOptimizationBase can be mapped back from YAML generated here (see D24479). * The YAML stream is stored in the LLVM context. * In the example, we can probably further specify the IR value used, i.e. print "Function" rather than "Value". * As before hotness is computed in the analysis pass instead of DiganosticInfo. This avoids the layering problem since BFI is in Analysis while DiagnosticInfo is in IR. [1] https://reviews.llvm.org/D19678#419445 Differential Revision: https://reviews.llvm.org/D24587 llvm-svn: 282499
* Revert r277556. Add -lowertypetests-bitsets-level to control bitsets generationIvan Krasin2016-09-271-31/+21
| | | | | | | | | | | | | | | | Summary: We don't currently need this facility for CFI. Disabling individual hot methods proved to be a better strategy in Chrome. Also, the design of the feature is suboptimal, as pointed out by Peter Collingbourne. Reviewers: pcc Subscribers: kcc Differential Revision: https://reviews.llvm.org/D24948 llvm-svn: 282461
* [thinlto] Basic thinlto fdo heuristicPiotr Padlewski2016-09-262-0/+149
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This patch improves thinlto importer by importing 3x larger functions that are called from hot block. I compared performance with the trunk on spec, and there were about 2% on povray and 3.33% on milc. These results seems to be consistant and match the results Teresa got with her simple heuristic. Some benchmarks got slower but I think they are just noisy (mcf, xalancbmki, omnetpp)- running the benchmarks again with more iterations to confirm. Geomean of all benchmarks including the noisy ones were about +0.02%. I see much better improvement on google branch with Easwaran patch for pgo callsite inlining (the inliner actually inline those big functions) Over all I see +0.5% improvement, and I get +8.65% on povray. So I guess we will see much bigger change when Easwaran patch will land (it depends on new pass manager), but it is still worth putting this to trunk before it. Implementation details changes: - Removed CallsiteCount. - ProfileCount got replaced by Hotness - hot-import-multiplier is set to 3.0 for now, didn't have time to tune it up, but I see that we get most of the interesting functions with 3, so there is no much performance difference with higher, and binary size doesn't grow as much as with 10.0. Reviewers: eraman, mehdi_amini, tejohnson Subscribers: mehdi_amini, llvm-commits Differential Revision: https://reviews.llvm.org/D24638 llvm-svn: 282437
* Remove pruning of phi nodes in MemorySSA - it makes updating harderDaniel Berlin2016-09-261-53/+0
| | | | | | | | | | Reviewers: george.burgess.iv Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D24923 llvm-svn: 282419
* [LV] Scalarize instructions marked scalar after vectorizationMatthew Simpson2016-09-265-7/+83
| | | | | | | | | This patch ensures that we actually scalarize instructions marked scalar after vectorization. Previously, such instructions may have been vectorized instead. Differential Revision: https://reviews.llvm.org/D23889 llvm-svn: 282418
* [Coroutines] Part14: Handle coroutines with no suspend points.Gor Nishanov2016-09-261-0/+189
| | | | | | | | | | | | | | | Summary: If coroutine has no suspend points, remove heap allocation and turn a coroutine into a normal function. Also, if a pattern is detected that coroutine resumes or destroys itself prior to coro.suspend call, turn the suspend point into a simple jump to resume or cleanup label. This pattern occurs when coroutines are used to propagate errors in functions that return expected<T>. Reviewers: majnemer Subscribers: mehdi_amini, llvm-commits Differential Revision: https://reviews.llvm.org/D24408 llvm-svn: 282414
* [InstCombine] Fixed bug introduced in r282237Alexey Bataev2016-09-261-0/+9
| | | | | | | | The index of the new insertelement instruction was evaluated in the wrong way, it was considered as the index of the inserted value instead of index of the position, where the value should be inserted. llvm-svn: 282401
* [InstCombine] Teach the udiv folding logic how to handle constant expressions.Andrea Di Biagio2016-09-261-0/+15
| | | | | | | | | | | | | | | | | | | This patch fixes PR30366. Function foldUDivShl() worked under the assumption that one of the values in input to the function was always an instance of llvm::Instruction. However, function visitUDivOperand() (the only user of foldUDivShl) was clearly violating that precondition; internally, visitUDivOperand() uses pattern matches to check the operands of a udiv. Pattern matchers for binary operators know how to handle both Instruction and ConstantExpr values. This patch fixes the problem in foldUDivShl(). Now we use pattern matchers instead of explicit casts to Instruction. The reduced test case from PR30366 has been added to test file InstCombine/udiv-simplify.ll. Differential Revision: https://reviews.llvm.org/D24565 llvm-svn: 282398
* [TLI] isdigit / isascii / toascii param type should match return type (PR30484)Sanjay Patel2016-09-231-0/+35
| | | | | | We crash in LibCallSimplifier if we don't check the validity of the function signature properly. llvm-svn: 282278
* [InstCombine] Fix for PR29124: reduce insertelements to shufflevectorAlexey Bataev2016-09-234-9/+35
| | | | | | | | | | | | | | | | | | | | | | | | | If inserting more than one constant into a vector: define <4 x float> @foo(<4 x float> %x) { %ins1 = insertelement <4 x float> %x, float 1.0, i32 1 %ins2 = insertelement <4 x float> %ins1, float 2.0, i32 2 ret <4 x float> %ins2 } InstCombine could reduce that to a shufflevector: define <4 x float> @goo(<4 x float> %x) { %shuf = shufflevector <4 x float> %x, <4 x float> <float undef, float 1.0, float 2.0, float undef>, <4 x i32><i32 0, i32 5, i32 6, i32 3> ret <4 x float> %shuf } Also, InstCombine tries to convert shuffle instruction to single insertelement, if one of the vectors is a constant vector and only a single element from this constant should be used in shuffle, i.e. shufflevector <4 x float> %v, <4 x float> <float undef, float 1.0, float undef, float undef>, <4 x i32> <i32 0, i32 5, i32 undef, i32 undef> -> insertelement <4 x float> %v, float 1.0, 1 Differential Revision: https://reviews.llvm.org/D24182 llvm-svn: 282237
* [InstCombine] fold X urem C -> X < C ? X : X - C when C is big (PR28672)Sanjay Patel2016-09-221-3/+11
| | | | | | | | | | | | We already have the udiv variant of this transform, so I think this is ok for InstCombine too even though there is an increase in IR instructions. As the tests and TODO comments show, the transform can lead to follow-on combines. This should fix: https://llvm.org/bugs/show_bug.cgi?id=28672 Differential Revision: https://reviews.llvm.org/D24527 llvm-svn: 282209
* Revert r282168 "GVN-hoist: fix store past load dependence analysis (PR30216)"Hans Wennborg2016-09-221-52/+0
| | | | | | | | and also the dependent r282175 "GVN-hoist: do not dereference null pointers" It's causing compiler crashes building Harfbuzz (PR30499). llvm-svn: 282199
OpenPOWER on IntegriCloud