bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message (Collapse)	Author	Age	Files	Lines
*	[PassManagerBuilder] Remove global extension when a plugin is unloaded	Elia Geretto	2020-01-29	1	-8/+33
\| \| \| \| \| \| \| \| \| \| \| \|	This commit fixes PR39321. GlobalExtensions is not guaranteed to be destroyed when optimizer plugins are unloaded. If it is indeed destroyed after a plugin is dlclose-d, the destructor of the corresponding ExtensionFn is not mapped anymore, causing a call to unmapped memory during destruction. This commit guarantees that extensions coming from external plugins are removed from GlobalExtensions when the plugin is unloaded if GlobalExtensions has not been destroyed yet. Differential Revision: https://reviews.llvm.org/D71959 (cherry picked from commit ab2300bc154f7bed43f85f74fd3fe31be71d90e0)
*	[Matrix] Add first set of matrix intrinsics and initial lowering pass.	Florian Hahn	2019-12-12	1	-0/+12
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is the first patch adding an initial set of matrix intrinsics and a corresponding lowering pass. This has been discussed on llvm-dev: http://lists.llvm.org/pipermail/llvm-dev/2019-October/136240.html The first patch introduces four new intrinsics (transpose, multiply, columnwise load and store) and a LowerMatrixIntrinsics pass, that lowers those intrinsics to vector operations. Matrixes are embedded in a 'flat' vector (e.g. a 4 x 4 float matrix embedded in a <16 x float> vector) and the intrinsics take the dimension information as parameters. Those parameters need to be ConstantInt. For the memory layout, we initially assume column-major, but in the RFC we also described how to extend the intrinsics to support row-major as well. For the initial lowering, we split the input of the intrinsics into a set of column vectors, transform those column vectors and concatenate the result columns to a flat result vector. This allows us to lower the intrinsics without any shape propagation, as mentioned in the RFC. In follow-up patches, we plan to submit the following improvements: * Shape propagation to eliminate the embedding/splitting for each intrinsic. * Fused & tiled lowering of multiply and other operations. * Optimization remarks highlighting matrix expressions and costs. * Generate loops for operations on large matrixes. * More general block processing for operation on large vectors, exploiting shape information. We would like to add dedicated transpose, columnwise load and store intrinsics, even though they are not strictly necessary. For example, we could instead emit a large shufflevector instruction instead of the transpose. But we expect that to (1) become unwieldy for larger matrixes (even for 16x16 matrixes, the resulting shufflevector masks would be huge), (2) risk instcombine making small changes, causing us to fail to detect the transpose, preventing better lowerings For the load/store, we are additionally planning on exploiting the intrinsics for better alias analysis. Reviewers: anemet, Gerolf, reames, hfinkel, andrew.w.kaylor, efriedma, rengolin Reviewed By: anemet Differential Revision: https://reviews.llvm.org/D70456
*	Revert "[Attributor] Move pass after InstCombine to futher eliminate null ↵	Dávid Bolvanský	2019-11-27	1	-2/+3
\| \| \| \| \| \|	pointer checks" This reverts commit 7ca7d62c6ea1680ec0a1861083669596547fdd6f. Commited accidentally.
*	[Attributor] Move pass after InstCombine to futher eliminate null pointer checks	Dávid Bolvanský	2019-11-27	1	-3/+2
\| \| \| \| \| \| \| \| \| \| \| \|	Summary: PR44149 Reviewers: jdoerfert Subscribers: mehdi_amini, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70737
*	Revert "Revert "As a follow-up to my initial mail to llvm-dev here's a first ↵	Eric Christopher	2019-11-26	1	-16/+30
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	pass at the O1 described there."" This reapplies: 8ff85ed905a7306977d07a5cd67ab4d5a56fafb4 Original commit message: As a follow-up to my initial mail to llvm-dev here's a first pass at the O1 described there. This change doesn't include any change to move from selection dag to fast isel and that will come with other numbers that should help inform that decision. There also haven't been any real debuggability studies with this pipeline yet, this is just the initial start done so that people could see it and we could start tweaking after. Test updates: Outside of the newpm tests most of the updates are coming from either optimization passes not run anymore (and without a compelling argument at the moment) that were largely used for canonicalization in clang. Original post: http://lists.llvm.org/pipermail/llvm-dev/2019-April/131494.html Tags: #llvm Differential Revision: https://reviews.llvm.org/D65410 This reverts commit c9ddb02659e3ece7a0d9d6b4dac7ceea4ae46e6d.
*	Revert "As a follow-up to my initial mail to llvm-dev here's a first pass at ↵	Muhammad Omair Javaid	2019-11-26	1	-30/+16
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	the O1 described there." This reverts commit 8ff85ed905a7306977d07a5cd67ab4d5a56fafb4. This commit introduced 9 new failures on lldb buildbot host at http://lab.llvm.org:8014/builders/lldb-aarch64-ubuntu Following tests were failing: lldb-api :: functionalities/tail_call_frames/ambiguous_tail_call_seq1/TestAmbiguousTailCallSeq1.py lldb-api :: functionalities/tail_call_frames/ambiguous_tail_call_seq2/TestAmbiguousTailCallSeq2.py lldb-api :: functionalities/tail_call_frames/disambiguate_call_site/TestDisambiguateCallSite.py lldb-api :: functionalities/tail_call_frames/disambiguate_paths_to_common_sink/TestDisambiguatePathsToCommonSink.py lldb-api :: functionalities/tail_call_frames/disambiguate_tail_call_seq/TestDisambiguateTailCallSeq.py lldb-api :: functionalities/tail_call_frames/inlining_and_tail_calls/TestInliningAndTailCalls.py lldb-api :: functionalities/tail_call_frames/sbapi_support/TestTailCallFrameSBAPI.py lldb-api :: functionalities/tail_call_frames/thread_step_out_message/TestArtificialFrameStepOutMessage.py lldb-api :: functionalities/tail_call_frames/thread_step_out_or_return/TestSteppingOutWithArtificialFrames.py lldb-api :: functionalities/tail_call_frames/unambiguous_sequence/TestUnambiguousTailCalls.py Tags: #llvm Differential Revision: https://reviews.llvm.org/D65410
*	As a follow-up to my initial mail to llvm-dev here's a first pass at the O1 ↵	Eric Christopher	2019-11-25	1	-16/+30
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	described there. This change doesn't include any change to move from selection dag to fast isel and that will come with other numbers that should help inform that decision. There also haven't been any real debuggability studies with this pipeline yet, this is just the initial start done so that people could see it and we could start tweaking after. Test updates: Outside of the newpm tests most of the updates are coming from either optimization passes not run anymore (and without a compelling argument at the moment) that were largely used for canonicalization in clang. Original post: http://lists.llvm.org/pipermail/llvm-dev/2019-April/131494.html Tags: #llvm Differential Revision: https://reviews.llvm.org/D65410
*	Reapply r374743 with a fix for the ocaml binding	Joerg Sonnenberger	2019-10-14	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add a pass to lower is.constant and objectsize intrinsics This pass lowers is.constant and objectsize intrinsics not simplified by earlier constant folding, i.e. if the object given is not constant or if not using the optimized pass chain. The result is recursively simplified and constant conditionals are pruned, so that dead blocks are removed even for -O0. This allows inline asm blocks with operand constraints to work all the time. The new pass replaces the existing lowering in the codegen-prepare pass and fallbacks in SDAG/GlobalISEL and FastISel. The latter now assert on the intrinsics. Differential Revision: https://reviews.llvm.org/D65280 llvm-svn: 374784
*	Revert "Add a pass to lower is.constant and objectsize intrinsics"	Dmitri Gribenko	2019-10-14	1	-1/+0
\| \| \| \| \| \| \|	This reverts commit r374743. It broke the build with Ocaml enabled: http://lab.llvm.org:8011/builders/clang-x86_64-debian-fast/builds/19218 llvm-svn: 374768
*	Add a pass to lower is.constant and objectsize intrinsics	Joerg Sonnenberger	2019-10-13	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This pass lowers is.constant and objectsize intrinsics not simplified by earlier constant folding, i.e. if the object given is not constant or if not using the optimized pass chain. The result is recursively simplified and constant conditionals are pruned, so that dead blocks are removed even for -O0. This allows inline asm blocks with operand constraints to work all the time. The new pass replaces the existing lowering in the codegen-prepare pass and fallbacks in SDAG/GlobalISEL and FastISel. The latter now assert on the intrinsics. Differential Revision: https://reviews.llvm.org/D65280 llvm-svn: 374743
*	Revert "Reland "r364412 [ExpandMemCmp][MergeICmps] Move passes out of ↵	Dmitri Gribenko	2019-09-10	1	-14/+0
\| \| \| \| \| \| \| \| \|	CodeGen into opt pipeline."" This reverts commit r371502, it broke tests (clang/test/CodeGenCXX/auto-var-init.cpp). llvm-svn: 371507
*	Reland "r364412 [ExpandMemCmp][MergeICmps] Move passes out of CodeGen into ↵	Clement Courbet	2019-09-10	1	-0/+14
\| \| \| \| \| \| \| \|	opt pipeline." With a fix for sanitizer breakage (see explanation in D60318). llvm-svn: 371502
*	Provide basic Full LTO extension points	Serge Guelton	2019-07-02	1	-0/+4
\| \| \| \| \| \|	Differential Revision: https://reviews.llvm.org/D61738 llvm-svn: 364937
*	Revert "r364412 [ExpandMemCmp][MergeICmps] Move passes out of CodeGen into ↵	Clement Courbet	2019-06-26	1	-14/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	opt pipeline." Breaks sanitizers: libFuzzer :: cxxstring.test libFuzzer :: memcmp.test libFuzzer :: recommended-dictionary.test libFuzzer :: strcmp.test libFuzzer :: value-profile-mem.test libFuzzer :: value-profile-strcmp.test llvm-svn: 364416
*	[ExpandMemCmp][MergeICmps] Move passes out of CodeGen into opt pipeline.	Clement Courbet	2019-06-26	1	-0/+14
\| \| \| \| \| \| \| \| \|	This allows later passes (in particular InstCombine) to optimize more cases. One that's important to us is `memcmp(p, q, constant) < 0` and memcmp(p, q, constant) > 0. llvm-svn: 364412
*	[Attributor] Pass infrastructure and fixpoint framework	Johannes Doerfert	2019-06-05	1	-1/+10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	NOTE: Note that no attributes are derived yet. This patch will not go in alone but only with others that derive attributes. The framework is split for review purposes. This commit introduces the Attributor pass infrastructure and fixpoint iteration framework. Further patches will introduce abstract attributes into this framework. In a nutshell, the Attributor will update instances of abstract arguments until a fixpoint, or a "timeout", is reached. Communication between the Attributor and the abstract attributes that are derived is restricted to the AbstractState and AbstractAttribute interfaces. Please see the file comment in Attributor.h for detailed information including design decisions and typical use case. Also consider the class documentation for Attributor, AbstractState, and AbstractAttribute. Reviewers: chandlerc, homerdin, hfinkel, fedor.sergeev, sanjoy, spatel, nlopes, nicholas, reames Subscribers: mehdi_amini, mgorny, hiraditya, bollu, steven_wu, dexonsmith, dang, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D59918 llvm-svn: 362578
*	[NewPassManager] Add tuning option: ForgetAllSCEVInLoopUnroll [NFC].	Alina Sbirlea	2019-05-23	1	-6/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Mirror tuning option from old pass manager in new pass manager. Reviewers: chandlerc Subscribers: mehdi_amini, jlebar, zzheng, dmgreen, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61612 llvm-svn: 361560
*	[NewPassManager] Add tuning option: SLPVectorization [NFC].	Alina Sbirlea	2019-05-08	1	-4/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Mirror tuning option from old pass manager in new pass manager. Reviewers: chandlerc Subscribers: mehdi_amini, jlebar, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61616 llvm-svn: 360276
*	[PassManagerBuilder] Add option for interleaved loops, for loop vectorize.	Alina Sbirlea	2019-04-30	1	-4/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Match NewPassManager behavior: add option for interleaved loops in the old pass manager, and use that instead of the flag used to disable loop unroll. No changes in the defaults. Reviewers: chandlerc Subscribers: mehdi_amini, jlebar, dmgreen, hsaito, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61030 llvm-svn: 359615
*	Remove the EnableEarlyCSEMemSSA set of options from the legacy	Eric Christopher	2019-04-19	1	-5/+1
\| \| \| \| \| \| \| \| \|	and new pass managers. They were default to true and not being used. Differential Revision: https://reviews.llvm.org/D60747 llvm-svn: 358789
*	[LICM & MemorySSA] Make limit flags pass tuning options.	Alina Sbirlea	2019-04-19	1	-6/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Make the flags in LICM + MemorySSA tuning options in the old and new pass managers. Subscribers: mehdi_amini, jlebar, Prazek, george.burgess.iv, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60490 llvm-svn: 358772
*	[NewPassManager] Adding pass tuning options: loop vectorize.	Alina Sbirlea	2019-04-19	1	-5/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Trying to add the plumbing necessary to add tuning options to the new pass manager. Testing with the flags for loop vectorize. Reviewers: chandlerc Subscribers: sanjoy, mehdi_amini, jlebar, steven_wu, dexonsmith, dang, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D59723 llvm-svn: 358763
*	Elaborate why we have an option on by default for enabling chr.	Eric Christopher	2019-04-18	1	-0/+2
\| \| \| \|	llvm-svn: 358641
*	Remove the run-slp-after-loop-vectorization option.	Eric Christopher	2019-04-17	1	-12/+3
\| \| \| \| \| \| \|	It's been on by default for 4 years and cleans up the pass hierarchy. llvm-svn: 358548
*	[SCEV] Add option to forget everything in SCEV.	Alina Sbirlea	2019-04-12	1	-7/+18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Create a method to forget everything in SCEV. Add a cl::opt and PassManagerBuilder option to use this in LoopUnroll. Motivation: Certain Halide applications spend a very long time compiling in forgetLoop, and prefer to forget everything and rebuild SCEV from scratch. Sample difference in compile time reduction: 21.04 to 14.78 using current ToT release build. Testcase showcasing this cannot be opensourced and is fairly large. The option disabled by default, but it may be desirable to enable by default. Evidence in favor (two difference runs on different days/ToT state): File Before (s) After (s) clang-9.bc 7267.91 6639.14 llvm-as.bc 194.12 194.12 llvm-dis.bc 62.50 62.50 opt.bc 1855.85 1857.53 File Before (s) After (s) clang-9.bc 8588.70 7812.83 llvm-as.bc 196.20 194.78 llvm-dis.bc 61.55 61.97 opt.bc 1739.78 1886.26 Reviewers: sanjoy Subscribers: mehdi_amini, jlebar, zzheng, javed.absar, dmgreen, jdoerfert, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60144 llvm-svn: 358304
*	Resubmit r356511 "[TailCallElim] Add tailcall elimination pass to LTO pipelines"	Robert Lougher	2019-03-20	1	-0/+4
\| \| \| \| \| \|	Failing LLD tests have been fixed in r356593. llvm-svn: 356594
*	Revert r356511 "[TailCallElim] Add tailcall elimination pass to LTO pipelines"	Robert Lougher	2019-03-19	1	-4/+0
\| \| \| \| \| \|	Due to buildbot failures (LLD tests). llvm-svn: 356516
*	[TailCallElim] Add tailcall elimination pass to LTO pipelines	Robert Lougher	2019-03-19	1	-0/+4
\| \| \| \| \| \| \| \| \| \|	LTO provides additional opportunities for tailcall elimination due to link-time inlining and visibility of nocapture attribute. Testing showed negligible impact on compilation times. Differential Revision: https://reviews.llvm.org/D58391 llvm-svn: 356511
*	[PGO] Context sensitive PGO (part 3)	Rong Xu	2019-03-04	1	-9/+34
\| \| \| \| \| \| \| \|	Part 3 of CSPGO changes (mostly related to PassMananger). Differential Revision: https://reviews.llvm.org/D54175 llvm-svn: 355330
*	Add a module pass for order file instrumentation	Manman Ren	2019-02-28	1	-0/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The basic idea of the pass is to use a circular buffer to log the execution ordering of the functions. We only log the function when it is first executed. We use a 8-byte hash to log the function symbol name. In this pass, we add three global variables: (1) an order file buffer: a circular buffer at its own llvm section. (2) a bitmap for each module: one byte for each function to say if the function is already executed. (3) a global index to the order file buffer. At the function prologue, if the function has not been executed (by checking the bitmap), log the function hash, then atomically increase the index. Differential Revision: https://reviews.llvm.org/D57463 llvm-svn: 355133
*	[HotColdSplit] Schedule splitting late to fix perf regression	Vedant Kumar	2019-02-15	1	-5/+10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	With or without PGO data applied, splitting early in the pipeline (either before the inliner or shortly after it) regresses performance across SPEC variants. The cause appears to be that splitting hides context for subsequent optimizations. Schedule splitting late again, in effect reversing r352080, which scheduled the splitting pass early for code size benefits (documented in https://reviews.llvm.org/D57082). Differential Revision: https://reviews.llvm.org/D58258 llvm-svn: 354158
*	[HotColdSplit] Move splitting after instrumented PGO use	Teresa Johnson	2019-02-06	1	-5/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Follow up to D57082 which moved splitting earlier in the pipeline, in order to perform it before inlining. However, it was moved too early, before the IR is annotated with instrumented PGO data. This caused the splitting to incorrectly determine cold functions. Move it to just after PGO annotation (still before inlining), in both pass managers. Reviewers: vsk, hiraditya, sebpop Subscribers: mehdi_amini, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D57805 llvm-svn: 353270
*	[HotColdSplit] Move splitting earlier in the pipeline	Vedant Kumar	2019-01-24	1	-4/+10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Performing splitting early has several advantages: - Inhibiting inlining of cold code early improves code size. Compared to scheduling splitting at the end of the pipeline, this cuts code size growth in half within the iOS shared cache (0.69% to 0.34%). - Inhibiting inlining of cold code improves compile time. There's no need to inline split cold functions, or to inline as much within those split functions as they are marked `minsize`. - During LTO, extra work is only done in the pre-link step. Less code must be inlined during cross-module inlining. An additional motivation here is that the most common cold regions identified by the static/conservative splitting heuristic can (a) be found before inlining and (b) do not grow after inlining. E.g. __assert_fail, os_log_error. The disadvantages are: - Some opportunities for splitting out cold code may be missed. This gap can potentially be narrowed by adding a worklist algorithm to the splitting pass. - Some opportunities to reduce code size may be lost (e.g. store sinking, when one side of the CFG diamond is split). This does not outweigh the code size benefits of splitting earlier. On net, splitting early in the pipeline has substantial code size benefits, and no major effects on memory locality or performance. We measured memory locality using ktrace data, and consistently found that 10% fewer pages were needed to capture 95% of text page faults in key iOS benchmarks. We measured performance on frequency-stabilized iOS devices using LNT+externals. This reverses course on the decision made to schedule splitting late in r344869 (D53437). Differential Revision: https://reviews.llvm.org/D57082 llvm-svn: 352080
*	Update the file headers across all of the LLVM projects in the monorepo	Chandler Carruth	2019-01-19	1	-4/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	to reflect the new license. We understand that people may be surprised that we're moving the header entirely to discuss the new license. We checked this carefully with the Foundation's lawyer and we believe this is the correct approach. Essentially, all code in the project is now made available by the LLVM project under our new license, so you will see that the license headers include that license only. Some of our contributors have contributed code under our old license, and accordingly, we have retained a copy of our old license notice in the top-level files in each project and repository. llvm-svn: 351636
*	[SampleFDO] Skip profile reading when flattened profile used in ThinLTO postlink	Wei Mi	2019-01-17	1	-2/+15
\| \| \| \| \| \| \| \| \| \| \| \| \|	If the sample profile has no inlining hierachy information included, we call the sample profile is flattened. For flattened profile, in ThinLTO postlink phase, SampleProfileLoader's hot function inlining and profile annotation will do nothing, so it is better to save the effort to read in the profile and run the sample profile loader pass. It is helpful for reducing compile time when the flattened profile is huge. Differential Revision: https://reviews.llvm.org/D54819 llvm-svn: 351476
*	Fix a mistake in rL351392.	Wei Mi	2019-01-16	1	-1/+1
\| \| \| \| \| \|	PGOInstrGen should be initialized to "" instead of false. llvm-svn: 351397
*	[PGO] Make pgo related options in opt more consistent.	Wei Mi	2019-01-16	1	-17/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Currently we have pgo options defined in PassManagerBuilder.cpp only for instrument pgo, but not for sample pgo. We also have pgo options defined in NewPMDriver.cpp in opt only for new pass manager and for all kinds of pgo. They have some inconsistency. To make the options more consistent and make tests writing easier, the patch let old pass manager to share the same pgo options with new pass manager in opt, and removes the options in PassManagerBuilder.cpp. Differential Revision: https://reviews.llvm.org/D56749 llvm-svn: 351392
*	[ThinLTO] Handle chains of aliases	Teresa Johnson	2019-01-04	1	-7/+12
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	At -O0, globalopt is not run during the compile step, and we can have a chain of an alias having an immediate aliasee of another alias. The summaries are constructed assuming aliases in a canonical form (flattened chains), and as a result only the base object but no intermediate aliases were preserved. Fix by adding a pass that canonicalize aliases, which ensures each alias is a direct alias of the base object. Reviewers: pcc, davidxl Subscribers: mehdi_amini, inglorion, eraman, steven_wu, dexonsmith, arphaman, llvm-commits Differential Revision: https://reviews.llvm.org/D54507 llvm-svn: 350423
*	[LoopVectorize] Rename pass options. NFC.	Michael Kruse	2018-12-18	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Rename: NoUnrolling to InterleaveOnlyWhenForced and AlwaysVectorize to !VectorizeOnlyWhenForced Contrary to what the name 'AlwaysVectorize' suggests, it does not unconditionally vectorize all loops, but applies a cost model to determine whether vectorization is profitable to all loops. Hence, passing false will disable the cost model, except when a loop is marked with llvm.loop.vectorize.enable. The 'OnlyWhenForced' suffix (suggested by @hfinkel in D55716) better matches this behavior. Similarly, 'NoUnrolling' disables the profitability cost model for interleaving (a term to distinguish it from unrolling by the LoopUnrollPass); rename it for consistency. Differential Revision: https://reviews.llvm.org/D55785 llvm-svn: 349513
*	[LoopUnroll] Honor '#pragma unroll' even with -fno-unroll-loops.	Michael Kruse	2018-12-18	1	-15/+15
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When using clang with `-fno-unroll-loops` (implicitly added with `-O1`), the LoopUnrollPass is not not added to the (legacy) pass pipeline. This also means that it will not process any loop metadata such as llvm.loop.unroll.enable (which is generated by #pragma unroll or WarnMissedTransformationsPass emits a warning that a forced transformation has not been applied (see https://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20181210/610833.html). Such explicit transformations should take precedence over disabling heuristics. This patch unconditionally adds LoopUnrollPass to the optimizing pipeline (that is, it is still not added with `-O0`), but passes a flag indicating whether automatic unrolling is dis-/enabled. This is the same approach as LoopVectorize uses. The new pass manager's pipeline builder has no option to disable unrolling, hence the problem does not apply. Differential Revision: https://reviews.llvm.org/D55716 llvm-svn: 349509
*	[Unroll/UnrollAndJam/Vectorizer/Distribute] Add followup loop attributes.	Michael Kruse	2018-12-12	1	-0/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When multiple loop transformation are defined in a loop's metadata, their order of execution is defined by the order of their respective passes in the pass pipeline. For instance, e.g. #pragma clang loop unroll_and_jam(enable) #pragma clang loop distribute(enable) is the same as #pragma clang loop distribute(enable) #pragma clang loop unroll_and_jam(enable) and will try to loop-distribute before Unroll-And-Jam because the LoopDistribute pass is scheduled after UnrollAndJam pass. UnrollAndJamPass only supports one inner loop, i.e. it will necessarily fail after loop distribution. It is not possible to specify another execution order. Also,t the order of passes in the pipeline is subject to change between versions of LLVM, optimization options and which pass manager is used. This patch adds 'followup' attributes to various loop transformation passes. These attributes define which attributes the resulting loop of a transformation should have. For instance, !0 = !{!0, !1, !2} !1 = !{!"llvm.loop.unroll_and_jam.enable"} !2 = !{!"llvm.loop.unroll_and_jam.followup_inner", !3} !3 = !{!"llvm.loop.distribute.enable"} defines a loop ID (!0) to be unrolled-and-jammed (!1) and then the attribute !3 to be added to the jammed inner loop, which contains the instruction to distribute the inner loop. Currently, in both pass managers, pass execution is in a fixed order and UnrollAndJamPass will not execute again after LoopDistribute. We hope to fix this in the future by allowing pass managers to run passes until a fixpoint is reached, use Polly to perform these transformations, or add a loop transformation pass which takes the order issue into account. For mandatory/forced transformations (e.g. by having been declared by #pragma omp simd), the user must be notified when a transformation could not be performed. It is not possible that the responsible pass emits such a warning because the transformation might be 'hidden' in a followup attribute when it is executed, or it is not present in the pipeline at all. For this reason, this patche introduces a WarnMissedTransformations pass, to warn about orphaned transformations. Since this changes the user-visible diagnostic message when a transformation is applied, two test cases in the clang repository need to be updated. To ensure that no other transformation is executed before the intended one, the attribute `llvm.loop.disable_nonforced` can be added which should disable transformation heuristics before the intended transformation is applied. E.g. it would be surprising if a loop is distributed before a #pragma unroll_and_jam is applied. With more supported code transformations (loop fusion, interchange, stripmining, offloading, etc.), transformations can be used as building blocks for more complex transformations (e.g. stripmining+stripmining+interchange -> tiling). Reviewed By: hfinkel, dmgreen Differential Revision: https://reviews.llvm.org/D49281 Differential Revision: https://reviews.llvm.org/D55288 llvm-svn: 348944
*	[LTO] Load sample profile in LTO link step.	Xin Tong	2018-11-15	1	-0/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Load sample profile in LTO link step. ThinLTO calls populateModulePassManager to load the profile Reviewers: tejohnson, davidxl, danielcdh Subscribers: mehdi_amini, inglorion, steven_wu, dexonsmith, llvm-commits Differential Revision: https://reviews.llvm.org/D54564 llvm-svn: 346971
*	Schedule Hot Cold Splitting pass after most optimization passes	Aditya Kumar	2018-10-21	1	-3/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: In the new+old pass manager, hot cold splitting was schedule too early. Thanks to Vedant for pointing this out. Reviewers: sebpop, vsk Reviewed By: sebpop, vsk Subscribers: mehdi_amini, llvm-commits Differential Revision: https://reviews.llvm.org/D53437 llvm-svn: 344869
*	Temporarily revert "[GVNHoist] Re-enable GVNHoist by default"	Eric Christopher	2018-10-01	1	-2/+2
\| \| \| \| \| \| \| \| \|	This reverts commit r342387 as it's showing significant performance regressions in a number of benchmarks. Followed up with the committer and original thread with an example and will get performance numbers before recommitting. llvm-svn: 343522
*	Recommit r343308: [LoopInterchange] Turn into a loop pass.	Florian Hahn	2018-10-01	1	-4/+2
\| \| \| \|	llvm-svn: 343450
*	Revert "[LLVM-C] Add bindings for addCoroutinePassesToExtensionPoints"	whitequark	2018-09-28	1	-7/+0
\| \| \| \| \| \| \| \| \|	This reverts commit c4baf7c2f06ff5459c4f5998ce980346e72bff97. Broke the bots, and should really be in Transforms/Coroutines instead. llvm-svn: 343337
*	[LLVM-C] Add bindings for addCoroutinePassesToExtensionPoints	whitequark	2018-09-28	1	-0/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This patch adds bindings to C and Go for addCoroutinePassesToExtensionPoints, which is used to add coroutine passes to the correct locations in PassManagerBuilder. Reviewers: whitequark, deadalnix Reviewed By: whitequark Subscribers: mehdi_amini, modocache, llvm-commits Differential Revision: https://reviews.llvm.org/D51642 llvm-svn: 343336
*	Revert r343308: [LoopInterchange] Turn into a loop pass.	Florian Hahn	2018-09-28	1	-2/+4
\| \| \| \|	llvm-svn: 343310
*	[LoopInterchange] Turn into a loop pass.	Florian Hahn	2018-09-28	1	-4/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch turns LoopInterchange into a loop pass. It now only considers top-level loops and tries to move the innermost loop to the optimal position within the loop nest. By only looking at top-level loops, we might miss a few opportunities the function pass would get (e.g. if we have a loop nest of 3 loops, in the function pass we might process loops at level 1 and 2 and move the inner most loop to level 1, and then we process loops at levels 0, 1, 2 and interchange again, because we now have a different inner loop). But I think it would be better to handle such cases by picking the best inner loop from the start and avoid re-visiting the same loops again. The biggest advantage of it being a function pass is that it interacts nicely with the other loop passes. Without this patch, there are some performance regressions on AArch64 with loop interchanging enabled, where no loops were interchanged, but we missed out on some other loop optimizations. It also removes the SimplifyCFG run. We are just changing branches, so the CFG should not be more complicated, besides the additional 'unique' preheaders this pass might create. Reviewers: chandlerc, efriedma, mcrosier, javed.absar, xbolva00 Reviewed By: xbolva00 Differential Revision: https://reviews.llvm.org/D51702 llvm-svn: 343308
*	[GVNHoist] Re-enable GVNHoist by default	Alexandros Lamprineas	2018-09-17	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \|	Rebase rL341954 since https://bugs.llvm.org/show_bug.cgi?id=38912 has been fixed by rL342055. Precommit testing performed: * Overnight runs of csmith comparing the output between programs compiled with gvn-hoist enabled/disabled. * Bootstrap builds of clang with UbSan/ASan configurations. llvm-svn: 342387