bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
...
*	[LoopDataPrefetch/Aarch64] Allow selective prefetching of large-strided accesses	Adam Nemet	2016-03-18	1	-0/+25
	Summary: And use this TTI for Cyclone. As it was explained in the original RFC (http://thread.gmane.org/gmane.comp.compilers.llvm.devel/92758), the HW prefetcher works up to 2KB strides. I am also adding tests for this and the previous change (D17943): * Cyclone prefetching accesses with a large stride * Cyclone not prefetching accesses with a small stride * Generic Aarch64 subtarget not prefetching either Reviewers: hfinkel Subscribers: aemerson, rengolin, llvm-commits, mzolotukhin Differential Revision: http://reviews.llvm.org/D17945 llvm-svn: 263771
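The stride rule described in this message can be sketched as a small predicate. This is a minimal illustration, not the pass's actual code: the function name, the constant, and the choice to treat a stride of exactly 2KB as worth prefetching are all assumptions made here.

```cpp
#include <cstdint>

// Hypothetical threshold: the message says the hardware prefetcher copes with
// strides up to 2KB, so software prefetching is only considered beyond that.
constexpr std::int64_t kMinPrefetchStrideBytes = 2048;

// Returns true when a software prefetch looks worthwhile for this stride.
// A negative stride (walking backwards through memory) is judged by magnitude.
// Treating a stride of exactly minStride as "large" is an assumption.
bool shouldPrefetchStride(std::int64_t strideBytes,
                          std::int64_t minStride = kMinPrefetchStrideBytes) {
  const std::int64_t magnitude = strideBytes < 0 ? -strideBytes : strideBytes;
  return magnitude >= minStride;
}
```

With these assumptions a 4096-byte stride would be prefetched, while a 64-byte stride would be left to the hardware.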
*	[LoopVectorize] Annotate versioned loop with noalias metadata	Adam Nemet	2016-03-17	2	-23/+73
	Summary: Use the new LoopVersioning facility (D16712) to add noalias metadata in the vector loop if we versioned with memchecks. This can enable some optimization opportunities further down the pipeline (see the included test or the benchmark improvement quoted in D16712). The test also covers the bug I had in the initial version in D16712. The vectorizer did not previously use LoopVersioning. The reason is that the vectorizer performs its transformations in a single shot. It creates an empty single-block vector loop that it then populates with the widened, if-converted instructions. Thus creating an intermediate versioned scalar loop seems wasteful. So this patch (rather than bringing in LoopVersioning fully) adds a special interface to LoopVersioning to allow the vectorizer to add no-alias annotation while still performing its own versioning. As the vectorizer propagates metadata from the instructions in the original loop to the vector instructions we also check the pointer in the original instruction and see if LoopVersioning can add no-alias metadata based on the issued memchecks. Reviewers: hfinkel, nadav, mzolotukhin Subscribers: mzolotukhin, llvm-commits Differential Revision: http://reviews.llvm.org/D17191 llvm-svn: 263744
*	[LoopVersioning] Annotate versioned loop with noalias metadata	Adam Nemet	2016-03-17	2	-0/+90
	Summary: If we decide to version a loop to benefit a transformation, it makes sense to record the now non-aliasing accesses in the newly versioned loop. This allows non-aliasing information to be used by subsequent passes. One example is 456.hmmer in SPECint2006 where after loop distribution, we vectorize one of the newly distributed loops. To vectorize we version this loop to fully disambiguate may-aliasing accesses. If we add the noalias markers, we can use the same information in a later DSE pass to eliminate some dead stores which amounts to ~25% of the instructions of this hot memory-pipeline-bound loop. The overall performance improves by 18% on our ARM64. The scoped noalias annotation is added in LoopVersioning. The patch then enables this for loop distribution. A follow-on patch will enable it for the vectorizer. Eventually this should be run by default when versioning the loop but first I'd like to get some feedback whether my understanding and application of scoped noalias metadata is correct. Essentially my approach was to have a separate alias domain for each versioning of the loop. For example, if we first version in loop distribution and then in vectorization of the distributed loops, we have a different set of memchecks for each versioning. By keeping the scopes in different domains they can conveniently be defined independently since different alias domains don't affect each other. As written, I also have a separate domain for each loop. This is not necessary and we could save some metadata here by using the same domain across the different loops. I don't think it's a big deal either way. Probably the best is to review the tests first to see if I mapped this problem correctly to scoped noalias markers. I have plenty of comments in the tests. 
Note that the interface is prepared for the vectorizer which needs the annotateInstWithNoAlias API. The vectorizer does not use LoopVersioning so we need a way to pass in the versioned instructions. This is also why the maps have to become part of the object state. Also currently, we only have an AA-aware DSE after the vectorizer if we also run the LTO pipeline. Depending on how widely this triggers we may want to schedule a DSE toward the end of the regular pass pipeline. Reviewers: hfinkel, nadav, ashutosh.nema Subscribers: mssimpso, aemerson, llvm-commits, mcrosier Differential Revision: http://reviews.llvm.org/D16712 llvm-svn: 263743
*	[InstCombine] Combine A->B->A BitCast	Guozhi Wei	2016-03-17	2	-0/+107
	This patch enhances InstCombine to handle the following case: A -> B bitcast PHI B -> A bitcast llvm-svn: 263734
*	[Statepoints] Export a magic constant into a header; NFC	Sanjoy Das	2016-03-17	1	-1/+1
	llvm-svn: 263733
*	propagate 'unpredictable' metadata on select instructions	Sanjay Patel	2016-03-17	1	-4/+3
	This is similar to D18133 where we allowed profile weights on select instructions. This extends that change to also allow the 'unpredictable' attribute of branches to apply to selects. A test to check that 'unpredictable' metadata is preserved when cloning instructions was checked in at: http://reviews.llvm.org/rL263648 Differential Revision: http://reviews.llvm.org/D18220 llvm-svn: 263716
*	[Statepoints] Separate out logic for statepoint directives; NFC	Sanjoy Das	2016-03-17	1	-12/+8
	This splits out the logic that maps the `"statepoint-id"` attribute into the actual statepoint ID, and the `"statepoint-num-patch-bytes"` attribute into the number of patchable bytes the statepoint is lowered into. The new home of this logic is in IR/Statepoint.cpp, and this refactoring will support similar functionality when lowering calls with deopt operand bundles in the future. llvm-svn: 263685
*	[SLP] Make DataLayout a member variable.	Chad Rosier	2016-03-16	1	-38/+35
	llvm-svn: 263656
*	Revert "[LSR] Create fewer redundant instructions."	Geoff Berry	2016-03-16	1	-22/+20
	This reverts commit r263644. Investigating bootstrap failures. llvm-svn: 263655
*	[msan] Add a comment with a bug link.	Evgeniy Stepanov	2016-03-16	1	-0/+3
	llvm-svn: 263645
*	[LSR] Create fewer redundant instructions.	Geoff Berry	2016-03-16	1	-20/+22
	Summary: Fix LSRInstance::HoistInsertPosition() to check the original insert position block first for a canonical insertion point that is dominated by all inputs. This leads to SCEV being able to reuse more instructions since it currently tracks the instructions it creates for reuse by keeping a table of <Value, insert point> pairs. Reviewers: atrick Subscribers: mcrosier, mzolotukhin, llvm-commits Differential Revision: http://reviews.llvm.org/D18001 llvm-svn: 263644
*	[JumpThreading] See through Cast Instructions	Haicheng Wu	2016-03-16	1	-0/+19
	To capture more jump-threading opportunities. llvm-svn: 263618
*	Revert "[JumpThreading] Simplify Instructions first in ComputeValueKnownInPredecessors()"	Haicheng Wu	2016-03-15	1	-35/+20
	Not sure it handles undef properly. llvm-svn: 263605
*	Turn LoopLoadElimination on again	Adam Nemet	2016-03-15	1	-2/+2
	The latent bug that LLE exposed in the LoopVectorizer was resolved (PR26952). The pass can be disabled with -mllvm -enable-loop-load-elim=0 llvm-svn: 263595
*	Also handle the new Rust pers fn to isCatchAll()	Bjorn Steinbrink	2016-03-15	1	-2/+3
	llvm-svn: 263585
*	[msan] Don't put module constructors in comdats.	Evgeniy Stepanov	2016-03-15	1	-3/+10
	There is something strange going on with debug info (.eh_frame_hdr) disappearing when msan.module_ctor are placed in comdat sections. Moving this functionality under a flag, disabled by default. llvm-svn: 263579
*	[LV] Preserve LoopInfo when store predication is used	Adam Nemet	2016-03-15	2	-6/+11
	This was a latent bug that got exposed by the change to add LoopSimplify as a dependence to LoopLoadElimination. Since LoopInfo was corrupted after LV, LoopSimplify mis-compiled nbench in the test-suite (more details in the PR). The problem was that when we create the blocks for predicated stores we didn't add those to any loops. The original testcase for store predication provides coverage for this assuming we verify LI on the way out of LV. Fixes PR26952. llvm-svn: 263565
*	[GlobalOpt] Don't look through aliases when sorting names of globals.	Benjamin Kramer	2016-03-15	1	-2/+3
	If both are different aliases to the same value the sorting becomes non-deterministic as array_pod_sort is not stable. llvm-svn: 263550
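The determinism problem described here can be illustrated outside LLVM. The sketch below uses `std::sort`, which, like `array_pod_sort`, gives no stability guarantee. The `NamedGlobal` struct and `sortByOwnName` are hypothetical stand-ins, and names are assumed to be unique.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical stand-in for a global: its own name plus the name of the value
// it resolves to once aliases are looked through.
struct NamedGlobal {
  std::string name;
  std::string resolved;
};

// Sorting by `resolved` would make two aliases of the same value compare as
// equal, so an unstable sort may leave them in either order. Comparing the
// alias's own (assumed unique) name gives a total order and a fixed result.
void sortByOwnName(std::vector<NamedGlobal> &globals) {
  std::sort(globals.begin(), globals.end(),
            [](const NamedGlobal &a, const NamedGlobal &b) {
              return a.name < b.name;
            });
}
```

Whatever order the two aliases arrive in, the sorted output is the same, which is the property the commit restores.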
*	[SLP] Update comment to reflect reality. NFC.	Chad Rosier	2016-03-15	1	-2/+2
	llvm-svn: 263548
*	Use some braces to format this a little better.	Eric Christopher	2016-03-15	1	-5/+9
	llvm-svn: 263527
*	Fix llvm/llvm/lib/Transforms/Utils/LoopUnroll.cpp:285:53: error: suggest	Eric Christopher	2016-03-15	1	-11/+9
	parentheses around '&&' within '||' [-Werror=parentheses]. llvm-svn: 263525
*	Move global ID computation from Function to GlobalValue (NFC)	Teresa Johnson	2016-03-15	1	-1/+1
	Since the static getGlobalIdentifier and getGUID methods are now called for global values other than functions, reflect that by moving these methods to the GlobalValue class. llvm-svn: 263524
*	[ThinLTO] Renaming of function index to module summary index (NFC)	Teresa Johnson	2016-03-15	3	-30/+31
	(Resubmitting after fixing missing file issue) With the changes in r263275, there are now more than just functions in the summary. Completed the renaming of data structures (started in r263275) to reflect the wider scope. In particular, changed the FunctionIndex* data structures to ModuleIndex*, and renamed related variables and comments. Also renamed the files to reflect the changes. A companion clang patch will immediately succeed this patch to reflect this renaming. llvm-svn: 263513
*	[LoopUnroll] Respect the convergent attribute.	Justin Lebar	2016-03-14	2	-5/+59
	Summary: Specifically, when we perform runtime loop unrolling of a loop that contains a convergent op, we can only unroll k times, where k divides the loop trip multiple. Without this change, we'll happily unroll e.g. the following loop for (int i = 0; i < N; ++i) { if (i == 0) convergent_op(); foo(); } into int i = 0; if (N % 2 == 1) { convergent_op(); foo(); ++i; } for (; i < N - 1; i += 2) { if (i == 0) convergent_op(); foo(); foo(); }. This is unsafe, because we've just added a control-flow dependency to the convergent op in the prelude. In general, runtime unrolling loops that contain convergent ops is safe only if we don't have to emit a prelude, which occurs when the unroll count divides the trip multiple. Reviewers: resistor Subscribers: llvm-commits, mzolotukhin Differential Revision: http://reviews.llvm.org/D17526 llvm-svn: 263509
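The divisibility condition in this message can be sketched as a helper that picks an unroll count needing no prelude. The function name and the search strategy (the largest divisor not above the requested count) are assumptions for illustration; the actual patch may choose the count differently.

```cpp
#include <algorithm>

// Hypothetical helper: returns the largest unroll count k <= requested such
// that k divides tripMultiple. With such a k no remainder iterations exist, so
// no prelude is emitted and the convergent op gains no new control dependency.
// Falls back to 1 (no unrolling) when nothing larger divides evenly.
unsigned largestSafeUnrollCount(unsigned requested, unsigned tripMultiple) {
  if (requested == 0 || tripMultiple == 0)
    return 1;
  for (unsigned k = std::min(requested, tripMultiple); k > 1; --k)
    if (tripMultiple % k == 0)
      return k;
  return 1;
}
```

For a trip multiple of 6 and a requested count of 4, this sketch settles on 3; for a trip multiple of 1 (nothing known about the trip count) it refuses to unroll.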
*	Improve load to store => memcpy	Amaury Sechet	2016-03-14	1	-18/+99
	Summary: This now tries to reorder instructions in order to help create the optimizable pattern. Reviewers: craig.topper, spatel, dexonsmith, Prazek, chandlerc, joker.eph, majnemer Differential Revision: http://reviews.llvm.org/D16523 llvm-svn: 263503
*	Revert "[ThinLTO] Renaming of function index to module summary index (NFC)"	Teresa Johnson	2016-03-14	3	-31/+30
	This reverts commit r263490. Missed a file. llvm-svn: 263493
*	[ThinLTO] Renaming of function index to module summary index (NFC)	Teresa Johnson	2016-03-14	3	-30/+31
	With the changes in r263275, there are now more than just functions in the summary. Completed the renaming of data structures (started in r263275) to reflect the wider scope. In particular, changed the FunctionIndex* data structures to ModuleIndex*, and renamed related variables and comments. Also renamed the files to reflect the changes. A companion clang patch will immediately succeed this patch to reflect this renaming. llvm-svn: 263490
*	Revert "Turn LoopLoadElimination on again"	Adam Nemet	2016-03-14	1	-2/+2
	This reverts commit r263472. There is an LNT failure on clang-ppc64be-linux-lnt. Turn this off, while I am investigating. llvm-svn: 263485
*	allow branch weight metadata on select instructions (PR26636)	Sanjay Patel	2016-03-14	1	-1/+2
	As noted in: https://llvm.org/bugs/show_bug.cgi?id=26636 This doesn't accomplish anything on its own. It's the first step towards preserving and using branch weights with selects. The next step would be to make sure we're propagating the info in all of the other places where we create selects (SimplifyCFG, InstCombine, etc). I don't think there's an easy fix to make this happen; we have to look at each transform individually to determine how to correctly propagate the weights. Along with that step, we need to then use the weights when making subsequent transform decisions such as discussed in http://reviews.llvm.org/D16836. The inliner test is independent but closely related. It verifies that metadata is preserved when both branches and selects are cloned. Differential Revision: http://reviews.llvm.org/D18133 llvm-svn: 263482
*	[attrs] Handle convergent CallSites.	Justin Lebar	2016-03-14	2	-40/+44
	Summary: Previously we had a notion of convergent functions but not of convergent calls. This is insufficient to correctly analyze calls where the target is unknown, e.g. indirect calls. Now a call is convergent if it targets a known-convergent function, or if it's explicitly marked as convergent. As usual, we can remove convergent where we can prove that no convergent operations are performed in the call. Originally landed as r261544, then reverted in r261544 for (incidental) build breakage. Re-landed here with no changes. Reviewers: chandlerc, jingyue Subscribers: llvm-commits, tra, jhen, hfinkel Differential Revision: http://reviews.llvm.org/D17739 llvm-svn: 263481
*	[SLPVectorizer] Fix dependency list	Keno Fischer	2016-03-14	1	-0/+1
	Summary: DemandedBits was added to the requirements of SLPVectorizer in rL261212 (and various earlier versions of it), but the appropriate initialization statement was accidentally forgotten. Ref [[ https://github.com/JuliaLang/julia/issues/14998 | JuliaLang/julia#14998 ]]. Patch by Yichao Yu. Reviewers: mssimpso Differential Revision: http://reviews.llvm.org/D18152 llvm-svn: 263476
*	Turn LoopLoadElimination on again	Adam Nemet	2016-03-14	1	-2/+2
	The two issues that were discovered got fixed (r263058, r263173). The pass can be disabled with -mllvm -enable-loop-load-elim=0 llvm-svn: 263472
*	[CVP] Replace nonnegative with positive, per Philip's request. NFC.	Chad Rosier	2016-03-14	1	-2/+2
	llvm-svn: 263430
*	[CVP] Convert an SDiv to a UDiv if both operands are known to be nonnegative	Haicheng Wu	2016-03-14	1	-0/+41
	The motivating example is this: for (j = n; j > 1; j = i) { i = j / 2; } The signed division can safely be changed to an unsigned division (j is known to be larger than 1 from the loop guard) and later turned into a single shift without considering the sign bit. llvm-svn: 263406
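The equivalence this transform relies on can be checked with a small standalone sketch. The function names are illustrative and not taken from the patch; the loop mirrors the motivating example from the message.

```cpp
#include <cstdint>

// Signed division by two, as written in the source loop.
std::int32_t halveSigned(std::int32_t j) { return j / 2; }

// What the optimizer can emit once j is known to be nonnegative: an unsigned
// division by two, which is a single logical shift right.
std::int32_t halveViaShift(std::int32_t j) {
  return static_cast<std::int32_t>(static_cast<std::uint32_t>(j) >> 1);
}

// The motivating loop. The guard j > 1 ensures j is positive in the body, so
// either halving function yields the same sequence of values.
unsigned countIterations(std::int32_t n, bool useShift) {
  unsigned iterations = 0;
  for (std::int32_t j = n, i = 0; j > 1; j = i) {
    i = useShift ? halveViaShift(j) : halveSigned(j);
    ++iterations;
  }
  return iterations;
}
```

The two halving functions agree for every nonnegative input but differ for negative ones (signed division rounds toward zero, while the shift treats the sign bit as a value bit), which is why the range information from the loop guard is needed.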
*	Remove PreserveNames template parameter from IRBuilder	Mehdi Amini	2016-03-13	8	-21/+20
	This reapplies r263258, which was reverted in r263321 because of issues on Clang side. From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 263393
*	remove unnecessary cast; NFC	Sanjay Patel	2016-03-12	1	-4/+3
	llvm-svn: 263343
*	fix formatting; NFC	Sanjay Patel	2016-03-12	1	-12/+12
	llvm-svn: 263342
*	use range loops; NFCI	Sanjay Patel	2016-03-12	1	-21/+18
	llvm-svn: 263341
*	[x86, InstCombine] delete x86 SSE2 masked store with zero mask	Sanjay Patel	2016-03-12	1	-0/+6
	This follows up on the related AVX instruction transforms, but this one is too strange to do anything more with. Intel's behavioral description of this instruction in its Software Developer's Manual is tragi-comic. llvm-svn: 263340
*	Temporarily revert:	Eric Christopher	2016-03-12	8	-20/+21
	commit ae14bf6488e8441f0f6d74f00455555f6f3943ac Author: Mehdi Amini <mehdi.amini@apple.com> Date: Fri Mar 11 17:15:50 2016 +0000 Remove PreserveNames template parameter from IRBuilder Summary: Following r263086, we are now relying on a flag on the Context to discard Value names in release builds. Reviewers: chandlerc Subscribers: mzolotukhin, llvm-commits Differential Revision: http://reviews.llvm.org/D18023 From: Mehdi Amini <mehdi.amini@apple.com> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263258 91177308-0d34-0410-b5e6-96231b3b80d8 until we can figure out what to do about clang and Release build testing. This reverts commit 263258. llvm-svn: 263321
*	[MemorySSA] Make a return type reflect reality. NFC.	George Burgess IV	2016-03-11	1	-11/+12
	llvm-svn: 263286
*	Introduce @llvm.experimental.deoptimize	Sanjoy Das	2016-03-11	1	-1/+64
	Summary: This intrinsic, together with deoptimization operand bundles, allows frontends to express transfer of control and frame-local state from one (typically more specialized, hence faster) version of a function into another (typically more generic, hence slower) version. In languages with a fully integrated managed runtime this intrinsic can be used to implement "uncommon trap" like functionality. In unmanaged languages like C and C++, this intrinsic can be used to represent the slow paths of specialized functions. Note: this change does not address how `@llvm.experimental.deoptimize` is lowered. That will be done in a later change. Reviewers: chandlerc, rnk, atrick, reames Subscribers: llvm-commits, kmod, mjacob, maksfb, mcrosier, JosephTremoulet Differential Revision: http://reviews.llvm.org/D17732 llvm-svn: 263281
*	[PGO] Skip value profile instrumentation of inline asm	Vedant Kumar	2016-03-11	1	-1/+1
	Value profile instrumentation treats inline asm calls like they are indirect calls. This causes problems when the 'Callee' is passed to a ptrtoint cast -- the verifier rightly claims that this is bogus and crashes opt. llvm-svn: 263278
*	[ThinLTO] Support for reference graph in per-module and combined summary.	Teresa Johnson	2016-03-11	1	-2/+2
	Summary: This patch adds support for including a full reference graph including call graph edges and other GV references in the summary. The reference graph edges can be used to make importing decisions without materializing any source modules, can be used in the plugin to make file staging decisions for distributed build systems, and is expected to have other uses. The call graph edges are recorded in each function summary in the bitcode via a list of <CalleeValueId, StaticCount> pairs when no PGO data exists, or <CalleeValueId, StaticCount, ProfileCount> tuples when there is PGO, where the ValueId can be mapped to the function GUID via the ValueSymbolTable. In the function index in memory, the call graph edges reference the target via the CalleeGUID instead of the CalleeValueId. The reference graph edges are recorded in each summary record with a list of referenced value IDs, which can be mapped to value GUID via the ValueSymbolTable. Additionally, a new summary record type is added to record references from global variable initializers. A number of bitcode records and data structures have been renamed to reflect the newly expanded scope of the summary beyond functions. More cleanup will follow. Reviewers: joker.eph, davidxl Subscribers: joker.eph, llvm-commits Differential Revision: http://reviews.llvm.org/D17212 llvm-svn: 263275
*	Remove PreserveNames template parameter from IRBuilder	Mehdi Amini	2016-03-11	8	-21/+20
	Summary: Following r263086, we are now relying on a flag on the Context to discard Value names in release builds. Reviewers: chandlerc Subscribers: mzolotukhin, llvm-commits Differential Revision: http://reviews.llvm.org/D18023 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 263258
*	Do not specialize IRBuilder to strip names in SROA	Mehdi Amini	2016-03-11	1	-22/+10
	Summary: Following r263086, we are replacing this by a runtime check. More cleanup will follow on the IRBuilder itself, but I submitted this patch separately as SROA has a fancy "prefixInserter" class that needs extra-love. Reviewers: chandlerc Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D18022 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 263256
*	[PM] Sink the "Expression" type for GVN into the class as a private	Chandler Carruth	2016-03-11	1	-11/+11
	member type. Because of how this type is used by the ValueTable, it cannot actually have hidden visibility. GCC actually nicely warns about this but Clang just silently ... I don't even know. =/ We should do a better job either way though. This should resolve a bunch of the GCC warnings about visibility that the port of GVN triggered and make the visibility story a bit more correct. llvm-svn: 263250
*	[PM] The order of evaluation of these analyses is actually significant,	Chandler Carruth	2016-03-11	1	-5/+10
	much to my horror, so use variables to fix it in place. This terrifies me. Both basic-aa and memdep will provide more precise information when the domtree and/or the loop info is available. Because of this, if your pass (like GVN) requires domtree, and then queries memdep or basic-aa, it will get more precise results. If it does this in the other order, it gets less precise results. All of the ideas I have for fixing this are, essentially, terrible. Here I've just caused us to stop having unspecified behavior as different implementations evaluate the order of these arguments differently. I'm actually rather glad that they do, or the fragility of memdep and basic-aa would have gone on unnoticed. I've left comments so we don't immediately break this again. This should fix bots whose host compilers evaluate the order of arguments differently from Clang. llvm-svn: 263231
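The fix described here, using named variables so that the order of the analysis queries no longer depends on the host compiler, can be sketched in plain C++. All names below are hypothetical stand-ins for the analysis queries; only the ordering technique is the point.

```cpp
#include <string>
#include <vector>

// Records the order in which the (hypothetical) analyses were requested.
std::vector<std::string> &callLog() {
  static std::vector<std::string> log;
  return log;
}

int getDomTree() { callLog().push_back("domtree"); return 1; }
int getMemDep() { callLog().push_back("memdep"); return 2; }

int runImpl(int memDep, int domTree) { return memDep * 10 + domTree; }

// Writing runImpl(getMemDep(), getDomTree()) would leave the order of the two
// calls unspecified, so different compilers may disagree. Separate statements
// are sequenced one after the other, pinning domtree first.
int runWithFixedOrder() {
  callLog().clear();
  int domTree = getDomTree(); // always requested first
  int memDep = getMemDep();   // sees that domtree already exists
  return runImpl(memDep, domTree);
}
```

The cost is two extra local variables; the benefit is that every host compiler produces the same query order.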
*	[PM] Make the AnalysisManager parameter to run methods a reference.	Chandler Carruth	2016-03-11	7	-28/+28
	This was originally a pointer to support pass managers which didn't use AnalysisManagers. However, that doesn't realistically come up much and the complexity of supporting it doesn't really make sense. In fact, many parts of the pass manager were just assuming the pointer was never null already. This at least makes it much more explicit and clear. llvm-svn: 263219
*	[InstCombine] Use Twines to generate names.	Benjamin Kramer	2016-03-11	1	-15/+5
	Since the names are used in a loop this does more work in debug builds. In release builds value names are generally discarded so we don't have to do the concatenation at all. It's also simpler code, no functional change intended. llvm-svn: 263215