bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
...
*	Rework linkInModule(), making it oblivious to ThinLTO	Mehdi Amini	2016-03-19	3	-72/+33
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: ThinLTO is relying on linkInModule to import selected functions. However a lot of "magic" was hidden in linkInModule and the IRMover, which would rename and promote global variables on the fly. This is moving to an approach where the steps are decoupled and the client is responsible for specifying the list of globals to import. As a consequence some tests are changed because they were relying on the previous behavior which was importing the definition of every single global without control on the client side. Now the burden is on the client to decide if a global has to be imported or not. Reviewers: tejohnson Subscribers: joker.eph, llvm-commits Differential Revision: http://reviews.llvm.org/D18122 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 263863
*	[CXX_FAST_TLS] Fix issues in ARM.	Manman Ren	2016-03-18	1	-2/+3
\| \| \| \| \| \| \| \| \|	We need to be careful on which registers can be explicitly handled via copies. Prologue, Epilogue use physical registers and if one belongs to the set of CSRsViaCopy, it will no longer be CSRed, since PEI overwrites it after the explicit copies. llvm-svn: 263857
*	[CXX_FAST_TLS] Disable tail call when calling conventions are mismatched.	Manman Ren	2016-03-18	3	-0/+21
\| \| \| \| \| \| \|	Since CXX_FAST_TLS has a bigger set of CSRs, we don't tail call when caller and callee have mismatched calling conventions. llvm-svn: 263856
*	[CXX_FAST_TLS] fix issues with O0 on ARM, AArch64 and X86.	Manman Ren	2016-03-18	3	-1/+3
\| \| \| \| \| \| \|	Since at O0, explicit copies via SplitCSR may not be removed even if they are unnecessary, we choose not to use SplitCSR at O0. llvm-svn: 263855
*	AArch64: Don't modify other modules in AArch64PromoteConstant	Duncan P. N. Exon Smith	2016-03-18	1	-148/+177
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Avoid modifying other modules in `AArch64PromoteConstant` when the constant is `ConstantData` (a horrible accident, I'm sure, caught by an experimental follow-up to r261464). Previously, this walked through all the users of a constant, but that reaches into other modules when the constant doesn't depend transitively on a `GlobalValue`! Since we're walking instructions anyway, just modify the instructions we actually see. As a drive-by, instead of storing `Use` and getting the instructions again via `Use::getUser()` (which is not a constant-time lookup), store `std::pair<Instruction, unsigned>`. Besides being cheaper, this makes it easier to drop use-lists from `ConstantData` in the future. (I threw this in because I was touching all the code anyway.) Because the patch completely changes the traversal logic, it looks like a rewrite of the pass, but the core logic is all the same (or should be, minus the out-of-module changes). In other words, there should be NFC as long as the LLVMContext only has a single Module. I didn't think of a good way to test this, but I hope to submit a patch eventually that makes walking these use-lists illegal/impossible. llvm-svn: 263853
*	[sancov] clang-formatting SanitizerCoverage.cpp and fully pleasing clang-tidy.	Mike Aizatsky	2016-03-18	1	-72/+78
\| \| \| \| \| \|	Differential Revision: http://reviews.llvm.org/D18288 llvm-svn: 263852
*	Revert "Revert "[sancov] specifying sanitizer coverage dependencies.""	Chandler Carruth	2016-03-18	1	-1/+7
\| \| \| \| \| \|	This reverts commit r263825, re-instating r263797. llvm-svn: 263847
*	[sancov] Fix the sancov pass to initialize itself inside its	Chandler Carruth	2016-03-18	1	-1/+3
\| \| \| \| \| \| \|	constructor. This should fix the recent crashes on certain architectures. llvm-svn: 263845
*	BPF: emit an error message for unsupported signed division operation	Alexei Starovoitov	2016-03-18	1	-0/+12
\| \| \| \| \| \|	Signed-off-by: Yonghong Song <yhs@plumgrid.com> Signed-off-by: Alexei Starovoitov <ast@fb.com> llvm-svn: 263842
*	Interface to get/set profile summary metadata to module	Easwaran Raman	2016-03-18	1	-0/+8
\| \| \| \| \| \|	Differential Revision: http://reviews.llvm.org/D17894 llvm-svn: 263835
*	[libFuzzer] add a flag close_fd_mask so that we can silence spammy targets ↵	Kostya Serebryany	2016-03-18	7	-1/+74
\| \| \| \| \| \|	by closing stderr/stdout llvm-svn: 263831
*	MILexer: Add ErrorCallbackType typedef; NFC	Matthias Braun	2016-03-18	1	-30/+22
\| \| \| \|	llvm-svn: 263829
*	[IndVars] Make the fix for PR26973 more obvious; NFCI	Sanjoy Das	2016-03-18	1	-3/+42
\| \| \| \|	llvm-svn: 263828
*	[IndVars] Pass the right loop to isLoopInvariantPredicate	Sanjoy Das	2016-03-18	1	-3/+2
\| \| \| \| \| \| \| \| \| \| \|	The loop on IVOperand's incoming values assumes IVOperand to be an induction variable on the loop over which `S Pred X` is invariant; otherwise loop invariant incoming values to IVOperand are not guaranteed to dominate the comparison. This fixes PR26973. llvm-svn: 263827
*	Revert "[sancov] specifying sanitizer coverage dependencies."	Mike Aizatsky	2016-03-18	1	-7/+1
\| \| \| \| \| \| \| \|	This fails on arm. This reverts commit 52c8e0f7119d1ea1050c0708565a8c92b73386d2. llvm-svn: 263825
*	AMDGPU: add missing braces around multi-line if block	Nicolai Haehnle	2016-03-18	1	-1/+2
\| \| \| \| \| \|	This fixes an issue with rL263658 pointed out by Tom Stellard. llvm-svn: 263823
*	[AArch64] Enable more load clustering in the MI Scheduler.	Chad Rosier	2016-03-18	3	-36/+116
\| \| \| \| \| \| \| \| \| \| \| \| \|	This patch adds unscaled loads and sign-extend loads to the TII getMemOpBaseRegImmOfs API, which is used to control clustering in the MI scheduler. This is done to create more opportunities for load pairing. I've also added the scaled LDRSWui instruction, which was missing from the scaled instructions. Finally, I've added support in shouldClusterLoads for clustering adjacent sext and zext loads that too can be paired by the load/store optimizer. Differential Revision: http://reviews.llvm.org/D18048 llvm-svn: 263819
*	[codeview] Only emit function ids for inlined functions	Reid Kleckner	2016-03-18	2	-54/+76
\| \| \| \| \| \| \|	We aren't referencing any other kind of function currently. Should save a bit on our debug info size. llvm-svn: 263817
*	[MCParser] Accept uppercase radix variants 0X and 0B	Colin LeMahieu	2016-03-18	2	-4/+4
\| \| \| \| \| \|	Differential Revision: http://reviews.llvm.org/D14781 llvm-svn: 263802
*	[sancov] specifying sanitizer coverage dependencies.	Mike Aizatsky	2016-03-18	1	-1/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: These dependencies would be used in the future to reduce the number of instrumented blocks(http://reviews.llvm.org/rL262103) This is submitted as a separate CL because of previous problems with ARM. Subscribers: aemerson Differential Revision: http://reviews.llvm.org/D18227 llvm-svn: 263797
*	AMDGPU: Overload return type of llvm.amdgcn.buffer.load.format	Nicolai Haehnle	2016-03-18	1	-36/+43
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Allow the selection of BUFFER_LOAD_FORMAT_x and _XY. Do this now before the frontend patches land in Mesa. Eventually, we may want to automatically reduce the size of loads at the LLVM IR level, which requires such overloads, and in some cases Mesa can generate them directly. Reviewers: tstellarAMD, arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D18255 llvm-svn: 263792
*	AMDGPU/SI: Add llvm.amdgcn.buffer.atomic.* intrinsics	Nicolai Haehnle	2016-03-18	3	-2/+187
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: These intrinsics expose the BUFFER_ATOMIC_* instructions and will be used by Mesa to implement atomics with buffer semantics. The intrinsic interface matches that of buffer.load.format and buffer.store.format, except that the GLC bit is not exposed (it is automatically deduced based on whether the return value is used). The change of hasSideEffects is required for TableGen to accept the pattern that matches the intrinsic. Reviewers: tstellarAMD, arsenm Subscribers: arsenm, rivanvx, llvm-commits Differential Revision: http://reviews.llvm.org/D18151 llvm-svn: 263791
*	AMDGPU: use ComplexPattern for offsets in llvm.amdgcn.buffer.load/store.format	Nicolai Haehnle	2016-03-18	3	-13/+110
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: We cannot easily deduce that an offset is in an SGPR, but the Mesa frontend cannot easily make use of an explicit soffset parameter either. Furthermore, it is likely that in the future, LLVM will be in a better position than the frontend to choose an SGPR offset if possible. Since there aren't any frontend uses of these intrinsics in upstream repositories yet, I would like to take this opportunity to change the intrinsic signatures to a single offset parameter, which is then selected to immediate offsets or voffsets using a ComplexPattern. Reviewers: arsenm, tstellarAMD, mareko Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D18218 llvm-svn: 263790
*	[AMDGPU] Assembler: Change dpp_ctrl syntax to match sp3	Sam Kolton	2016-03-18	2	-50/+95
\| \| \| \| \|	Review: http://reviews.llvm.org/D18267 llvm-svn: 263789
*	[Fuzzer] Guard no_sanitize_memory attributes behind __has_feature.	Benjamin Kramer	2016-03-18	1	-2/+10
\| \| \| \| \| \|	Otherwise GCC fails to build it because it doesn't know the attribute. llvm-svn: 263787
*	adding another optimization opportunity to readme file	Ehsan Amiri	2016-03-18	1	-0/+11
\| \| \| \|	llvm-svn: 263775
*	[libFuzzer] read corpus dirs recursively	Kostya Serebryany	2016-03-18	2	-14/+25
\| \| \| \|	llvm-svn: 263773
*	[LoopDataPrefetch] Add TTI to limit the number of iterations to prefetch ahead	Adam Nemet	2016-03-18	4	-0/+22
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: It can hurt performance to prefetch ahead too much. Be conservative for now and don't prefetch ahead more than 3 iterations on Cyclone. Reviewers: hfinkel Subscribers: llvm-commits, mzolotukhin Differential Revision: http://reviews.llvm.org/D17949 llvm-svn: 263772
*	[LoopDataPrefetch/Aarch64] Allow selective prefetching of large-strided accesses	Adam Nemet	2016-03-18	4	-0/+43
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: And use this TTI for Cyclone. As it was explained in the original RFC (http://thread.gmane.org/gmane.comp.compilers.llvm.devel/92758), the HW prefetcher works up to 2KB strides. I am also adding tests for this and the previous change (D17943): * Cyclone prefetching accesses with a large stride * Cyclone not prefetching accesses with a small stride * Generic Aarch64 subtarget not prefetching either Reviewers: hfinkel Subscribers: aemerson, rengolin, llvm-commits, mzolotukhin Differential Revision: http://reviews.llvm.org/D17945 llvm-svn: 263771
*	[Aarch64] Add pass LoopDataPrefetch for Cyclone	Adam Nemet	2016-03-18	3	-0/+34
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This wires up the pass for Cyclone but keeps it off for now because we need a few more TTIs. The getPrefetchMinStride value is not very well tuned right now but it works well with CFP2006/433.milc which motivated this. Tests will be added as part of the upcoming large-stride prefetching patch. Reviewers: t.p.northover Subscribers: llvm-commits, aemerson, hfinkel, rengolin Differential Revision: http://reviews.llvm.org/D17943 llvm-svn: 263770
*	[libFuzzer] improve -merge functionality	Kostya Serebryany	2016-03-18	6	-73/+101
\| \| \| \|	llvm-svn: 263769
*	DebugInfo: Add ability to not emit DW_AT_vtable_elem_location for virtual ↵	Peter Collingbourne	2016-03-17	3	-6/+10
\| \| \| \| \| \| \| \| \| \| \| \|	functions. A virtual index of -1u indicates that the subprogram's virtual index is unrepresentable (for example, when using the relative vtable ABI), so do not emit a DW_AT_vtable_elem_location attribute for it. Differential Revision: http://reviews.llvm.org/D18236 llvm-svn: 263765
*	[PPC, FastISel] Fix ordered/unordered fcmp	Tim Shen	2016-03-17	1	-7/+23
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	For fcmp, the major concern about the following 6 cases is the NaN result. The comparison result consists of 4 bits, indicating lt, eq, gt and un (unordered), only one of which will be set. The result is generated by the fcmpu instruction. However, the bc instruction only inspects one of the first 3 bits, so when un is set, the bc instruction may jump to an undesired place. More specifically, if we expect an unordered comparison and un is set, we expect to always go to the true branch; in such a case UEQ, UGT and ULT still give false, which is undesired; but UNE, UGE, ULE happen to give true, since they are tested by inspecting !eq, !lt, !gt, respectively. Similarly, for ordered comparison, when un is set, we always expect the result to be false. In such a case OGT, OLT and OEQ are good, since they are actually testing GT, LT, and EQ respectively, which are false. OGE, OLE and ONE are tested through !lt, !gt and !eq, and these are true. llvm-svn: 263753
*	[LoopVectorize] Annotate versioned loop with noalias metadata	Adam Nemet	2016-03-17	2	-23/+73
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Use the new LoopVersioning facility (D16712) to add noalias metadata in the vector loop if we versioned with memchecks. This can enable some optimization opportunities further down the pipeline (see the included test or the benchmark improvement quoted in D16712). The test also covers the bug I had in the initial version in D16712. The vectorizer did not previously use LoopVersioning. The reason is that the vectorizer performs its transformations in single shot. It creates an empty single-block vector loop that it then populates with the widened, if-converted instructions. Thus creating an intermediate versioned scalar loop seems wasteful. So this patch (rather than bringing in LoopVersioning fully) adds a special interface to LoopVersioning to allow the vectorizer to add no-alias annotation while still performing its own versioning. As the vectorizer propagates metadata from the instructions in the original loop to the vector instructions we also check the pointer in the original instruction and see if LoopVersioning can add no-alias metadata based on the issued memchecks. Reviewers: hfinkel, nadav, mzolotukhin Subscribers: mzolotukhin, llvm-commits Differential Revision: http://reviews.llvm.org/D17191 llvm-svn: 263744
*	[LoopVersioning] Annotate versioned loop with noalias metadata	Adam Nemet	2016-03-17	2	-0/+90
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: If we decide to version a loop to benefit a transformation, it makes sense to record the now non-aliasing accesses in the newly versioned loop. This allows non-aliasing information to be used by subsequent passes. One example is 456.hmmer in SPECint2006 where after loop distribution, we vectorize one of the newly distributed loops. To vectorize we version this loop to fully disambiguate may-aliasing accesses. If we add the noalias markers, we can use the same information in a later DSE pass to eliminate some dead stores which amounts to ~25% of the instructions of this hot memory-pipeline-bound loop. The overall performance improves by 18% on our ARM64. The scoped noalias annotation is added in LoopVersioning. The patch then enables this for loop distribution. A follow-on patch will enable it for the vectorizer. Eventually this should be run by default when versioning the loop but first I'd like to get some feedback whether my understanding and application of scoped noalias metadata is correct. Essentially my approach was to have a separate alias domain for each versioning of the loop. For example, if we first version in loop distribution and then in vectorization of the distributed loops, we have a different set of memchecks for each versioning. By keeping the scopes in different domains they can conveniently be defined independently since different alias domains don't affect each other. As written, I also have a separate domain for each loop. This is not necessary and we could save some metadata here by using the same domain across the different loops. I don't think it's a big deal either way. Probably the best is to review the tests first to see if I mapped this problem correctly to scoped noalias markers. I have plenty of comments in the tests. 
Note that the interface is prepared for the vectorizer which needs the annotateInstWithNoAlias API. The vectorizer does not use LoopVersioning so we need a way to pass in the versioned instructions. This is also why the maps have to become part of the object state. Also currently, we only have an AA-aware DSE after the vectorizer if we also run the LTO pipeline. Depending how widely this triggers we may want to schedule a DSE toward the end of the regular pass pipeline. Reviewers: hfinkel, nadav, ashutosh.nema Subscribers: mssimpso, aemerson, llvm-commits, mcrosier Differential Revision: http://reviews.llvm.org/D16712 llvm-svn: 263743
*	Bitcode: Error out instead of crashing on corrupt metadata	Justin Bogner	2016-03-17	1	-20/+38
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	I hit a crash in the bitcode reader on some corrupt input where an MDString had somehow been attached to an instruction instead of an MDNode. This input is pretty bogus, but we shouldn't be crashing on bad input here. This change adds error handling in all of the places where we currently have unchecked casts from Metadata to MDNode, which means we'll error out instead of crashing for that sort of input. Unfortunately, I don't have tests. Hitting this requires flipping bits in the input bitcode, and committing corrupt binary files to catch these cases is a bit too opaque and unmaintainable. llvm-svn: 263742
*	ARM: stop asserting on weird <3 x Ty> vectors in ISelLowering.	Tim Northover	2016-03-17	1	-2/+3
\| \| \| \|	llvm-svn: 263741
*	[libFuzzer] deprecate several flags	Kostya Serebryany	2016-03-17	7	-51/+10
\| \| \| \|	llvm-svn: 263739
*	[libFuzzer] add __attribute__((no_sanitize_memory)) to two functions that ↵	Kostya Serebryany	2016-03-17	1	-0/+2
\| \| \| \| \| \|	may be called from signal handler(s) or from msan. This will hopefully avoid msan false reports which I can't reproduce llvm-svn: 263737
*	[InstCombine] Combine A->B->A BitCast	Guozhi Wei	2016-03-17	2	-0/+107
\| \| \| \| \| \| \| \| \| \|	This patch enhances InstCombine to handle the following case: A -> B bitcast PHI B -> A bitcast llvm-svn: 263734
*	[Statepoints] Export a magic constant into a header; NFC	Sanjoy Das	2016-03-17	1	-1/+1
\| \| \| \|	llvm-svn: 263733
*	[PowerPC] Disable CTR loops optimization for soft float operations	Petar Jovanovic	2016-03-17	1	-0/+19
\| \| \| \| \| \| \| \| \| \| \| \|	This patch prevents CTR loops optimization when using soft float operations inside loop body. Soft float operations use function calls, but function calls are not allowed inside CTR optimized loops. Patch by Aleksandar Beserminji. Differential Revision: http://reviews.llvm.org/D17600 llvm-svn: 263727
*	[WebAssembly] Stackify code emitted by eliminateFrameIndex and SP writeback	Derek Schuff	2016-03-17	2	-19/+85
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: MRI::eliminateFrameIndex can emit several instructions to do address calculations; these can usually be stackified. Because instructions with FI operands can have subsequent operands which may be expression trees, find the top of the leftmost tree and insert the code before it, to keep the LIFO property. Also use stackified registers when writing back the SP value to memory in the epilog; it's unnecessary because SP will not be used after the epilog, and it results in better code. Differential Revision: http://reviews.llvm.org/D18234 llvm-svn: 263725
*	[COFF] Refactor section alignment calculation	David Majnemer	2016-03-17	1	-1/+1
\| \| \| \| \| \| \|	Section alignment isn't completely trivial, let it live in one place so that we may reuse it in LLVM. llvm-svn: 263722
*	Forgot to commit this with r263692	David Majnemer	2016-03-17	1	-1/+1
\| \| \| \|	llvm-svn: 263721
*	AMDGPU/SI: Do not generate s_waitcnt after ds_permute/ds_bpermute	Changpeng Fang	2016-03-17	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: ds_permute/ds_bpermute do not read memory so s_waitcnt is not needed. Reviewers: arsenm, tstellarAMD Subscribers: llvm-commits, arsenm Differential Revision: http://reviews.llvm.org/D18197 llvm-svn: 263720
*	AMDGPU: mark atomic instructions as sources of divergence	Nicolai Haehnle	2016-03-17	1	-0/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: As explained by the comment, threads will typically see different values returned by atomic instructions even if the arguments are equal. Reviewers: arsenm, tstellarAMD Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D18156 llvm-svn: 263719
*	[X86][SSE] Simplified blend-with-zero combining	Simon Pilgrim	2016-03-17	1	-14/+13
\| \| \| \| \| \| \| \|	We were being too aggressive in trying to combine a shuffle into a blend-with-zero pattern, often resulting in an endless loop of contrasting combines. This patch stops the combine if we already have a blend in place (means we miss some domain corrections). llvm-svn: 263717
*	propagate 'unpredictable' metadata on select instructions	Sanjay Patel	2016-03-17	1	-4/+3
\| \| \| \| \| \| \| \| \| \| \| \|	This is similar to D18133 where we allowed profile weights on select instructions. This extends that change to also allow the 'unpredictable' attribute of branches to apply to selects. A test to check that 'unpredictable' metadata is preserved when cloning instructions was checked in at: http://reviews.llvm.org/rL263648 Differential Revision: http://reviews.llvm.org/D18220 llvm-svn: 263716
*	ARM: Revert SVN r253865, 254158, fix windows division	Saleem Abdulrasool	2016-03-17	1	-7/+18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The two changes together weakened the test and caused a regression with division handling in MSVC mode. They were applied to avoid an assertion being triggered in the block frequency analysis. However, the underlying problem was simply being masked rather than solved properly. Address the actual underlying problem and revert the changes. Rather than analyze the cause of the assertion, the division failure was assumed to be an overflow. The underlying issue was a subtle bug in the BB construction in the emission of the div-by-zero check (WIN__DBZCHK). We did not construct the proper successor information in the basic blocks, nor did we update the PHIs associated with the basic block when we split them. This would result in assertions being triggered in the block frequency analysis pass. Although the original tests are being removed, the tests themselves performed very little in terms of validation but merely tested that we did not assert when generating code. Update this with new tests that actually ensure that we do not regress on the code generation. llvm-svn: 263714