bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
*	Xray docs with description of Flight Data Recorder binary format.	Keith Wyss	2017-08-02	4	-5/+411
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Adding a new restructuredText file to document the trace format produced with an FDR mode handler and read by llvm-xray toolset. Fixed two problems in the documentation from differential review. One bad table and a missing link in the toc. Original commit was e97c5836a77db803fe53319c53f3bf8e8b26d2b7. Reviewers: dberris, pelikan Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D36041 llvm-svn: 309891
*	LV: Don't insert runtime ptr checks on divergent targets	Matt Arsenault	2017-08-02	2	-0/+41
\| \| \| \|	llvm-svn: 309890
*	[libFuzzer tests] Use substring comparison in libFuzzer tests	George Karpenkov	2017-08-02	1	-1/+1
\| \| \| \| \| \| \| \| \|	LIT launches executables with absolute, and not relative, path. strncmp would try to do exact comparison and fail. Differential Revision: https://reviews.llvm.org/D36242 llvm-svn: 309889
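A minimal Python sketch of why a comparison that starts at the first byte stops matching once the harness passes an absolute path. The binary name and path are hypothetical (the commit message does not name them), and `startswith` is only a stand-in for the C `strncmp` check.

```python
# Hypothetical name: the real test binary is not given in the commit message.
EXPECTED_NAME = "fuzzer-test"


def prefix_match(argv0: str) -> bool:
    # Stand-in for strncmp(argv0, EXPECTED_NAME, n): compares from byte 0.
    return argv0.startswith(EXPECTED_NAME)


def substring_match(argv0: str) -> bool:
    # Stand-in for a substring search: the name may appear anywhere in the path.
    return EXPECTED_NAME in argv0


relative = "fuzzer-test"               # how the name looks with a relative launch
absolute = "/build/tests/fuzzer-test"  # hypothetical absolute path, as LIT passes it
```

With these inputs the prefix check accepts only the relative form, while the substring check accepts both.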
*	[InstCombine] Remove unnecessary temporary APInt. NFCI	Craig Topper	2017-08-02	1	-6/+1
\| \| \| \|	llvm-svn: 309887
*	[PM] Split LoopUnrollPass and make partial unroller a function pass	Teresa Johnson	2017-08-02	19	-114/+180
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This is largely NFC, in preparation for utilizing ProfileSummaryInfo and BranchFrequencyInfo analyses. In this patch I am only doing the splitting for the New PM, but I can do the same for the legacy PM as a follow-on if this looks good. Not NFC since for partial unrolling we lose the updates done to the loop traversal (adding new sibling and child loops) - according to Chandler this is not very useful for partial unrolling, but it also means that the debugging flag -unroll-revisit-child-loops no longer works for partial unrolling. Reviewers: chandlerc Subscribers: mehdi_amini, mzolotukhin, eraman, llvm-commits Differential Revision: https://reviews.llvm.org/D36157 llvm-svn: 309886
*	Don't pass the code model to MC	Rafael Espindola	2017-08-02	17	-59/+41
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	I was surprised to see the code model being passed to MC. After all, it assembles code, it doesn't create it. The one place it is used is in the expansion of .cfi directives to handle .eh_frame being more than 2 GB away from the code. As far as I can tell, the GNU assembler doesn't even have an option to enable this. Compiling a C file with gcc -mcmodel=large produces a regular-looking .eh_frame. This is probably because in practice linkers parse and recreate .eh_frames. In LLVM this is used because the JIT can place the code and .eh_frame very far apart. Ideally we would fix the JIT and delete this option. This is hard. Apart from the confusion, another problem with the current interface is that most callers pass CodeModel::Default, which is bad since MC has no way to map it to the target default if it actually needed to. This patch therefore replaces the argument with a boolean with a default value. The vast majority of users don't ever need to look at it. In fact, only CodeGen and llvm-mc use it, and llvm-mc just to enable more testing. llvm-svn: 309884
*	[InstCombine] Remove explicit code for folding (xor(zext(cmp)), 1) and ↵	Craig Topper	2017-08-02	1	-15/+0
\| \| \| \| \| \| \| \| \| \|	(xor(sext(cmp)), -1) to ext(!cmp). As far as I can tell this should be handled by foldCastedBitwiseLogic which is called later in visitXor. Differential Revision: https://reviews.llvm.org/D36214 llvm-svn: 309882
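The two folds named in this subject line can be checked exhaustively, since a compare result is a single bit. The sketch below models `zext` and `sext` of an i1 with Python integers; the helper names are illustrative and are not LLVM APIs.

```python
def zext(bit: bool) -> int:
    # Zero-extending an i1 gives 0 or 1.
    return 1 if bit else 0


def sext(bit: bool) -> int:
    # Sign-extending an i1 gives 0 or all ones (-1).
    return -1 if bit else 0


def folds_hold() -> bool:
    for cmp in (False, True):
        # xor(zext(cmp), 1) == zext(!cmp)
        if zext(cmp) ^ 1 != zext(not cmp):
            return False
        # xor(sext(cmp), -1) == sext(!cmp)
        if sext(cmp) ^ -1 != sext(not cmp):
            return False
    return True
```

`folds_hold()` covers both possible values of the compare, so it checks the identities completely at this level of modeling.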
*	[InstCombine] Support sext in foldLogicCastConstant	Craig Topper	2017-08-02	4	-16/+26
\| \| \| \| \| \| \| \|	This adds support for sext in foldLogicCastConstant. This is a prerequisite for D36214. Differential Revision: https://reviews.llvm.org/D36234 llvm-svn: 309880
*	DebugInfo: Test & handle (differently) non-zero DW_AT_ranges_base	David Blaikie	2017-08-02	6	-17/+28
\| \| \| \| \| \| \| \| \|	Followup to r309570, fixing it slightly differently (ranges_base and addr_base should never be read from a DWO file - so there shouldn't be any issue with 'overriding' the values - conditionalize the code and assert that the values aren't being overridden). llvm-svn: 309879
*	[Power9] Exploit vector absolute difference instructions on Power 9	Stefan Pintilie	2017-08-02	3	-1/+410
\| \| \| \| \| \| \| \| \|	Power 9 has instructions to do absolute difference (VABSDUB, VABSDUH, VABSDUW) for byte, halfword and word. We should take advantage of these. Differential Revision: https://reviews.llvm.org/D34684 llvm-svn: 309876
*	[NewGVN] Now that load coercion is enabled, we pass this test.	Davide Italiano	2017-08-02	1	-1/+0
\| \| \| \|	llvm-svn: 309872
*	[AArch64] Add Exynos M2 feature test (NFC)	Evandro Menezes	2017-08-02	1	-0/+1
\| \| \| \| \| \|	Test fusion of AES operations. llvm-svn: 309855
*	[Dominators] Teach LoopDeletion to use the new incremental API	Jakub Kuderski	2017-08-02	4	-25/+136
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This patch makes LoopDeletion use the incremental DominatorTree API. We modify LoopDeletion to perform the deletion in 5 steps: 1. Create a new dummy edge from the preheader to the exit, by adding a conditional branch. 2. Inform the DomTree about the new edge. 3. Remove the conditional branch and replace it with an unconditional edge to the exit. This removes the edge to the loop header, making it unreachable. 4. Inform the DomTree about the deleted edge. 5. Remove the unreachable block from the function. Creating the dummy conditional branch is necessary to perform incremental DomTree update. We should consider using the batch updater when it's ready. Reviewers: dberlin, davide, grosser, sanjoy Reviewed By: dberlin, grosser Subscribers: mzolotukhin, llvm-commits Differential Revision: https://reviews.llvm.org/D35391 llvm-svn: 309850
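The five steps above can be sketched on a toy CFG. This is a simplified model and not the LoopDeletion code: the CFG is a dict of successor sets, the block names are made up, and the `updates` list only records what a dominator tree would be told, without computing dominance.

```python
from collections import deque


def reachable(cfg, entry):
    # Breadth-first search over successor edges.
    seen, work = {entry}, deque([entry])
    while work:
        for succ in cfg[work.popleft()]:
            if succ not in seen:
                seen.add(succ)
                work.append(succ)
    return seen


def delete_loop(cfg, entry, preheader, header, exit_block):
    updates = []
    # 1. Dummy conditional branch: preheader now also reaches the exit.
    cfg[preheader].add(exit_block)
    # 2. Tell the (modeled) dominator tree about the new edge.
    updates.append(("insert", preheader, exit_block))
    # 3. Make the branch unconditional: drop the edge to the loop header.
    cfg[preheader].discard(header)
    # 4. Tell the (modeled) dominator tree about the deleted edge.
    updates.append(("delete", preheader, header))
    # 5. Remove blocks that are no longer reachable from the entry.
    live = reachable(cfg, entry)
    for block in list(cfg):
        if block not in live:
            del cfg[block]
    return updates


cfg = {
    "entry": {"preheader"},
    "preheader": {"header"},
    "header": {"header", "exit"},  # single-block loop
    "exit": set(),
}
updates = delete_loop(cfg, "entry", "preheader", "header", "exit")
```

In this model the edge is inserted before the old one is deleted, so the exit block stays reachable at every step, which matches the commit's reason for creating the dummy branch first.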
*	[StackColoring] Update AliasAnalysis information in stack coloring pass (part 2)	Hiroshi Inoue	2017-08-02	2	-11/+10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch is an update to the first patch (https://reviews.llvm.org/rL309651) based on the post-commit comments. The stack coloring pass needs to maintain AliasAnalysis information when merging stack slots of different types. Actually, there is a FIXME comment in StackColoring.cpp: // FIXME: In order to enable the use of TBAA when using AA in CodeGen, // we'll also need to update the TBAA nodes in MMOs with values // derived from the merged allocas. But TBAA has already been enabled in CodeGen without fixing this pass. The incorrect TBAA metadata results in recent failures in the bootstrap test on ppc64le (PR33928) by allowing unsafe instruction scheduling. Although we observed the problem on ppc64le, this is a platform-neutral issue. This patch makes the stack coloring pass maintain AliasAnalysis information when merging multiple stack slots. This patch fixes PR33928. llvm-svn: 309849
*	Revert "Xray docs with description of Flight Data Recorder binary format."	Keith Wyss	2017-08-02	3	-403/+5
\| \| \| \| \| \| \| \|	This reverts commit 3462b8ad41a840fd54dbbd0d3f2a514c5ad6f656. The docs-llvm-html target failed. llvm-svn: 309842
*	[AsmParser][GAS-compatibility] Ignore an empty 'p2align' directive	Coby Tayree	2017-08-02	2	-2/+14
\| \| \| \| \| \| \| \| \|	GAS ignores an empty 'p2align' directive; this patch aligns LLVM with that behavior and adds an appropriate warning. Differential Revision: https://reviews.llvm.org/D36060 llvm-svn: 309841
*	[InstCombine] Add missing test case for (xor (sext (cmp)), -1) -> (sext (!cmp)).	Craig Topper	2017-08-02	1	-0/+14
\| \| \| \|	llvm-svn: 309839
*	Xray docs with description of Flight Data Recorder binary format.	Keith Wyss	2017-08-02	3	-5/+403
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Adding a new restructuredText file to document the trace format produced with an FDR mode handler and read by llvm-xray toolset. Reviewers: dberris, pelikan Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D36041 llvm-svn: 309836
*	Assert that the offset of a DBG_VALUE is always 0. (NFC)	Adrian Prantl	2017-08-02	2	-4/+8
\| \| \| \|	llvm-svn: 309834
*	Revert "[lit] Avoid copying llvm/utils/lit/tests/Inputs with lit site configs"	Reid Kleckner	2017-08-02	5	-31/+29
\| \| \| \| \| \| \|	This reverts r309602, check-lit still leaves Output directories in the source directory. llvm-svn: 309833
*	AMDGPU: Restore using MRI to find highest used regs	Matt Arsenault	2017-08-02	1	-5/+23
\| \| \| \| \| \| \| \| \| \|	If there are no calls, this is a faster path than searching the entire program for calls. This was supposed to be left in r309781. Fixes unused variable warning. llvm-svn: 309832
*	Remove the unused Offset field from MachineLocation (NFC)	Adrian Prantl	2017-08-02	3	-37/+8
\| \| \| \| \| \|	rdar://problem/33580047 llvm-svn: 309831
*	[DAG] Improve candidate pruning in store merge failure case. NFCI	Nirav Dave	2017-08-02	1	-20/+61
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	During store merge we construct a sorted list of consecutive store candidates and consider subsequences for merging into a single store. For each subsequence we check whether the stored value type is legal, whether the merged store would be valid and fast, and whether the constructed value to be stored is valid. The only properties that affect this check between subsequences are the size of the subsequence, the alignment of the first store, the alignment of the stored load value (when merging stores-of-loads), and whether the merged value is a constant zero. If we do not find a viable mergeable subsequence of length N starting from the first store, we know that a subsequence of length N starting at a later store will also fail unless the new store's alignment or the new load's alignment (if we're merging stores-of-loads) differs, or we've dropped stores of nonzero value and could construct a merged store of zero (for merging constants). As a result, if we fail to find a valid subsequence starting from the first store, we can safely skip considering subsequences that start with subsequent stores unless one of the above properties is true. This significantly (2x) improves compile time in some pathological cases. Reviewers: RKSimon, efriedma, zvi, spatel, waltl Subscribers: grandinj, llvm-commits Differential Revision: https://reviews.llvm.org/D35901 llvm-svn: 309830
*	[AArch64] Improve the test of conditional branch fusion	Evandro Menezes	2017-08-02	1	-5/+25
\| \| \| \| \| \|	Separate the checking of the fused pairings with B.cc and CBcc. llvm-svn: 309825
*	Remove unused includes of MachineLocation.h (NFC)	Adrian Prantl	2017-08-02	3	-2/+1
\| \| \| \|	llvm-svn: 309824
*	Remove unreachable code. (NFC)	Adrian Prantl	2017-08-02	4	-26/+5
\| \| \| \| \| \| \| \|	MachineLocation::getOffset() always returns 0. rdar://problem/33580047 llvm-svn: 309823
*	[AArch64] Simplify AES*Tied pseudo expansion (NFC).	Florian Hahn	2017-08-02	1	-10/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Suggested by @t.p.northover in https://bugs.llvm.org/show_bug.cgi?id=34015. Reviewers: javed.absar, t.p.northover, rengolin Reviewed By: t.p.northover Subscribers: aemerson, kristof.beyls, llvm-commits, t.p.northover Differential Revision: https://reviews.llvm.org/D36223 llvm-svn: 309821
*	[InlineCost] Remove redundant call. NFC.	Chad Rosier	2017-08-02	1	-2/+3
\| \| \| \|	llvm-svn: 309819
*	Assert that the offset in MachineLocation::set() is always 0. (NFC)	Adrian Prantl	2017-08-02	1	-0/+1
\| \| \| \|	llvm-svn: 309818
*	[InlineCost] Simplify more 'and' and 'or' operations.	Chad Rosier	2017-08-02	2	-0/+124
\| \| \| \| \| \|	Differential Revision: https://reviews.llvm.org/D35856 llvm-svn: 309817
*	[SLPVectorizer] Generalize interface of functions, NFC.	Alexey Bataev	2017-08-02	2	-13/+17
\| \| \| \|	llvm-svn: 309816
*	[SLPVectorizer] Test update, NFC.	Alexey Bataev	2017-08-02	1	-30/+151
\| \| \| \|	llvm-svn: 309814
*	[SLP] Fix for PR31880: shuffle and vectorize repeated scalar ops on ↵	Alexey Bataev	2017-08-02	2	-65/+167
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	extracted elements Summary: Currently most of the time vectors of extractelement instructions are treated as scalars that must be gathered into vectors. But in some cases, like when we have extractelement instructions from a single vector with different constant indices or from 2 vectors of the same size, we can treat these operations as a shuffle of a single vector or a blend of 2 vectors.
```
define <2 x i8> @g(<2 x i8> %x, <2 x i8> %y) {
  %x0 = extractelement <2 x i8> %x, i32 0
  %y1 = extractelement <2 x i8> %y, i32 1
  %x0x0 = mul i8 %x0, %x0
  %y1y1 = mul i8 %y1, %y1
  %ins1 = insertelement <2 x i8> undef, i8 %x0x0, i32 0
  %ins2 = insertelement <2 x i8> %ins1, i8 %y1y1, i32 1
  ret <2 x i8> %ins2
}
```
can be converted to something like
```
define <2 x i8> @g(<2 x i8> %x, <2 x i8> %y) {
  %1 = shufflevector <2 x i8> %x, <2 x i8> %y, <2 x i32> <i32 0, i32 3>
  %2 = mul <2 x i8> %1, %1
  ret <2 x i8> %2
}
```
Currently this type of conversion is considered a high-cost transformation. Reviewers: mzolotukhin, delena, mkuper, hfinkel, RKSimon Subscribers: ashahid, RKSimon, spatel, llvm-commits Differential Revision: https://reviews.llvm.org/D30200 llvm-svn: 309812
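To see why the shuffle mask `<i32 0, i32 3>` selects the same elements as the two extractelement instructions, here is a small Python model of both forms of `@g`. It assumes plain lists for vectors and models i8 multiplication as wrapping modulo 256; the function names are illustrative.

```python
def shufflevector(a, b, mask):
    # Mask indices address the concatenation of both input vectors.
    joined = list(a) + list(b)
    return [joined[i] for i in mask]


def mul_i8(a, b):
    # i8 multiplication wraps modulo 2**8.
    return (a * b) & 0xFF


def g_scalar(x, y):
    # Extract, multiply as scalars, then rebuild the vector.
    x0, y1 = x[0], y[1]
    return [mul_i8(x0, x0), mul_i8(y1, y1)]


def g_vector(x, y):
    # Index 0 is x[0]; index 3 is y[1] because x occupies indices 0 and 1.
    v = shufflevector(x, y, [0, 3])
    return [mul_i8(e, e) for e in v]
```

Both functions agree on every pair of two-element inputs, because the mask simply renames `x[0]` and `y[1]` as lanes 0 and 1 of one vector.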
*	[MIR] Print target-specific constant pools	Diana Picus	2017-08-02	5	-7/+50
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This should enable us to test the generation of target-specific constant pools, e.g. for ARM:
    constants:
      - id: 0
        value: 'g(GOT_PREL)-(LPC0+8-.)'
        alignment: 4
        isTargetSpecific: true
I intend to use this to test PIC support in GlobalISel for ARM. This is difficult to test outside of that context, since the existing MIR tests usually rely on parser support as well, and that seems a bit trickier to add. We could try to add a unit test, but the setup for that seems rather convoluted and overkill. We do test however that the parser reports a nice error when encountering a target-specific constant pool. Differential Revision: https://reviews.llvm.org/D36092 llvm-svn: 309806
*	[globalisel][tablegen] Do not merge memoperands from instructions that ↵	Daniel Sanders	2017-08-02	4	-19/+60
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	weren't in the match. Summary: Fix a bug discovered in an out-of-tree target where memoperands from pseudo-instructions that weren't part of the match were being merged into the result instructions as part of GIR_MergeMemOperands. This bug was caused by a change to the handling of State.MIs between rules when the state machine tables were fused into a single table. Previously, each rule would reset State.MIs using State.MIs.resize(1), but this is no longer done; as a result, stale data is occasionally left in some elements of State.MIs. Most opcodes aren't affected by this, but GIR_MergeMemOperands merges all memoperands from the instructions recorded in State.MIs into the result instruction. Suppose, for example, we processed but rejected the following pattern: (signextend (load x)). At this point, State.MIs contains the signextend and the load. Now suppose we process and accept this pattern: (add x, y). At this point, State.MIs contains the add as well as the (now irrelevant) load. When GIR_MergeMemOperands is processed, the memoperands from that irrelevant load will be merged into the result instruction even though it was not part of the match. Bringing back the State.MIs.resize(1) would fix the problem but it would limit our ability to optimize the table in the future. Instead, this patch fixes the problem by explicitly stating which instructions should be merged into the result. There's no direct test case in this commit because a test case would be very brittle. However, at the time of writing this should fix the failures in http://green.lab.llvm.org/green/job/Compiler_Verifiers_GlobalISEL/ as well as a failure in test/CodeGen/ARM/GlobalISel/arm-isel.ll when expensive checks are enabled. Reviewers: ab, t.p.northover, qcolombet, rovka, aditya_nandakumar Subscribers: fhahn, kristof.beyls, igorb, llvm-commits Differential Revision: https://reviews.llvm.org/D36094 llvm-svn: 309804
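The stale-state scenario described in this message can be modeled in a few lines. This is a toy illustration and not the GlobalISel matcher: instructions are `(name, memoperands)` tuples, and `merge_all` and `merge_listed` are hypothetical stand-ins for the old and new behavior of GIR_MergeMemOperands.

```python
def merge_all(state_mis):
    # Old behavior (modeled): take memoperands from every recorded instruction.
    return [op for _, memops in state_mis for op in memops]


def merge_listed(state_mis, insn_ids):
    # New behavior (modeled): take memoperands only from the listed instructions.
    return [op for i in insn_ids for op in state_mis[i][1]]


def simulate():
    # Rule 1, (signextend (load x)), records both instructions, then is rejected.
    state_mis = [("signextend", []), ("load", ["load-x-memoperand"])]
    # Rule 2, (add x, y), overwrites only the root; there is no resize(1),
    # so the load from rule 1 is still sitting at index 1.
    state_mis[0] = ("add", [])
    return state_mis
```

Merging everything picks up the stale load's memoperand, while naming instruction 0 explicitly yields nothing, which is the behavior the patch describes.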
*	[InstCombine] Add test cases for 'or' and 'xor' to match the vector 'and' of ↵	Craig Topper	2017-08-02	1	-0/+37
\| \| \| \| \| \| \| \| \| \|	'sext' of 'cmp' test. When the 'and' test was originally added it was intended to make sure we didn't change it to a sext of and of cmp. But since then the test was changed to expect it to be turned into 'select cmp1, sext cmp2, 0'. Then another optimization was added to turn the select into 'sext (and cmp1, cmp2)' which is exactly the transformation that was being blocked when the test case started. Looks like 'or' gets optimized in a similar way, but not 'xor'. llvm-svn: 309793
*	[NewGVN] Fold single-use variables. NFCI.	Davide Italiano	2017-08-02	1	-5/+3
\| \| \| \|	llvm-svn: 309790
*	[NewGVN] Remove a (now stale) comment. NFCI.	Davide Italiano	2017-08-02	1	-1/+0
\| \| \| \|	llvm-svn: 309789
*	Fix the bug that parseAAPipeline is not invoked in runNewPMPasses in release ↵	Dehao Chen	2017-08-02	4	-8/+23
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	compiler. Summary: The logic is guarded by "assert". Reviewers: davidxl, davide, chandlerc Reviewed By: davide, chandlerc Subscribers: sanjoy, llvm-commits, mehdi_amini Differential Revision: https://reviews.llvm.org/D36195 llvm-svn: 309787
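A Python analogue of the pitfall, assuming the call sat inside the assertion itself: Python drops `assert` statements under `python -O` much as a C++ release build drops `assert` when NDEBUG is defined. The function and key names below are hypothetical stand-ins, not the LLVM code.

```python
def parse_aa_pipeline(state, text):
    # Hypothetical stand-in with a side effect: records the parsed pipeline.
    state["aa_pipeline"] = text.split(",")
    return True


def run_guarded_by_assert(state):
    # The whole call lives inside the assert, so it never runs when
    # assertions are disabled (python -O here, NDEBUG in C++).
    assert parse_aa_pipeline(state, "default")
    return state


def run_fixed(state):
    # Always perform the call; only the check on its result is an assertion.
    ok = parse_aa_pipeline(state, "default")
    assert ok, "unable to parse AA pipeline"
    return state
```

`run_fixed` leaves the pipeline recorded regardless of whether assertions are enabled; `run_guarded_by_assert` does so only in builds that keep them.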
*	[SimplifyCFG] Fix typo in comment. NFC	Craig Topper	2017-08-02	1	-1/+1
\| \| \| \|	llvm-svn: 309785
*	[PM] Fix a bug where through CGSCC iteration we can get	Chandler Carruth	2017-08-02	3	-5/+181
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	infinite-inlining across multiple runs of the inliner by keeping a tiny history of internal-to-SCC inlining decisions. This is still a bit gross, but I don't yet have any fundamentally better ideas and numerous people are blocked on this to use new PM and ThinLTO together. The core of the idea is to detect when we are about to do an inline that has a chance of re-splitting an SCC which we have split before with a similar inlining step. That is a critical component in the inlining forming a cycle and so far detects all of the various cyclic patterns I can come up with as well as the original real-world test case (which comes from a ThinLTO build of libunwind). I've added some tests that I think really demonstrate what is going on here. They are essentially state machines that march the inliner through various steps of a cycle and check that we stop when the cycle is closed and that we actually did do inlining to form that cycle. A lot of thanks go to Eric Christopher and Sanjoy Das for the help understanding this issue and improving the test cases. The biggest "yuck" here is the layering issue -- the CGSCC pass manager is providing somewhat magical state to the inliner for it to use to make itself converge. This isn't great, but I don't honestly have a lot of better ideas yet and at least seems nicely isolated. I have tested this patch, and it doesn't block any inlining on the entire LLVM test suite and SPEC, so it seems sufficiently narrowly targeted to the issue at hand. We have come up with hypothetical issues that this patch doesn't cover, but so far none of them are practical and we don't have a viable solution yet that covers the hypothetical stuff, so proceeding here in the interim. Definitely an area that we will be back and revisiting in the future. Differential Revision: https://reviews.llvm.org/D36188 llvm-svn: 309784
*	AMDGPU: Fix clobbering CSR VGPRs when spilling SGPR to it	Matt Arsenault	2017-08-02	7	-24/+105
\| \| \| \|	llvm-svn: 309783
*	AMDGPU: Fix emitting encoded calls	Matt Arsenault	2017-08-02	3	-4/+30
\| \| \| \| \| \| \| \| \| \|	This was failing on out of bounds access to the extra operands on the s_swappc_b64 beyond those in the instruction definition. This was working, but somehow regressed within the past few weeks, although I don't see any obvious commit. llvm-svn: 309782
*	AMDGPU: Analyze callee resource usage in AsmPrinter	Matt Arsenault	2017-08-02	7	-22/+425
\| \| \| \|	llvm-svn: 309781
*	Update the new PM pipeline to make ICP aware if it is SamplePGO build.	Dehao Chen	2017-08-02	4	-25/+81
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: In ThinLTO backend compile, PGOOptions are not set so that the ICP in ThinLTO backend does not know if it is a SamplePGO build, in which profile count needs to be annotated directly on call instructions. This patch cleaned up the PGOOptions handling logic and passes down PGOOptions to ThinLTO backend. Reviewers: chandlerc, tejohnson, davidxl Reviewed By: chandlerc Subscribers: sanjoy, llvm-commits, mehdi_amini Differential Revision: https://reviews.llvm.org/D36052 llvm-svn: 309780
*	[AMDGPU] Fix asan error after last commit	Stanislav Mekhanoshin	2017-08-02	1	-1/+1
\| \| \| \| \| \| \| \| \| \|	Previous change "Turn s_and_saveexec_b64 into s_and_b64 if result is unused" introduced an asan use-after-poison error. The instruction was analyzed after the eraseFromParent() calls. Move the analysis above the erase. llvm-svn: 309779
*	[DAG] Refactor store merge subexpressions. NFC.	Nirav Dave	2017-08-02	1	-23/+28
\| \| \| \| \| \|	Distribute various expressions across ifs. llvm-svn: 309777
*	AMDGPU: Don't place arguments in emergency stack slot	Matt Arsenault	2017-08-02	5	-129/+139
\| \| \| \| \| \| \| \|	When finding the fixed offsets for function arguments, this needs to skip over the 4 bytes reserved for the emergency stack slot. llvm-svn: 309776
*	DAG: Undo and->or combine with FrameIndexes	Matt Arsenault	2017-08-02	4	-85/+64
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	This pattern shows up when lowering byval copies on AMDGPU. The byval object access is split into 4-byte chunks, adding a constant offset to the FixedStack base. When some of the offsets turn into ors, this prevents combining the constant offsets. This makes it not apparent that the object is there when matching addressing modes, so it ends up using a scratch wave offset relative access and the lengthy frame index expansion for that. llvm-svn: 309775
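As background for the offsets that "turn into ors": adding a constant to a base address gives the same result as or-ing it in whenever the two values share no set bits, which is what makes the rewrite legal for aligned frame objects. A small sketch with made-up addresses:

```python
def or_equals_add(base: int, offset: int) -> bool:
    # True exactly when no bit is set in both values, so no carry can occur.
    return (base | offset) == (base + offset)


aligned_base = 16    # hypothetical 16-byte-aligned object address
unaligned_base = 20  # bit 2 already set, so adding 4 must carry
```

With `aligned_base`, offsets 4, 8 and 12 can each be written as an or; once an offset is hidden inside an or, it is no longer visibly a base-plus-constant, which is the obstacle the message describes.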
*	X86: Do not use llc -march in tests.	Matthias Braun	2017-08-02	729	-964/+960
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	`llc -march` is problematic because it only switches the target architecture, but leaves the operating system unchanged. This occasionally leads to nondeterministic tests because the OS from LLVM_DEFAULT_TARGET_TRIPLE is used. However, we can simply always use `llc -mtriple` instead. This changes all the tests to do this to avoid people using -march when they copy and paste parts of tests. See also the discussion in https://reviews.llvm.org/D35287 llvm-svn: 309774
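A simplified model of the difference, treating a triple as a dash-separated string. This is plain string manipulation for illustration, not LLVM's triple handling, and the triples and function names are assumptions.

```python
def with_march(default_triple: str, arch: str) -> str:
    # Modeled -march: replace only the architecture component,
    # keeping whatever vendor and OS the default triple carries.
    parts = default_triple.split("-")
    parts[0] = arch
    return "-".join(parts)


def with_mtriple(default_triple: str, triple: str) -> str:
    # Modeled -mtriple: the requested triple replaces the default entirely.
    return triple


linux_default = "x86_64-unknown-linux-gnu"  # hypothetical build defaults
darwin_default = "x86_64-apple-darwin"
```

Under this model the same `-march` value yields different effective triples on the two hosts, while an explicit `-mtriple` yields the same one everywhere.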