bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
*	[ThinLTO] Add caching to the new LTO API	Mehdi Amini	2016-08-23	4	-32/+248
	Add the ability to plug a cache into the LTO API. I tried to write it such that a linker implementation can control the cache backend. This is intrusive and I'm not totally happy with it, but I can't figure out a better design right now. Differential Revision: https://reviews.llvm.org/D23599 llvm-svn: 279576
*	[InstCombine] move foldICmpShrConstConst() contents to foldICmpShrConst(); NFCI	Sanjay Patel	2016-08-23	2	-77/+65
	There will only be 3 lines of code in foldICmpShrConst() when the cleanup is done, so it doesn't make much sense to have a separate function for a single fold. llvm-svn: 279575
*	[stackmaps] Extract out magic constants [NFCI]	Philip Reames	2016-08-23	2	-6/+17
	This is a first step towards clarifying the exact MI semantics of stackmap's "live values". llvm-svn: 279574
*	MachineFunction: Introduce NoPHIs property	Matthias Braun	2016-08-23	9	-4/+50
	I want to compute the SSA property of .mir files automatically in upcoming patches. The problem with this is that some inputs will be reported as static single assignment with some passes claiming not to support SSA form. In reality, though, those passes do not support PHI instructions, so track the presence of PHI instructions separately from the SSA property. Differential Revision: https://reviews.llvm.org/D22719 llvm-svn: 279573
*	[InstCombine] remove icmp shr folds that are already handled by InstSimplify	Sanjay Patel	2016-08-23	1	-17/+3
	AFAICT, these already worked in all cases for scalar types, and I enhanced the code to work for vector types in: https://reviews.llvm.org/rL279543 llvm-svn: 279568
*	GlobalISel: make truncate/extend casts uniform	Tim Northover	2016-08-23	3	-21/+46
	They really should have both types represented, but early variants were created before MachineInstrs could have multiple types so they're rather ambiguous. llvm-svn: 279567
*	GlobalISel: legalize integer comparisons on AArch64.	Tim Northover	2016-08-23	4	-3/+58
	Next step is doing both legalizations at the same time! Marvel at GlobalISel's cunning. llvm-svn: 279566
*	GlobalISel: legalize conditional branches on AArch64.	Tim Northover	2016-08-23	3	-0/+18
	llvm-svn: 279565
*	CodeGen: Remove MachineFunctionAnalysis => Enable (Machine)ModulePasses	Matthias Braun	2016-08-23	17	-103/+84
	Re-apply this commit with the deletion of a MachineFunction delegated to a separate pass to avoid use after free when doing this directly in AsmPrinter. This patch removes the MachineFunctionAnalysis. Instead we keep a map from IR Function to MachineFunction in the MachineModuleInfo. This allows the insertion of ModulePasses into the codegen pipeline without breaking it because the MachineFunctionAnalysis gets dropped before a module pass. Peak memory should stay unchanged without a ModulePass in the codegen pipeline: Previously the MachineFunction was freed at the end of a codegen function pipeline because the MachineFunctionAnalysis was dropped; With this patch the MachineFunction is freed after the AsmPrinter has finished. Differential Revision: http://reviews.llvm.org/D23736 llvm-svn: 279564
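The ownership change described in this message can be pictured with a small sketch: per-function codegen state lives in a module-level map keyed by the IR function, and is released by an explicit step instead of when an analysis is dropped. All names here (`IRFunction`, `MachineFn`, `ModuleInfo`, `getOrCreate`, `erase`) are hypothetical stand-ins chosen for illustration, not the LLVM classes or their interfaces.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <unordered_map>

// Hypothetical stand-in for an IR-level function.
struct IRFunction {
  int Id;
};

// Hypothetical stand-in for the per-function codegen state.
struct MachineFn {
  int NumInstrs = 0;
};

// Hypothetical module-level owner: one MachineFn per IRFunction.
class ModuleInfo {
  std::unordered_map<const IRFunction *, std::unique_ptr<MachineFn>> Map;

public:
  // Returns the state for F, creating it on first use.
  MachineFn &getOrCreate(const IRFunction &F) {
    std::unique_ptr<MachineFn> &Slot = Map[&F];
    if (!Slot)
      Slot.reset(new MachineFn());
    return *Slot;
  }

  // Explicit release point, e.g. a dedicated step that runs after printing.
  void erase(const IRFunction &F) { Map.erase(&F); }

  std::size_t size() const { return Map.size(); }
};
```

Because the map, not a per-function analysis, owns the state, a module-wide step can run between function-level steps without the state disappearing; memory is reclaimed only when `erase` is called.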
*	[ValueTracking] Use a function_ref to avoid multiple instantiations	David Majnemer	2016-08-23	1	-5/+5
	No functional change intended, this should just be a code size improvement. llvm-svn: 279563
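As a rough illustration of why a non-owning callable reference can reduce code size, the sketch below type-erases the callable behind a function pointer so that the consumer is an ordinary function compiled once, rather than a template instantiated per lambda type. `FunctionRef` and `sumIf` are simplified, hypothetical names written for this example; they are not the ValueTracking code and only approximate how `llvm::function_ref` behaves.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>
#include <utility>

template <typename Fn> class FunctionRef;

// Simplified, hypothetical non-owning reference to a callable object.
// The referenced callable must outlive the FunctionRef.
template <typename Ret, typename... Params>
class FunctionRef<Ret(Params...)> {
  Ret (*Callback)(std::intptr_t, Params...) = nullptr;
  std::intptr_t Callable = 0;

  // One small trampoline per callable type; the consumer itself is shared.
  template <typename Callee>
  static Ret callImpl(std::intptr_t C, Params... Ps) {
    return (*reinterpret_cast<Callee *>(C))(std::forward<Params>(Ps)...);
  }

public:
  // The enable_if keeps this constructor from hijacking copy construction.
  template <typename Callee,
            typename = typename std::enable_if<!std::is_same<
                typename std::remove_cv<
                    typename std::remove_reference<Callee>::type>::type,
                FunctionRef>::value>::type>
  FunctionRef(Callee &&C)
      : Callback(callImpl<typename std::remove_reference<Callee>::type>),
        Callable(reinterpret_cast<std::intptr_t>(&C)) {}

  Ret operator()(Params... Ps) const {
    return Callback(Callable, std::forward<Params>(Ps)...);
  }
};

// A non-template consumer: its body is compiled once, whatever lambdas
// the callers pass in.
int sumIf(const int *Begin, const int *End, FunctionRef<bool(int)> Pred) {
  int Sum = 0;
  for (const int *I = Begin; I != End; ++I)
    if (Pred(*I))
      Sum += *I;
  return Sum;
}
```

A call such as `sumIf(V, V + 6, [](int X) { return X % 2 == 0; })` binds the temporary lambda for the duration of the call only, which is the usage pattern this kind of reference is meant for.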
*	[SLP] Avoid signed integer overflow	Matthew Simpson	2016-08-23	1	-9/+35
	The test case included with r279125 exposed an existing signed integer overflow. Since getTreeCost can return INT_MAX, we can't sum this cost together with other costs, such as getReductionCost. This patch removes the possibility of assigning a cost of INT_MAX. Since we were previously using INT_MAX as an indicator for "should not vectorize", we now explicitly check this condition with "isTreeTinyAndNotFullyVectorizable" before computing a cost. This patch adds a run-line to the test case used for r279125 that ensures we don't vectorize. Previously, this line would vectorize the test case by chance due to undefined behavior in the cost calculation. Differential Revision: https://reviews.llvm.org/D23723 llvm-svn: 279562
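The pattern described here, replacing an INT_MAX sentinel with an explicit check made before any arithmetic, can be sketched as follows. The `Tree` struct, the node-count threshold, and both function names are hypothetical and chosen only for illustration; they are not the SLP vectorizer's data structures or its real heuristic.

```cpp
#include <cassert>

// Hypothetical summary of a vectorization tree.
struct Tree {
  int NumNodes;
  bool FullyVectorizable;
  int Cost;
};

// Stand-in for the explicit "should not vectorize" check. The threshold
// of 3 nodes is an assumption made for this sketch.
bool isTinyAndNotFullyVectorizable(const Tree &T) {
  return T.NumNodes < 3 && !T.FullyVectorizable;
}

// Returns false without touching Total when the tree should be skipped.
// No INT_MAX sentinel is ever produced, so the addition below cannot
// overflow because of one.
bool totalCost(const Tree &T, int ReductionCost, int &Total) {
  if (isTinyAndNotFullyVectorizable(T))
    return false;
  Total = T.Cost + ReductionCost;
  return true;
}
```

With a sentinel, `INT_MAX + ReductionCost` is signed overflow for any positive reduction cost, which is undefined behavior in C++; checking the condition first keeps the "do not vectorize" decision out of the arithmetic entirely.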
*	Remove unused translation unit.	Zachary Turner	2016-08-23	2	-14/+0
	llvm-svn: 279561
*	GlobalISel: extend legalizer interface to handle multiple types.	Tim Northover	2016-08-23	3	-44/+66
	Instructions like G_ICMP have multiple types that may need to be legalized (the boolean output and nearly arbitrary inputs in this case). So the legalizer must be capable of deciding what to do for each of them separately. llvm-svn: 279554
*	GlobalISel: mark pointer casts legal on AArch64.	Tim Northover	2016-08-23	1	-0/+3
	llvm-svn: 279553
*	Stop always creating and running an LTO compilation if there is not a single LTO object	Mehdi Amini	2016-08-23	1	-21/+13
	Summary: I assume there was a use case, so maybe this strawman patch will help clarify whether it is legit. In any case the current situation is not legit: a ThinLTO compilation should not trigger an unexpected full LTO compilation. Right now, adding a --save-temps option triggers this and makes the number of outputs differ. Reviewers: tejohnson Subscribers: pcc, llvm-commits, mehdi_amini Differential Revision: https://reviews.llvm.org/D23600 llvm-svn: 279550
*	GlobalISel: legalize 1-bit load/store and mark 8/16 bit variants legal on AArch64.	Tim Northover	2016-08-23	2	-7/+29
	llvm-svn: 279548
*	[InstSimplify] allow icmp with constant folds for splat vectors, part 2	Sanjay Patel	2016-08-23	1	-83/+77
	Completes the m_APInt changes for simplifyICmpWithConstant(). Other commits in this series: https://reviews.llvm.org/rL279492 https://reviews.llvm.org/rL279530 https://reviews.llvm.org/rL279534 https://reviews.llvm.org/rL279538 llvm-svn: 279543
*	Possible fix of test failures on win bots	Xinliang David Li	2016-08-23	1	-3/+3
	llvm-svn: 279542
*	[InstSimplify] allow icmp with constant folds for splat vectors, part 1	Sanjay Patel	2016-08-23	1	-6/+10
	llvm-svn: 279538
*	[SelectionDAG] Use a union of bitfield structs for SDNode::SubclassData.	Justin Lebar	2016-08-23	1	-43/+18
	Summary: This greatly simplifies our handling of SDNode::SubclassData. NFC, hopefully. :) See D23035 for discussion of the API design of these bitfields. Reviewers: chandlerc Subscribers: llvm-commits, rnk Differential Revision: https://reviews.llvm.org/D23036 llvm-svn: 279537
*	[CodeGen] Convert a loop to a for-each loop. NFC	Justin Lebar	2016-08-23	1	-7/+5
	llvm-svn: 279536
*	Fix some Clang-tidy modernize-use-using and Include What You Use warnings; other minor fixes.	Eugene Zelenko	2016-08-23	13	-93/+156
	Differential revision: https://reviews.llvm.org/D23789 llvm-svn: 279535
*	[ThinLTO] Make sure the Context used for the ThinLTO backend has all the appropriate options	Mehdi Amini	2016-08-23	2	-1/+3
	An important performance setting on the LLVMContext for LTO is enableDebugTypeODRUniquing(), which adds automatic merging of debug information in the context based on type ids. Also, the lto::Config includes a diagnostic handler that needs to be set on the Context, as well as the setDiscardValueNames() setting. llvm-svn: 279532
*	Fix some more asserts after r279466.	Pete Cooper	2016-08-23	3	-3/+3
	That commit added a new version of Intrinsic::getName which should only be called when the intrinsic has no overloaded types. There are several debugging paths, such as SDNode::dump, which print the name of the intrinsic but don't have the overloaded types. These paths should be ok to just print the name instead of crashing. The fix here is ultimately to just add a 'None' second argument, as that calls the overload-capable getName, which is less efficient, but this is a debugging path anyway, and not perf critical. Thanks to Björn Pettersson for pointing out that there were more crashes. llvm-svn: 279528
*	[Hexagon] Packetize return value setup with the return instruction	Krzysztof Parzyszek	2016-08-23	1	-3/+4
	Commit r279241 unintentionally reverted that ability. llvm-svn: 279526
*	[Profile] refactor meta data copying/swapping code	Xinliang David Li	2016-08-23	3	-57/+52
	Differential Revision: http://reviews.llvm.org/D23619 llvm-svn: 279523
*	[lanai] Use const instead of constexpr	Jacques Pienaar	2016-08-23	1	-2/+2
	The windows build bot did not like constexpr. llvm-svn: 279517
*	Fix SystemZ hang caused by r279105	Elliot Colp	2016-08-23	2	-29/+55
	The change in r279105 causes an infinite loop in some cases, as it sets the upper bits of an AND mask constant, which DAGCombiner::SimplifyDemandedBits then unsets. This patch reverts that part of the behaviour, instead relying on .td peepholes to perform the transformation to NILL. I reapplied my original fix for the problem addressed by r279105 (unsetting the upper bits, which prevents a compiler abort for a different reason). Differential Revision: https://reviews.llvm.org/D23781 llvm-svn: 279515
*	[LTOCodeGenerator] Reduce code duplication. NFCI.	Davide Italiano	2016-08-23	1	-8/+8
	llvm-svn: 279514
*	LLVMLanaDesc: Update libdesp.	NAKAMURA Takumi	2016-08-23	1	-1/+1
	llvm-svn: 279510
*	Change the target's name, s/LanaiMCTargetDesc/LanaiDesc/g.	NAKAMURA Takumi	2016-08-23	5	-5/+5
	"AllTargetsDescs" in llvm-mc/CMakeLists.txt expects not ${target}MCTargetDesc, but ${target}Desc. llvm-svn: 279509
*	[ARM] Generate consistent frame records for Thumb2	Oliver Stannard	2016-08-23	4	-33/+39
	There is not an official documented ABI for frame pointers in Thumb2, but we should try to emit something which is useful.
	We use r7 as the frame pointer for Thumb code, which currently means that if a function needs to save a high register (r8-r11), it will get pushed to the stack between the frame pointer (r7) and link register (r14). This means that while a stack unwinder can follow the chain of frame pointers up the stack, it cannot know the offset to lr, so does not know which functions correspond to the stack frames.
	To fix this, we need to push the callee-saved registers in two batches, with the first push saving the low registers, fp and lr, and the second push saving the high registers. This is already implemented, but previously only used for iOS. This patch turns it on for all Thumb2 targets when frame pointers are required by the ABI, and the frame pointer is r7 (Windows uses r11, so this isn't a problem there). If frame pointer elimination is enabled we still emit a single push/pop even if we need a frame pointer for other reasons, to avoid increasing code size.
	We must also ensure that lr is pushed to the stack when using a frame pointer, so that we end up with a complete frame record. Situations that could cause this were rare, because we already push lr in most situations so that we can return using the pop instruction.
	Differential Revision: https://reviews.llvm.org/D23516 llvm-svn: 279506
*	GVNHoist: Use the pass version of MemorySSA and preserve it.	Daniel Berlin	2016-08-23	1	-9/+12
	Summary: GVNHoist: Use the pass version of MemorySSA and preserve it. Reviewers: sebpop, george.burgess.iv Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D23782 llvm-svn: 279504
*	Revert "(HEAD -> master, origin/master, origin/HEAD) CodeGen: Remove MachineFunctionAnalysis => Enable (Machine)ModulePasses"	Matthias Braun	2016-08-23	17	-68/+101
	Reverting while tracking down a use after free. This reverts commit r279502. llvm-svn: 279503
*	CodeGen: Remove MachineFunctionAnalysis => Enable (Machine)ModulePasses	Matthias Braun	2016-08-23	17	-101/+68
	This patch removes the MachineFunctionAnalysis. Instead we keep a map from IR Function to MachineFunction in the MachineModuleInfo. This allows the insertion of ModulePasses into the codegen pipeline without breaking it because the MachineFunctionAnalysis gets dropped before a module pass. Peak memory should stay unchanged without a ModulePass in the codegen pipeline: Previously the MachineFunction was freed at the end of a codegen function pipeline because the MachineFunctionAnalysis was dropped; With this patch the MachineFunction is freed after the AsmPrinter has finished. Differential Revision: http://reviews.llvm.org/D23736 llvm-svn: 279502
*	BranchRelaxation: Fix handling of blocks with multiple conditional branches	Matt Arsenault	2016-08-23	1	-6/+36
	Looping over all terminators exposed AArch64 tests hitting an assert from analyzeBranch failing. I believe these cases were miscompiled before. e.g.
	    fcmp s0, s1
	    b.ne LBB0_1
	    b.vc LBB0_2
	    b LBB0_2
	  LBB0_1: ; Large block
	  LBB0_2: ; ...
	Both of the individual conditional branches need to be expanded, since neither can reach the final block. Split the original block into ones which analyzeBranch will be able to understand. llvm-svn: 279499
*	[lanai] Exit early in Mem Alu combiner if sentinel reach.	Jacques Pienaar	2016-08-23	1	-0/+3
	LanaiMemAluCombiner could try to query the debug value of a list sentinel. Add check to exit early instead. llvm-svn: 279497
*	[MemorySSA] Remove unused field. NFC.	George Burgess IV	2016-08-22	1	-6/+1
	Given that we're not currently using blocker info, and whether or not we will end up using it is unclear, don't waste 8 (or 4) bytes of memory per path node. llvm-svn: 279493
*	[InstSimplify] add helper function for SimplifyICmpInst(); NFCI	Sanjay Patel	2016-08-22	1	-133/+143
	And add a FIXME because the helper excludes folds for vectors. It's not clear yet how many of these are actually testable (and therefore necessary?) because later analysis uses computeKnownBits and other methods to catch many of these cases. llvm-svn: 279492
*	Fix crash from assert in r279466.	Pete Cooper	2016-08-22	1	-1/+1
	The assert in r279466 checks that we call the correct version of Intrinsic::getName. The version which accepts only an ID should not be used for intrinsics with overloaded types. The global-isel code was calling the wrong version. The test CodeGen/AArch64/GlobalISel/arm64-irtranslator.ll will ensure that we call the correct version from now on. llvm-svn: 279487
*	[InstCombine] change param type from Instruction to BinaryOperator for icmp helpers; NFCI	Sanjay Patel	2016-08-22	2	-97/+109
	This saves some casting in the helper functions and eases some further refactoring. llvm-svn: 279478
*	[GraphTraits] Replace all NodeType usage with NodeRef	Tim Shen	2016-08-22	4	-28/+13
	This should finish the GraphTraits migration. Differential Revision: http://reviews.llvm.org/D23730 llvm-svn: 279475
*	ADT: Remove ilist_*sentinel_traits, NFC	Duncan P. N. Exon Smith	2016-08-22	2	-8/+0
	Remove all the dead code around ilist_*sentinel_traits. This is a follow-up to gutting them as part of r279314 (originally r278974), staged to prevent broken builds in sub-projects. Uses were removed from clang in r279457 and lld in r279458. llvm-svn: 279473
*	[InstCombine] use m_APInt to allow icmp (shr exact X, Y), 0 folds for splat constant vectors	Sanjay Patel	2016-08-22	1	-14/+13
	llvm-svn: 279472
*	Add ADT headers to the cmake headers directory for LLVMSupport. NFC.	Pete Cooper	2016-08-22	1	-0/+1
	Xcode and MSVC list the headers and source files for each library. LLVMSupport lists included the source files for ADT but not the headers. This adds the ADT headers so that they are browsable by the UI. llvm-svn: 279470
*	Add comments and an assert to follow-up on r279113. NFC.	Pete Cooper	2016-08-22	1	-0/+2
	Philip commented on r279113 to ask for better comments as to when to use the different versions of getName. It's also possible to assert in the simple case that we aren't an overloaded intrinsic as those have to use the more capable version of getName. Thanks for the comments Philip. llvm-svn: 279466
*	AMDGPU: Split SILowerControlFlow into two pieces	Matt Arsenault	2016-08-22	6	-353/+518
	Do most of the lowering in a pre-RA pass. Keep the skip jump insertion late, plus a few other things that require more work to move out. One concern I have is now there may be COPY instructions which do not have the necessary implicit exec uses if they will be lowered to v_mov_b32. This has a positive effect on SGPR usage in shader-db. llvm-svn: 279464
*	MSSA: Factor out phi node placement	Daniel Berlin	2016-08-22	1	-17/+22
	llvm-svn: 279462
*	MSSA: Only rename accesses whose defining access is nullptr	Daniel Berlin	2016-08-22	1	-14/+6
	llvm-svn: 279461
*	[SimplifyCFG] Rewrite SinkThenElseCodeToEnd	James Molloy	2016-08-22	1	-150/+236
	[Recommitting now an unrelated assertion in SROA is sorted out]
	The new version has several advantages:
	  1) IMSHO it's more readable and neater
	  2) It handles loads and stores properly
	  3) It can handle any number of incoming blocks rather than just two. I'll be taking advantage of this in a followup patch.
	With this change we can now finally sink load-modify-store idioms such as:
	    if (a)
	      return b += 3;
	    else
	      return b += 4;
	    =>
	    %z = load i32, i32* %y
	    %.sink = select i1 %a, i32 5, i32 7
	    %b = add i32 %z, %.sink
	    store i32 %b, i32* %y
	    ret i32 %b
	When this works for switches it'll be even more powerful.
	Round 4. This time we should handle all instructions correctly, and not replace any operands that need to be constant with variables. This was really hard to determine safely, so the helper function should be put into the Instruction API. I'll do that as a followup.
	llvm-svn: 279460
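The effect of the sinking described in this message can be mimicked at the source level. The sketch below is only an analogy written for illustration: it assumes `b` is a pointer to an int and uses the constants 3 and 4 from the C fragment (the IR fragment in the message shows 5 and 7). The function names `addBranchy` and `addSunk` are hypothetical.

```cpp
#include <cassert>

// Before sinking: each arm performs its own load, add, and store.
int addBranchy(bool A, int *B) {
  if (A)
    return *B += 3;
  else
    return *B += 4;
}

// After sinking: the arms differ only in a constant, so that constant is
// selected first and a single load, add, and store is shared by both paths.
int addSunk(bool A, int *B) {
  int Sink = A ? 3 : 4; // plays the role of the select
  int Z = *B;           // one load
  int Result = Z + Sink;
  *B = Result;          // one store
  return Result;
}
```

Both functions return the same value and leave the same value in memory for either value of `A`, which is the property the transform has to preserve.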