bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message (Collapse)	Author	Age	Files	Lines
*	[codegen] Add generic functions to skip debug values.	Florian Hahn	2016-12-16	5	-75/+31
	Summary: This commit moves skipDebugInstructionsForward and skipDebugInstructionsBackward from lib/CodeGen/IfConversion.cpp to include/llvm/CodeGen/MachineBasicBlock.h and updates some codegen files to use them. This refactoring was suggested in https://reviews.llvm.org/D27688 and I thought it's best to do the refactoring in a separate review, but I could also put both changes in a single review if that's preferred. Also, the names for the functions aren't the snappiest and I would be happy to rename them if anybody has suggestions. Reviewers: eli.friedman, iteratee, aprantl, MatzeB Subscribers: MatzeB, llvm-commits Differential Revision: https://reviews.llvm.org/D27782 llvm-svn: 289933
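A minimal sketch of what such generic skip helpers could look like, written against a hypothetical stand-in type rather than the real MachineInstr. The `Instr` struct and its fields are assumptions for illustration; only the two function names come from the commit message, and the bodies are a plausible shape, not a copy of the committed code.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for MachineInstr; only the predicate matters here.
struct Instr {
  int Opcode;
  bool IsDebug;
  bool isDebugValue() const { return IsDebug; }
};

// Advance It until it reaches End or points at a non-debug instruction.
template <typename IterT>
IterT skipDebugInstructionsForward(IterT It, IterT End) {
  while (It != End && It->isDebugValue())
    ++It;
  return It;
}

// Step It backwards until it reaches Begin or points at a non-debug
// instruction. Begin itself is returned unchecked, so callers still
// need to test it.
template <typename IterT>
IterT skipDebugInstructionsBackward(IterT It, IterT Begin) {
  while (It != Begin && It->isDebugValue())
    --It;
  return It;
}
```

Because the helpers are templated on the iterator type, the same code works for forward, const, and bundle iterators, which is presumably why a header is a convenient home for them.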
*	[ARM] Expose methods to get the CCAssignFn. NFCI	Diana Picus	2016-12-16	2	-17/+21
\| \| \| \| \| \| \| \| \| \| \|	Add two public methods to ARMTargetLowering: CCAssignFnForCall and CCAssignFnForReturn, which are just calling the already existing private method CCAssignFnForNode. These will come in handy for GlobalISel on ARM. We also replace all calls to CCAssignFnForNode in ARMISelLowering.cpp, because the new methods are friendlier to the reader. llvm-svn: 289932
*	Revert r289638: [PowerPC] Fix logic dealing with nop after calls (and ↵	Chandler Carruth	2016-12-16	1	-25/+40
\| \| \| \| \| \| \| \| \| \| \| \| \|	tail-call eligibility) This patch appears to result in trampolines in vtables being miscompiled when they in turn tail call a method. I've posted some preliminary details about the failure on the thread for this commit and talked to Hal. He was comfortable going ahead and reverting until we sort out what is wrong. llvm-svn: 289928
*	Extract a TBAAVerifier out of the verifier (NFC)	Mehdi Amini	2016-12-16	1	-268/+268
	This is intended to be used (in a later patch) by the BitcodeReader to detect invalid TBAA metadata and drop it when loading bitcode, so that we don't break clients that have legacy bitcode with possibly invalid TBAA. Differential Revision: https://reviews.llvm.org/D27838 llvm-svn: 289927
*	attempt to fix windows build	Nico Weber	2016-12-16	1	-1/+2
	llvm-svn: 289926
*	Update .debug_line section version information to match DWARF version.	Ekaterina Romanova	2016-12-16	1	-2/+7
\| \| \| \| \| \| \| \| \| \| \| \|	One more attempt to re-commit the patch r285355, which I had to revert in r285362, because some tests were failing (the reason is because the size of the line_table varied depending on the full file name). In the past the compiler always emitted .debug_line version 2, though some opcodes from DWARF 3 (e.g. DW_LNS_set_prologue_end, DW_LNS_set_epilogue_begin or DW_LNS_set_isa) and from DWARF 4 could be emitted by the compiler. This patch changes version information of .debug_line to exactly match the DWARF version. For .debug_line version 4, a new field maximum_operations_per_instruction is emitted. Differential Revision: https://reviews.llvm.org/D16697 llvm-svn: 289925
*	Revert 279703, it caused PR31404.	Nico Weber	2016-12-16	3	-92/+19
	llvm-svn: 289923
*	[IR] Remove the DIExpression field from DIGlobalVariable.	Adrian Prantl	2016-12-16	18	-151/+312
	This patch implements PR31013 by introducing a DIGlobalVariableExpression that holds a pair of DIGlobalVariable and DIExpression. Currently, DIGlobalVariable holds a DIExpression. This is not the best way to model this: (1) The DIGlobalVariable should describe the source level variable, not how to get to its location. (2) It makes it unsafe/hard to update the expressions when we call replaceExpression on the DIGlobalVariable. (3) It makes it impossible to represent a global variable that is in more than one location (e.g., a variable with multiple DW_OP_LLVM_fragment-s). We also moved away from attaching the DIExpression to DILocalVariable for the same reasons. This reapplies r289902 with additional testcase upgrades. <rdar://problem/29250149> https://llvm.org/bugs/show_bug.cgi?id=31013 Differential Revision: https://reviews.llvm.org/D26769 llvm-svn: 289920
*	[ThinLTO] Thin link efficiency: More efficient export list computation	Teresa Johnson	2016-12-16	1	-32/+21
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Instead of checking whether a global referenced by a function being imported is defined in the same module, speculatively always add the referenced globals to the module's export list. After all imports are computed, for each module prune any not in its defined set from its export list. For a huge C++ app with aggressive importing thresholds, even with D27687 we spent a lot of time invoking modulePath() from exportGlobalInModule (modulePath() was still the 2nd hottest routine in profile). The reason is that with comdat/linkonce the summary lists for each GUID can be long. For the app in question, for example, we were invoking exportGlobalInModule almost 2 million times, and we traversed an average of 63 entries in the summary list each time. This patch reduced the thin link time for the app by about 10% (on top of D27687) when using aggressive importing thresholds, and about 3.5% on average with default importing thresholds. Reviewers: mehdi_amini Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D27755 llvm-svn: 289918
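The "add speculatively, prune afterwards" idea described above can be sketched roughly as follows. The container types, the `pruneExportLists` name, and the use of module path strings as keys are all assumptions made for illustration; the real code operates on ThinLTO summary structures.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <set>
#include <string>

using GUID = std::uint64_t;
// Hypothetical shapes: module path -> set of GUIDs.
using ExportLists = std::map<std::string, std::set<GUID>>;
using DefinedSets = std::map<std::string, std::set<GUID>>;

// After all imports are computed, drop every speculatively exported GUID
// that the module does not actually define. This is one pass per module
// instead of a summary-list walk per referenced global.
void pruneExportLists(ExportLists &Exports, const DefinedSets &Defined) {
  for (auto &Entry : Exports) {
    auto DefIt = Defined.find(Entry.first);
    std::set<GUID> &List = Entry.second;
    for (auto It = List.begin(); It != List.end();) {
      bool DefinedHere =
          DefIt != Defined.end() && DefIt->second.count(*It) != 0;
      if (DefinedHere)
        ++It;
      else
        It = List.erase(It); // erase returns the next valid iterator
    }
  }
}
```

The trade-off is some temporary over-approximation of each export list in exchange for avoiding a definition lookup at every reference.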
*	Add extra headers that got deleted by my revert in r289916 but for which	Chandler Carruth	2016-12-16	1	-1/+2
\| \| \| \| \| \|	new usage had already grown in the file. llvm-svn: 289917
*	Revert patch series introducing the DAG combine to match a load-by-bytes	Chandler Carruth	2016-12-16	1	-283/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	idiom. r289538: Match load by bytes idiom and fold it into a single load r289540: Fix a buildbot failure introduced by r289538 r289545: Use more detailed assertion messages in the code ... r289646: Add a couple of assertions to the load combine code ... This DAG combine has a bad crash in it that is quite hard to trigger sadly -- it relies on sneaking code with UB through the SDAG build and into this particular combine. I've responded to the original commit with a test case that reproduces it. However, the code also has other problems that will require substantial changes to address and so I'm going ahead and reverting it for now. This should unblock us and perhaps others that are hitting the crash in the wild and will let a fresh patch with updated approach come in cleanly afterward. Sorry for any trouble or disruption! llvm-svn: 289916
*	[SimplifyLibCalls] Use a lambda. NFCI.	Davide Italiano	2016-12-16	1	-6/+6
	llvm-svn: 289911
*	[Hexagon] Fix some Clang-tidy modernize and Include What You Use warnings; ↵	Eugene Zelenko	2016-12-16	6	-151/+235
\| \| \| \| \| \|	other minor fixes (NFC). llvm-svn: 289907
*	Revert "[IR] Remove the DIExpression field from DIGlobalVariable."	Adrian Prantl	2016-12-16	18	-313/+151
	This reverts commit 289902 while investigating bot breakage. llvm-svn: 289906
*	Add missing library dep.	Peter Collingbourne	2016-12-16	1	-1/+1
	llvm-svn: 289903
*	[IR] Remove the DIExpression field from DIGlobalVariable.	Adrian Prantl	2016-12-16	18	-151/+313
	This patch implements PR31013 by introducing a DIGlobalVariableExpression that holds a pair of DIGlobalVariable and DIExpression. Currently, DIGlobalVariable holds a DIExpression. This is not the best way to model this: (1) The DIGlobalVariable should describe the source level variable, not how to get to its location. (2) It makes it unsafe/hard to update the expressions when we call replaceExpression on the DIGlobalVariable. (3) It makes it impossible to represent a global variable that is in more than one location (e.g., a variable with multiple DW_OP_LLVM_fragment-s). We also moved away from attaching the DIExpression to DILocalVariable for the same reasons. <rdar://problem/29250149> https://llvm.org/bugs/show_bug.cgi?id=31013 Differential Revision: https://reviews.llvm.org/D26769 llvm-svn: 289902
*	IPO: Introduce ThinLTOBitcodeWriter pass.	Peter Collingbourne	2016-12-16	2	-0/+345
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	This pass prepares a module containing type metadata for ThinLTO by splitting it into regular and thin LTO parts if possible, and writing both parts to a multi-module bitcode file. Modules that do not contain type metadata are written unmodified as a single module. All globals with type metadata are added to the regular LTO module, and the rest are added to the thin LTO module. Differential Revision: https://reviews.llvm.org/D27324 llvm-svn: 289899
*	[AArch64] Add FeatureSlowMisaligned128Store to Exynos M1 and M2	Evandro Menezes	2016-12-16	1	-0/+2
\| \| \| \| \| \| \|	This feature now gates such stores after r289845. Thus the Exynos processors now need this feature. llvm-svn: 289898
*	[ThinLTO] Thin link efficiency improvement: don't re-export globals (NFC)	Teresa Johnson	2016-12-15	1	-9/+13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: We were reinvoking exportGlobalInModule numerous times redundantly. No need to re-export globals referenced by a global that was already imported from its module. This resulted in a large speedup in the thin link for a big application, particularly when importing aggressiveness was cranked up. Reviewers: mehdi_amini Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D27687 llvm-svn: 289896
*	[SimplifyLibCalls] Lower fls() to llvm.ctlz().	Davide Italiano	2016-12-15	2	-3/+19
	Differential Revision: https://reviews.llvm.org/D14590 llvm-svn: 289894
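The commit title names the lowering but the message does not spell out the formula. A plausible identity, stated here as an assumption rather than a quote from the patch, is fls(x) = bitwidth - ctlz(x), with ctlz defined to return the bit width for a zero input. The sketch below checks that identity for 32-bit values using a hand-written `ctlz32` as a stand-in for the intrinsic.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in for llvm.ctlz.i32 with a defined result for zero:
// counts leading zero bits and returns 32 when X == 0.
unsigned ctlz32(std::uint32_t X) {
  unsigned N = 0;
  for (std::uint32_t Mask = 0x80000000u; Mask != 0 && (X & Mask) == 0;
       Mask >>= 1)
    ++N;
  return N;
}

// fls(x): 1-based index of the most significant set bit, 0 if x == 0.
int flsViaCtlz(std::uint32_t X) {
  return 32 - static_cast<int>(ctlz32(X));
}
```

Since the zero case is well defined, no separate branch for x == 0 is needed, and a constant argument folds through the intrinsic.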
*	DebugInfo: Address non-deterministic output (iterating a SmallPtrSet) in 289697	David Blaikie	2016-12-15	3	-9/+5
\| \| \| \| \| \| \| \|	Post-commit review feedback from Adrian Prantl. Hopefully this fixes that up :) llvm-svn: 289892
*	[IRTranslator] Merge the entry and ABI lowering blocks.	Quentin Colombet	2016-12-15	1	-0/+26
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The IRTranslator uses an additional block before the LLVM-IR entry block to perform all the ABI lowering and the constant hoisting. Thus, this block is the actual entry block and it falls through the LLVM-IR entry block. However, with such representation, we end up with two basic blocks that are not maximal. Therefore, this patch adds a bit of canonicalization by merging both the LLVM-IR entry block and the ABI lowering/constants hoisting into one block, making the resulting block more likely to be maximal (indeed the LLVM-IR entry block might not have been maximal). llvm-svn: 289891
*	DebugInfo: Emit ranges for functions with DISubprograms but lacking ↵	David Blaikie	2016-12-15	3	-29/+20
\| \| \| \| \| \| \| \| \|	locations on any instructions This seems more consistent, and helps tidy up/simplify some other code in this change. llvm-svn: 289889
*	[SimplifyLibCalls] Remove redundant folding logic for ffs().	Davide Italiano	2016-12-15	1	-13/+3
\| \| \| \| \| \| \| \|	Lowering to llvm.cttz() will result in constant folding anyway if the argument to ffs is a constant. Pointed out by Eli for fls() in D14590. llvm-svn: 289888
*	Don't combine splats with other shuffles.	Eli Friedman	2016-12-15	1	-0/+5
\| \| \| \| \| \| \| \| \| \| \|	We sometimes end up creating shuffles which are worse than the obvious translation of the IR. Fixes https://llvm.org/bugs/show_bug.cgi?id=31301 . Differential Revision: https://reviews.llvm.org/D27793 llvm-svn: 289882
*	Fix R_AARCH64_MOVW_UABS_G3 relocation	Yichao Yu	2016-12-15	1	-23/+49
	Summary: The relocation is missing a mask, so an address that has non-zero bits in 47:43 may overwrite the register number. (Frequently shows up as target register changed to `xzr`....) Reviewers: t.p.northover, lhames Subscribers: davide, aemerson, rengolin, llvm-commits Differential Revision: https://reviews.llvm.org/D27609 llvm-svn: 289880
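To illustrate why the mask matters: in the MOVZ/MOVK encoding the 16-bit immediate occupies bits 20:5 and the destination register occupies bits 4:0, and the G3 relocation supplies bits 63:48 of the address. Shifting the whole address right by 43 without masking lets bits 47:43 land in the register field. The helper name `applyMovwUabsG3` is hypothetical and the code is a sketch of the idea, not the patched runtime linker code.

```cpp
#include <cassert>
#include <cstdint>

// Patch the imm16 field (bits 20:5) of a MOVZ/MOVK instruction with
// bits 63:48 of Value, leaving the register field (bits 4:0) untouched.
std::uint32_t applyMovwUabsG3(std::uint32_t Insn, std::uint64_t Value) {
  std::uint32_t Imm16 = static_cast<std::uint32_t>((Value >> 48) & 0xffff);
  Insn &= ~(0xffffu << 5); // clear any previous immediate
  return Insn | (Imm16 << 5);
}

// The unmasked form, shown only to demonstrate the failure mode:
// bits 47:43 of Value leak into bits 4:0 of the instruction.
std::uint32_t applyMovwUabsG3Unmasked(std::uint32_t Insn,
                                      std::uint64_t Value) {
  return Insn | static_cast<std::uint32_t>(Value >> 43);
}
```

With an address whose bits 47:43 are all ones, the unmasked version turns the register field into 31, which matches the `xzr` symptom mentioned in the summary.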
*	AMDGPU: Select branch on undef to uniform scc branch	Matt Arsenault	2016-12-15	3	-0/+21
	llvm-svn: 289877
*	[ThinLTO] Revert part of r289843 that belonged to another patch.	Teresa Johnson	2016-12-15	1	-13/+9
\| \| \| \| \| \| \| \|	The code change for D27687 accidentally got committed along with the main change in r289843. Revert it temporarily, so that I can recommit it along with its test as intended. llvm-svn: 289875
*	Don't combine a shuffle of two BUILD_VECTORs with duplicate elements.	Eli Friedman	2016-12-15	1	-10/+23
\| \| \| \| \| \| \| \| \| \| \| \| \|	Targets can't handle this case well in general; we often transform a shuffle of two cheap BUILD_VECTORs to element-by-element insertion, which is very inefficient. Fixes https://llvm.org/bugs/show_bug.cgi?id=31364 . Partially fixes https://llvm.org/bugs/show_bug.cgi?id=31301. Differential Revision: https://reviews.llvm.org/D27787 llvm-svn: 289874
*	[Verifier] Allow TBAA metadata on atomicrmw and atomiccmpxchg	Sanjoy Das	2016-12-15	1	-1/+2
\| \| \| \| \| \| \| \| \|	This used to be allowed before r289402 by default (before r289402 you could have TBAA metadata on any instruction), and while I'm not sure that it helps, it does sound reasonable enough to not fail the verifier and we have out-of-tree users who use this. llvm-svn: 289872
*	[ThinLTO] Remove stale comment (NFC)	Teresa Johnson	2016-12-15	1	-2/+1
\| \| \| \| \| \|	This should have been removed with r288446. llvm-svn: 289871
*	AMDGPU: Fix asserting on returned tail calls	Matt Arsenault	2016-12-15	1	-2/+4
	llvm-svn: 289868
*	[ThinLTO] Thin link efficiency: skip candidate added later with higher ↵	Teresa Johnson	2016-12-15	1	-4/+13
	threshold (NFC) Summary: Thin link efficiency improvement. After adding an importing candidate to the worklist we might have later added it again with a higher threshold. Skip it when popped from the worklist if we recorded a higher threshold than the current worklist entry; it will get processed again at the higher threshold when that entry is popped. This required adding the summary's GUID to the worklist, so that it can be used to query the recorded highest threshold for it when we pop from the worklist. Reviewers: mehdi_amini Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D27696 llvm-svn: 289867
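A rough sketch of the stale-entry check described above. `WorkItem`, `drainWorklist`, and the `Highest` map are hypothetical names chosen for illustration; the real importer tracks thresholds in its import list and does more work per entry than recording it.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <vector>

using GUID = std::uint64_t;

// Each worklist entry carries the GUID so the recorded highest
// threshold can be looked up when the entry is popped.
struct WorkItem {
  GUID Id;
  unsigned Threshold;
};

// Pop entries LIFO and skip any entry whose threshold is lower than the
// highest one recorded for the same GUID; the higher-threshold entry is
// also on the worklist and will be handled when it is popped.
std::vector<WorkItem> drainWorklist(std::vector<WorkItem> Worklist,
                                    const std::map<GUID, unsigned> &Highest) {
  std::vector<WorkItem> Processed;
  while (!Worklist.empty()) {
    WorkItem Item = Worklist.back();
    Worklist.pop_back();
    auto It = Highest.find(Item.Id);
    if (It != Highest.end() && It->second > Item.Threshold)
      continue; // stale entry, superseded by a higher threshold
    Processed.push_back(Item);
  }
  return Processed;
}
```

Skipping at pop time avoids having to search the worklist and remove the older entry when the higher threshold is recorded.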
*	AMDGPU: Assembler support for vintrp instructions	Matt Arsenault	2016-12-15	3	-6/+108
	llvm-svn: 289866
*	[LV] Enable vectorization of loops with conditional stores by default	Matthew Simpson	2016-12-15	1	-1/+1
\| \| \| \| \| \| \| \| \|	This patch sets the default value of the "-enable-cond-stores-vec" command line option to "true". Differential Revision: https://reviews.llvm.org/D27814 llvm-svn: 289863
*	[SimplifyCFG] Merge debug locations when hoisting an instruction from a ↵	Andrea Di Biagio	2016-12-15	1	-4/+4
	then/else branch. NFC. Now that a new API to merge debug locations has been committed at r289661 (see review D26256 for more details), we can use it to "improve" the code added by revision r280995. Instead of nulling the debugloc of a commoned instruction, we use the 'merged' debug location. At the moment, this is not a functional change, since function `DILocation::getMergedLocation()` is just a stub and always returns a null location. Differential Revision: https://reviews.llvm.org/D27804 llvm-svn: 289862
*	[LiveRangeEdit] Change eliminateDeadDef assert to if condition.	Geoff Berry	2016-12-15	1	-4/+5
\| \| \| \| \| \| \| \| \| \|	The assert could potentially fire (though no cases have been encountered), so just check that the instruction we're handling specially for rematerialization only has one def to begin with. Reviewed by Wei Mi over email. llvm-svn: 289861
*	LibDriver: Allow resource files to be archive members.	Peter Collingbourne	2016-12-15	1	-2/+4
\| \| \| \| \| \| \| \|	It seems pointless to add a resource to an archive because it won't have any symbols to link against (and link.exe doesn't have an equivalent of --whole-archive), but lib.exe allows it for some reason. llvm-svn: 289859
*	[InstCombine] add folds for icmp (smin X, Y), X	Sanjay Patel	2016-12-15	1	-0/+37
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Min/max canonicalization (r287585) exposes the fact that we're missing combines for min/max patterns. This patch won't solve the example that was attached to that thread, so something else still needs fixing. The line between InstCombine and InstSimplify gets blurry here because sometimes the icmp instruction that we want to fold to already exists, but sometimes it's the swapped form of what we want. Corresponding changes for smax/umin/umax to follow. Differential Revision: https://reviews.llvm.org/D27531 llvm-svn: 289855
*	[GlobalISel] Drop workaround for Legalizer member/class sharing a name. NFC.	Ahmed Bougacha	2016-12-15	3	-3/+3
\| \| \| \| \| \| \| \|	MachineLegalizer used to be the name of both the class and the member, causing GCC errors. r276522 fixed that by renaming the member to just 'Legalizer'. The 'class' workaround isn't necessary anymore; drop it. llvm-svn: 289848
*	[x86] use a single shufps for 256-bit vectors when it can save instructions	Sanjay Patel	2016-12-15	1	-1/+13
\| \| \| \| \| \| \| \| \| \| \|	This is the 256-bit counterpart to the 128-bit transform checked in here: https://reviews.llvm.org/rL289837 This patch is based on the draft by @sroland (Roland Scheidegger) that is attached to PR27885: https://llvm.org/bugs/show_bug.cgi?id=27885 llvm-svn: 289846
*	[AArch64] Guard Misaligned 128-bit store penalty by subtarget feature	Matthew Simpson	2016-12-15	1	-1/+2
\| \| \| \| \| \| \| \| \|	This patch checks that the SlowMisaligned128Store subtarget feature is set when penalizing such stores in getMemoryOpCost. Differential Revision: https://reviews.llvm.org/D27677 llvm-svn: 289845
*	[AArch64][GlobalISel] Remove redundant RBI comments. NFC.	Ahmed Bougacha	2016-12-15	1	-20/+1
	It's brittle, and Doxygen already picks the overridden method's comment anyway. llvm-svn: 289844
*	[ThinLTO] Ensure callees get hot threshold when first seen on cold path	Teresa Johnson	2016-12-15	1	-24/+28
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is split out from D27696, since it turned out to be a bug fix and not part of the NFC efficiency change. Keep the same adjusted (possibly decayed) threshold in both the worklist and the ImportList. Otherwise if we encountered it first along a cold path, the callee would be added to the worklist with a lower decayed threshold than when it is later encountered along a hot path. But the logic uses the threshold recorded in the ImportList entry to check if we should re-add it, and without this patch the threshold recorded there is the same along both paths so we don't re-add it. Using the same possibly decayed threshold in the ImportList ensures we re-add it later with the higher non-decayed hot path threshold. llvm-svn: 289843
*	[x86] use a single shufps when it can save instructions	Sanjay Patel	2016-12-15	1	-14/+19
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is a tiny patch with a big pile of test changes. This partially fixes PR27885: https://llvm.org/bugs/show_bug.cgi?id=27885 My motivating case looks like this: - vpshufd {{.#+}} xmm1 = xmm1[0,1,0,2] - vpshufd {{.#+}} xmm0 = xmm0[0,2,2,3] - vpblendw {{.#+}} xmm0 = xmm0[0,1,2,3],xmm1[4,5,6,7] + vshufps {{.#+}} xmm0 = xmm0[0,2],xmm1[0,2] And this happens several times in the diffs. For chips with domain-crossing penalties, the instruction count and size reduction should usually overcome any potential domain-crossing penalty due to using an FP op in a sequence of int ops. For chips such as recent Intel big cores and Atom, there is no domain-crossing penalty for shufps, so using shufps is a pure win. So the test case diffs all appear to be improvements except one test in vector-shuffle-combining.ll where we miss an opportunity to use a shift to generate zero elements and one test in combine-sra.ll where multiple uses prevent the expected shuffle combining. Differential Revision: https://reviews.llvm.org/D27692 llvm-svn: 289837
*	[X86][SSE] Fix domains for scalar store instructions	Simon Pilgrim	2016-12-15	1	-0/+4
	As discussed on D27692 llvm-svn: 289834
*	Revert "[SimplifyCFG] In sinkLastInstruction correctly set debugloc of ↵	Robert Lougher	2016-12-15	1	-8/+1
\| \| \| \| \| \| \| \|	common inst" Reverting as it is causing buildbot failures (address sanitizer). llvm-svn: 289833
*	[lanai] Simplify small section check in LowerGlobalAddress and treat ldata ↵	Jacques Pienaar	2016-12-15	2	-3/+14
\| \| \| \| \| \| \| \|	sections specially. Move the check for the code model into isGlobalInSmallSectionImpl and return false (not in small section) for variables placed in sections prefixed with .ldata (workaround for a tool limitation). llvm-svn: 289832
*	[X86][AVX512] Moved instruction domain lookups to the right table. NFCI.	Simon Pilgrim	2016-12-15	1	-4/+4
\| \| \| \| \| \|	Avoid duplicating instructions in the int32/int64 domains. llvm-svn: 289830
*	[SimplifyCFG] In sinkLastInstruction correctly set debugloc of "common" inst	Robert Lougher	2016-12-15	1	-1/+9
\| \| \| \| \| \| \| \| \| \| \|	Simplify CFG will try to sink the last instruction in a series of basic blocks, creating a "common" instruction in the successor block (sinkLastInstruction). When it does this, the debug location of the single instruction should be the merged debug locations of the commoned instructions. Differential Revision: https://reviews.llvm.org/D27590 llvm-svn: 289828