bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
...
*	[GlobalOpt] Allow constant globals to be SRA'd	James Molloy	2016-04-25	1	-0/+21
	The current logic assumes that any constant global will never be SRA'd. I presume this is because normally constant globals can be pushed into their uses and deleted. However, that sometimes can't happen (which is where you really want SRA, so the elements that can be eliminated, are!). There seems to be no reason why we can't SRA constants too, so let's do it.
	llvm-svn: 267393
*	[Coverage] Restore the correct count value after processing a nested region ↵	Igor Kudrin	2016-04-25	1	-2/+2
	in case of combined regions. If several regions cover the same area of code, we have to restore the combined value for that area when returning from a nested region. This patch achieves that by combining regions before calling buildSegments.
	Differential Revision: http://reviews.llvm.org/D18610
	llvm-svn: 267390
*	[SCEV] Improve the run-time checking of the NoWrap predicate	Silviu Baranga	2016-04-25	1	-12/+127
	Summary: This implements a new method of run-time checking the NoWrap SCEV predicates, which should be easier to optimize and nicer for targets that don't correctly handle multiplication/addition of large integer types (like i128).
	If the AddRec is {a,+,b} and the backedge taken count is c, the idea is to check that |b| * c doesn't have unsigned overflow, and depending on the sign of b, that:
	  a + |b| * c >= a (b >= 0) or
	  a - |b| * c <= a (b <= 0)
	where the comparisons above are signed or unsigned, depending on the flag that we're checking.
	The advantage of doing this is that we avoid extending to a larger type and we avoid the multiplication of large types (multiplying i128 can be expensive).
	Reviewers: sanjoy
	Subscribers: llvm-commits, mzolotukhin
	Differential Revision: http://reviews.llvm.org/D19266
	llvm-svn: 267389
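The check described in this message can be sketched in a few lines. This is only an illustration on fixed-width integers modelled in Python, not the code from the patch: the function name `no_wrap_check`, the `bits` parameter and the choice to read `b` as a signed value before taking `|b|` are all assumptions made for the example.

```python
def to_signed(x, bits):
    """Reinterpret the low `bits` bits of x as a two's-complement value."""
    x &= (1 << bits) - 1
    return x - (1 << bits) if x >> (bits - 1) else x


def no_wrap_check(a, b, c, bits=8, signed=False):
    """Return True if the AddRec {a,+,b} passes the no-wrap check for
    backedge-taken count c (hypothetical helper, illustration only)."""
    mask = (1 << bits) - 1
    a &= mask
    c &= mask
    step = to_signed(b, bits)        # assumption: the step is read as signed
    abs_step = abs(step) & mask
    product = abs_step * c
    if product > mask:               # |b| * c has unsigned overflow
        return False
    if step >= 0:
        end = (a + product) & mask   # a + |b| * c, truncated to `bits`
        if signed:
            return to_signed(end, bits) >= to_signed(a, bits)
        return end >= a
    end = (a - product) & mask       # a - |b| * c, truncated to `bits`
    if signed:
        return to_signed(end, bits) <= to_signed(a, bits)
    return end <= a
```

All arithmetic stays within `bits` bits apart from the overflow test on the product, which in real code would be an overflow-reporting multiply rather than a wider type.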
*	[PowerPC] [PR27387] Disallow r0 for ADD8TLS.	Marcin Koscielnicki	2016-04-25	1	-0/+43
	ADD8TLS, a variant of the add instruction used for initial-exec TLS, currently accepts r0 as a source register. While add itself supports r0 just fine, the linker can relax it to a local-exec sequence, converting it to addi - which doesn't support r0.
	Differential Revision: http://reviews.llvm.org/D19193
	llvm-svn: 267388
*	Fixing wrong mask size error. From __mmask8 to __mmask16.	Michael Zuckerman	2016-04-25	1	-5/+5
	Was reviewed over the shoulder by AsafBadouh. Connected to review http://reviews.llvm.org/D19195.
	llvm-svn: 267379
*	[X86] Add a complete set of tests for all operand sizes of cttz/ctlz with ↵	Craig Topper	2016-04-25	1	-6/+123
	and without zero undef being lowered to bsf/bsr.
	llvm-svn: 267373
*	Verifier: Verify that each inlinable callsite of a debug-info-bearing function	Adrian Prantl	2016-04-24	3	-2/+66
	in a debug-info-bearing function has a debug location attached to it. Failure to do so causes an "!dbg attachment points at wrong subprogram for function" assertion failure when the inliner sets up inline scope info.
	rdar://problem/25878916
	This reapplies r267320 without changes after fixing an issue in the OpenMP IR generator in clang.
	llvm-svn: 267370
*	Also check the IR.	Rafael Espindola	2016-04-24	1	-0/+4
	llvm-svn: 267367
*	Add a test for how we handle protected visibility.	Rafael Espindola	2016-04-24	2	-0/+22
	llvm-svn: 267366
*	[X86][AVX] Added PR24935 test case	Simon Pilgrim	2016-04-24	1	-0/+39
	llvm-svn: 267362
*	ARM: fix __chkstk Frame Setup on WoA	Saleem Abdulrasool	2016-04-24	4	-9/+9
	This corrects the MI annotations for the stack adjustment following the __chkstk invocation. We were marking the original SP usage as a Def rather than Kill. The (new) assigned value is the definition, the original reference is killed.
	Adjust the ISelLowering to mark Kills and FrameSetup as well.
	This partially resolves PR27480.
	llvm-svn: 267361
*	[InstCombine][SSE] Reduce DIVSS/DIVSD to FDIV if only first element is required	Simon Pilgrim	2016-04-24	2	-10/+4
	As discussed on D19318, if we only demand the first element of a DIVSS/DIVSD intrinsic, then reduce to a FDIV call. This matches the existing FADD/FSUB/FMUL patterns.
	llvm-svn: 267359
*	[InstCombine][SSE] Demanded vector elements for scalar intrinsics (Part 2 of 2)	Simon Pilgrim	2016-04-24	4	-182/+60
	Split from D17490. This patch improves support for determining the demanded vector elements through SSE scalar intrinsics:
	1 - demanded vector element support for unary and some extra binary scalar intrinsics (RCP/RSQRT/SQRT/FRCZ and ADD/CMP/DIV/ROUND).
	2 - addss/addsd get simplified to a fadd call if we aren't interested in the pass through elements
	3 - if we don't need the lowest element of a scalar operation then just use the first argument (the pass through elements) directly
	We can add support for propagating demanded elements through any equivalent packed SSE intrinsics in a future patch (these wouldn't use the pass through patterns).
	Differential Revision: http://reviews.llvm.org/D19318
	llvm-svn: 267357
*	[InstCombine][SSE] Demanded vector elements for scalar intrinsics (Part 1 of 2)	Simon Pilgrim	2016-04-24	3	-145/+74
	This patch improves support for determining the demanded vector elements through SSE scalar intrinsics:
	1 - recognise that we only need the lowest element of the second input for binary scalar operations (and all the elements of the first input)
	2 - recognise that the roundss/roundsd intrinsics use the lowest element of the second input and the remaining elements from the first input
	Differential Revision: http://reviews.llvm.org/D17490
	llvm-svn: 267356
*	[X86][SSE] Added SSSE3/AVX/AVX2 BITREVERSE tests	Simon Pilgrim	2016-04-24	1	-52/+14603
	Codegen is pretty bad at the moment but could use PSHUFB quite efficiently.
	llvm-svn: 267347
*	[X86][XOP] Fixed VPPERM permute op decoding (PR27472).	Simon Pilgrim	2016-04-24	1	-1/+1
	Fixed issue with VPPERM target shuffle mask decoding that was incorrectly masking off the 3-bit permute op with a 2-bit mask.
	llvm-svn: 267346
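A small sketch of the masking mistake this message describes. It assumes the VPPERM selector-byte layout as I understand it from the XOP encoding (low 5 bits select the source byte, top 3 bits select the permute op); the function name `decode_selector` and the `op_mask` parameter are hypothetical and exist only to contrast the two masks.

```python
def decode_selector(sel, op_mask=0x7):
    """Split a VPPERM selector byte into (source byte index, permute op).

    Assumed layout: bits [4:0] = source byte index, bits [7:5] = permute op.
    Passing op_mask=0x3 reproduces the too-narrow 2-bit mask.
    """
    src_index = sel & 0x1F
    op = (sel >> 5) & op_mask
    return src_index, op
```

With the 2-bit mask, any selector whose op has its top bit set (ops 4 to 7) is silently decoded as one of ops 0 to 3, so for example `decode_selector(0x83, op_mask=0x3)` reports op 0 where the 3-bit mask reports op 4.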
*	[X86][SSE] Improved support for decoding target shuffle masks through bitcasts	Simon Pilgrim	2016-04-24	2	-13/+3
	Reused the ability to split constants of a type wider than the shuffle mask to work with masks generated from scalar constants transferred to xmm.
	This fixes an issue preventing PSHUFB target shuffle masks decoding rematerialized scalar constants and also exposes the XOP VPPERM bug described in PR27472.
	llvm-svn: 267343
*	[SystemZ] [SSP] Add support for LOAD_STACK_GUARD.	Marcin Koscielnicki	2016-04-24	1	-0/+35
	This fixes PR22248 on s390x. The previous attempt at this was D19101, which was before LOAD_STACK_GUARD existed. Compared to the previous version, this always emits a rather ugly block of 4 instructions, involving a thread pointer load that can't be shared with other potential users. However, this is necessary for SSP - spilling the guard value (or thread pointer used to load it) is counter to the goal, since it could be overwritten along with the frame it protects.
	Differential Revision: http://reviews.llvm.org/D19363
	llvm-svn: 267340
*	[X86][SSE] Demonstrate issue with decoding shuffle masks that have been ↵	Simon Pilgrim	2016-04-24	2	-0/+37
	lowered as rematerialized constants on scalar unit.
	Found whilst investigating PR27472.
	llvm-svn: 267339
*	llvm/test/tools/gold/X86/thinlto.ll: Possible fix corresponding to r267318.	NAKAMURA Takumi	2016-04-24	1	-0/+1
	llvm-svn: 267334
*	BitcodeReader: Fix some holes in upgrade from r267296	Duncan P. N. Exon Smith	2016-04-24	2	-1/+9
	Add tests for some missing cases to bitcode upgrade in r267296.
	- DICompositeType with an 'elements:' field, which will cause it to be involved in a cycle after the upgrade.
	- A DIDerivedType that references a class in 'extraData:'.
	I updated test/Bitcode/dityperefs-3.8.ll with the missing cases and regenerated test/Bitcode/dityperefs-3.8.ll.bc.
	llvm-svn: 267332
*	Add "hasSection" flag in the Summary	Mehdi Amini	2016-04-24	1	-0/+11
	Reviewers: tejohnson
	Subscribers: llvm-commits
	Differential Revision: http://reviews.llvm.org/D19405
	From: Mehdi Amini <mehdi.amini@apple.com>
	llvm-svn: 267329
*	[MachineCombiner] Support for floating-point FMA on ARM64 (re-commit r267098)	Gerolf Hoflehner	2016-04-24	2	-0/+264
	The original patch caused crashes because it could dereference a null pointer for SelectionDAGTargetInfo for targets that do not define it.
	Evaluates fmul+fadd -> fmadd combines and similar code sequences in the machine combiner. It adds support for float and double similar to the existing integer implementation. The key features are:
	- DAGCombiner checks whether it should combine greedily or let the machine combiner do the evaluation. This is only supported on ARM64.
	- It gives preference to throughput over latency: the heuristic used is to combine always in loops. The target decides whether the machine combiner should optimize for throughput or latency.
	- Support for fmadd, f(n)msub, fmla, fmls patterns
	- On by default at O3 ffast-math
	llvm-svn: 267328
*	Revert "Verifier: Verify that each inlinable callsite of a ↵	Adrian Prantl	2016-04-24	3	-66/+2
	debug-info-bearing function"
	This reverts commit r267320 while investigating an OpenMP buildbot failure.
	llvm-svn: 267322
*	Verifier: Verify that each inlinable callsite of a debug-info-bearing function	Adrian Prantl	2016-04-24	3	-2/+66
	in a debug-info-bearing function has a debug location attached to it. Failure to do so causes an "!dbg attachment points at wrong subprogram for function" assertion failure when the inliner sets up inline scope info.
	rdar://problem/25878916
	llvm-svn: 267320
*	Reorganize GlobalValueSummary with a "Flags" bitfield.	Mehdi Amini	2016-04-24	3	-19/+19
	Right now it only contains the LinkageType, but will be extended with "hasSection", "isOptSize", "hasInlineAssembly", etc.
	Differential Revision: http://reviews.llvm.org/D19404
	From: Mehdi Amini <mehdi.amini@apple.com>
	llvm-svn: 267319
*	Add a version field in the bitcode for the summary	Mehdi Amini	2016-04-24	7	-0/+21
	Differential Revision: http://reviews.llvm.org/D19456
	From: Mehdi Amini <mehdi.amini@apple.com>
	llvm-svn: 267318
*	Add an internalization step to the ThinLTOCodeGenerator	Mehdi Amini	2016-04-24	1	-0/+19
	Keeping as much as possible internal/private is known to help the optimizer. Let's try to benefit from this in ThinLTO.
	Note: this is early work, but is enough to build clang (and all the LLVM tools). I still need to write some lit-tests...
	Differential Revision: http://reviews.llvm.org/D19103
	From: Mehdi Amini <mehdi.amini@apple.com>
	llvm-svn: 267317
*	[X86] Fix patterns that turn cmove/cmovne+ctlz/cttz into lzcnt/tzcnt ↵	Craig Topper	2016-04-24	1	-209/+0
	instructions. Only one of the conditions should be valid for each pattern, not both. Update tests accordingly.
	llvm-svn: 267311
*	[RuntimeDyldELF] Handle GOTPCRELX/REX_GOTPCRELX.	Davide Italiano	2016-04-24	1	-0/+8
	llvm-svn: 267309
*	[MC/ELF] Make the relaxation test more interesting.	Davide Italiano	2016-04-24	1	-0/+2
	Add a case where we can't relax.
	llvm-svn: 267308
*	[MC/ELF] Implement support for GOTPCRELX/REX_GOTPCRELX.	Davide Italiano	2016-04-24	1	-0/+18
	The option to control the emission of the new relocations is -relax-relocations (blatantly copied from GNU as). It can't be enabled by default because it breaks relatively recent versions of ld.bfd/ld.gold (late 2015).
	llvm-svn: 267307
*	Relax test using CHECK-DAG instead of CHECK-NEXT	Mehdi Amini	2016-04-24	1	-6/+6
	It seems we still have some ordering issue in the combined index emission, but I can't figure out why right now.
	From: Mehdi Amini <mehdi.amini@apple.com>
	llvm-svn: 267306
*	Fix test stability (was sensitive to the path)	Mehdi Amini	2016-04-24	1	-0/+7
	This is a fixup for r267304. The test was sensitive to the path in a subtle way: the index in memory is sorted by GUID, which are hashes that include the source filename for local globals. Teresa recently added a directive at the IR level, so we can specify it here to make the test independent of the path.
	From: Mehdi Amini <mehdi.amini@apple.com>
	llvm-svn: 267305
*	Store and emit original name in combined index	Mehdi Amini	2016-04-23	1	-0/+24
	Summary: As discussed in D18298, some local globals can't be renamed/promoted (because they have a section, or because they are referenced from inline assembly). To be able to detect naming collision, we need to keep around the "GUID" using their original name without taking the linkage into account.
	Reviewers: tejohnson
	Subscribers: joker.eph, llvm-commits
	Differential Revision: http://reviews.llvm.org/D19454
	From: Mehdi Amini <mehdi.amini@apple.com>
	llvm-svn: 267304
*	Always traverse GlobalVariable initializer when computing the export list	Mehdi Amini	2016-04-23	2	-0/+37
	Summary: We are always importing the initializer for a GlobalVariable. So if a GlobalVariable is in the export-list, we pull in any refs as well.
	Reviewers: tejohnson
	Subscribers: llvm-commits
	Differential Revision: http://reviews.llvm.org/D19102
	From: Mehdi Amini <mehdi.amini@apple.com>
	llvm-svn: 267303
*	DebugInfo: Remove MDString-based type references	Duncan P. N. Exon Smith	2016-04-23	61	-359/+393
	Eliminate DITypeIdentifierMap and make DITypeRef a thin wrapper around DIType*. It is no longer legal to refer to a DICompositeType by its 'identifier:', and DIBuilder no longer retains all types with an 'identifier:' automatically.
	Aside from the bitcode upgrade, this is mainly removing logic to resolve an MDString-based reference to an actual DIType. The commits leading up to this have made the implicit type map in DICompileUnit's 'retainedTypes:' field superfluous.
	This does not remove DITypeRef, DIScopeRef, DINodeRef, and DITypeRefArray, or stop using them in DI-related metadata. Although as of this commit they aren't serving a useful purpose, there are patches under review to reuse them for CodeView support.
	The tests in LLVM were updated with deref-typerefs.sh, which is attached to the thread "[RFC] Lazy-loading of debug info metadata": http://lists.llvm.org/pipermail/llvm-dev/2016-April/098318.html
	llvm-svn: 267296
*	Revert "[AArch64] Fix optimizeCondBranch logic."	Renato Golin	2016-04-23	1	-1/+1
	This reverts commit r267206, as it broke self-hosting on AArch64.
	llvm-svn: 267294
*	[X86][XOP] Added VPPERM -> BLEND-WITH-ZERO Test	Simon Pilgrim	2016-04-23	1	-0/+9
	Currently failing due to poor blend matching, found whilst investigating PR27472.
	llvm-svn: 267282
*	Use %T instead of cd'ing to Output directly.	Benjamin Kramer	2016-04-23	1	-1/+1
	%T expands to Output if not configured differently.
	llvm-svn: 267281
*	[CodeGen] When promoting CTTZ operations to larger type, don't insert a ↵	Craig Topper	2016-04-23	1	-58/+3
	select to detect if the input is zero to return the original size instead of the extended size. Instead just set the first bit in the zero extended part.
	llvm-svn: 267280
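The trick in this message can be checked with a short model. The sketch below assumes a cttz that returns the bit width for a zero input; the helper names `cttz` and `promoted_cttz` are made up for the illustration and are not LLVM APIs.

```python
def cttz(x, bits):
    """Count trailing zeros of a `bits`-wide value; returns `bits` for zero."""
    x &= (1 << bits) - 1
    if x == 0:
        return bits
    return (x & -x).bit_length() - 1


def promoted_cttz(x, narrow_bits, wide_bits):
    """cttz on a narrow value, computed in a wider type without a select.

    Setting the lowest bit of the zero-extended part caps the count at
    narrow_bits, which is exactly the result wanted for a zero input.
    """
    widened = (x & ((1 << narrow_bits) - 1)) | (1 << narrow_bits)
    return cttz(widened, wide_bits)
```

For an i8 value promoted to i32, `promoted_cttz(0, 8, 32)` gives 8 rather than 32, and non-zero inputs are unaffected because their lowest set bit sits below bit 8.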
*	[gold] Gate value name discarding under save-temps	Teresa Johnson	2016-04-23	3	-10/+6
	Summary: This removes a couple of flags added to control this behavior, and simply keeps all value names when save-temps is specified.
	Reviewers: rafael
	Subscribers: llvm-commits, pcc, davide
	Differential Revision: http://reviews.llvm.org/D19384
	llvm-svn: 267279
*	BitcodeWriter: Emit uniqued subgraphs after all distinct nodes	Duncan P. N. Exon Smith	2016-04-23	2	-0/+53
	Since forward references for uniqued node operands are expensive (and those for distinct node operands are cheap due to DistinctMDOperandPlaceholder), minimize forward references in uniqued node operands.
	Moreover, guarantee that when a cycle is broken by a distinct node, none of the uniqued nodes have any forward references. In ValueEnumerator::EnumerateMetadata, enumerate uniqued node subgraphs first, delaying distinct nodes until all uniqued nodes have been handled. This guarantees that uniqued nodes only have forward references when there is a uniquing cycle (since r267276 changed ValueEnumerator::organizeMetadata to partition distinct nodes in front of uniqued nodes as a post-pass).
	Note that a single uniqued subgraph can hit multiple distinct nodes at its leaves. Ideally these would themselves be emitted in post-order, but this commit doesn't attempt that; I think it requires an extra pass through the edges, which I'm not convinced is worth it (since DistinctMDOperandPlaceholder makes forward references quite cheap between distinct nodes).
	I've added two testcases:
	- test/Bitcode/mdnodes-distinct-in-post-order.ll is just like test/Bitcode/mdnodes-in-post-order.ll, except with distinct nodes instead of uniqued ones. This confirms that, in the absence of uniqued nodes, distinct nodes are still emitted in post-order.
	- test/Bitcode/mdnodes-distinct-nodes-break-cycles.ll is the minimal example where a naive post-order traversal would cause one uniqued node to forward-reference another. IOW, it's the motivating test.
	llvm-svn: 267278
*	BitcodeWriter: Emit distinct nodes before uniqued nodes	Duncan P. N. Exon Smith	2016-04-23	1	-0/+18
	When an operand of a distinct node hasn't been read yet, the reader can use a DistinctMDOperandPlaceholder. This is much cheaper than forward referencing from a uniqued node. Change ValueEnumerator::organizeMetadata to partition distinct nodes and uniqued nodes to reduce the overhead of cycles broken by distinct nodes.
	Mehdi measured this for me; this removes most of the RAUW from the importing step of -flto=thin, even after a WIP patch that removes string-based DITypeRefs (introducing many more cycles to the metadata graph).
	llvm-svn: 267276
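The partitioning step can be pictured as a stable partition over the enumerated nodes. The sketch below is a toy model, not the ValueEnumerator code: the `Node` class and the `organize` function are invented for the example, and the real pass works on metadata IDs rather than a plain list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    name: str
    distinct: bool


def organize(nodes):
    """Stable partition: distinct nodes first, uniqued nodes after.

    sorted() is stable, so the relative order inside each group is kept.
    """
    return sorted(nodes, key=lambda n: not n.distinct)
```

With distinct nodes emitted first, a uniqued node that refers to a distinct node always refers backwards, which is the case the message wants to make cheap.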
*	llvm-objdump: deal with invalid ARM encodings slightly better.	Tim Northover	2016-04-22	2	-2/+10
	Before we printed a warning to stderr and left the actual output stream in a mess. This tries to print a .long or .short representation of what we saw (as if there was a data-in-code directive).
	This isn't guaranteed to restore synchronization in Thumb-mode (if the invalid instruction was supposed to be 32-bits, we may be off-by-16 for the rest of the function). But there's no certain way to deal with that, and it's invalid code anyway (if the data really wasn't an instruction, the user can add proper .data_in_code directives if they care).
	llvm-svn: 267250
*	MachO: remove weird ARM/Thumb interface from MachOObjectFile	Tim Northover	2016-04-22	4	-5/+15
	Only one consumer (llvm-objdump) actually cared about the fact that there were two triples. Others were actively working around the fact that the Triple returned by getArch might have been invalid. As for llvm-objdump, it needs to be acutely aware of both Triples anyway, so being generic in the exposed API is no benefit.
	Also rename the version of getArch returning a Triple. Users were having to pass an unwanted nullptr to disambiguate the two, which was nasty.
	The only functional change here is that armv7m and armv7em object files no longer crash llvm-objdump.
	llvm-svn: 267249
*	AMDGPU: sext_inreg (srl x, K), vt -> bfe x, K, vt.Size	Matt Arsenault	2016-04-22	1	-21/+123
	llvm-svn: 267244
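The rewrite named in the subject line, sext_inreg (srl x, K), vt -> bfe x, K, vt.Size, rests on a simple identity that can be modelled on 32-bit values. The helper names below (`sext`, `sext_inreg_of_srl`, `bfe_i32`) are made up for this sketch, and `bfe_i32` only models a signed bitfield extract rather than any particular hardware instruction.

```python
def sext(value, width):
    """Sign-extend the low `width` bits of value to a Python int."""
    value &= (1 << width) - 1
    sign = 1 << (width - 1)
    return (value ^ sign) - sign


def sext_inreg_of_srl(x, k, width):
    """Logical shift right by k on a 32-bit value, then sign-extend in-register."""
    shifted = (x & 0xFFFFFFFF) >> k
    return sext(shifted, width)


def bfe_i32(x, offset, width):
    """Signed bitfield extract: `width` bits starting at bit `offset`."""
    field = ((x & 0xFFFFFFFF) >> offset) & ((1 << width) - 1)
    return sext(field, width)
```

Both forms read the same `width` bits starting at bit K and sign-extend them, so a single extract can replace the shift plus sign-extension, assuming K + width does not exceed 32.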
*	Fix llvm/test/CodeGen/ARM/Windows/dbzchk.ll not to check mixed output, take #2.	NAKAMURA Takumi	2016-04-22	1	-2/+2
	llvm-svn: 267242
*	llvm-symbolizer: Avoid infinite recursion walking dwos where the dwo ↵	David Blaikie	2016-04-22	3	-0/+10
	contains a dwo_name attribute.
	The dwo_name was added to dwo files to improve diagnostics in dwp, but it confuses tools that attempt to load any dwo named by a dwo_name, even ones inside dwos.
	Avoid this by keeping track of whether a unit is already a dwo unit, and if so, not loading further dwos.
	llvm-svn: 267241
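The guard described here can be modelled with a toy loader. Everything in this sketch is hypothetical: files are plain dictionaries, `load_units` is an invented function, and the real llvm-symbolizer logic operates on DWARF units rather than on names.

```python
def load_units(files, name, is_dwo=False, loaded=None):
    """Load the unit `name` and, unless it is itself a dwo unit, the dwo it names.

    `files` maps a file name to a dict that may carry a "dwo_name" entry.
    """
    loaded = [] if loaded is None else loaded
    unit = files[name]
    loaded.append(name)
    dwo_name = unit.get("dwo_name")
    # A dwo unit may carry a dwo_name too; following it would recurse forever.
    if dwo_name is not None and not is_dwo:
        load_units(files, dwo_name, is_dwo=True, loaded=loaded)
    return loaded
```

With `files = {"a.o": {"dwo_name": "a.dwo"}, "a.dwo": {"dwo_name": "a.dwo"}}`, loading `"a.o"` visits the object and its dwo once each and then stops, even though the dwo names itself.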
*	AMDGPU: Re-visit nodes in performAndCombine	Matt Arsenault	2016-04-22	2	-9/+12
	This fixes test regressions when i64 loads/stores are made promote.
	llvm-svn: 267240