bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
*	Write AArch64 big endian data fixup entries as BE.	Keith Walker	2016-01-20	1	-0/+6
	There was support for writing the AArch64 big endian data fixup entries in the .eh_frame section in BE. This is changed to write all such fixup entries in BE with no restriction on the section. This is similar to the existing support for fixup entries for ARM. A test is added to check the length field in the .debug_line section as this is an example of where such a fixup occurs. Differential Revision: http://reviews.llvm.org/D16064 llvm-svn: 258320
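A minimal sketch of what the byte-order change means for a data fixup. This is not the LLVM implementation: the helper name `apply_data_fixup` and its parameters are hypothetical, and the 4-byte length field is only an illustration of the kind of value patched in .debug_line.

```python
def apply_data_fixup(buf: bytearray, offset: int, value: int, size: int,
                     big_endian: bool) -> None:
    """Patch `size` bytes at `offset` with `value` in the requested byte order."""
    order = "big" if big_endian else "little"
    buf[offset:offset + size] = value.to_bytes(size, order)


# Hypothetical section buffer holding two 4-byte fields.
section = bytearray(8)
apply_data_fixup(section, 0, 0x2A, 4, big_endian=True)   # BE target data
apply_data_fixup(section, 4, 0x2A, 4, big_endian=False)  # LE target data
```

On a big-endian target the first form is the one wanted for every data fixup, regardless of which section the fixup lands in.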
*	[AVX512] Adding VPERMB Intrinsics	Michael Zuckerman	2016-01-20	2	-0/+63
	Differential Revision: http://reviews.llvm.org/D16296 llvm-svn: 258316
*	Proper handling of diamond-like cases in if-conversion	Krzysztof Parzyszek	2016-01-20	1	-0/+43
	The if-converter was somewhat careless about "diamond" cases where there was no join block, or in other words, where the true/false blocks did not have analyzable branches. In such cases, it was possible for it to remove (needed) branches, resulting in a loss of entire basic blocks. Differential Revision: http://reviews.llvm.org/D16156 llvm-svn: 258310
*	AVX512: Store (MOVNTPD, MOVNTPS, MOVNTDQ) using non-temporal hint intrinsic ↵	Igor Breger	2016-01-20	1	-0/+32
	implementation. Differential Revision: http://reviews.llvm.org/D16350 llvm-svn: 258309
*	[AArch64] Fix two bugs in the .inst directive	Oliver Stannard	2016-01-20	1	-2/+13
	The AArch64 .inst directive was implemented using EmitIntValue, which resulted in both $x and $d (code and data) mapping symbols being emitted at the same address. This fixes it to only emit the $x mapping symbol. EmitIntValue also emits the value in big-endian order when targeting big-endian systems, but instructions are always emitted in little-endian order for AArch64. Differential Revision: http://reviews.llvm.org/D16349 llvm-svn: 258308
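The byte-order half of this fix can be illustrated with a short sketch. The function names below are hypothetical and this is not the assembler's code; it only shows that an instruction word keeps little-endian encoding even when data words for the same target are big-endian.

```python
import struct


def encode_inst(word: int) -> bytes:
    """Encode a 32-bit A64 instruction word: always little-endian."""
    return struct.pack("<I", word)


def encode_data_word(word: int, big_endian: bool) -> bytes:
    """Encode a 32-bit data word in the target's data byte order."""
    return struct.pack(">I" if big_endian else "<I", word)


NOP = 0xD503201F  # A64 NOP encoding

inst_bytes = encode_inst(NOP)
data_bytes_be = encode_data_word(NOP, big_endian=True)
```

Treating `.inst 0xd503201f` as data on a big-endian target would produce `data_bytes_be`, which is the wrong byte sequence for an instruction.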
*	[SelectionDAG] Fold more offsets into GlobalAddresses	Dan Gohman	2016-01-20	3	-11/+683
	SelectionDAG previously missed opportunities to fold constants into GlobalAddresses in several areas. For example, given `(add (add GA, c1), y)`, it would often reassociate to `(add (add GA, y), c1)`, missing the opportunity to create `(add GA+c, y)`. This isn't often visible on targets such as X86, which effectively reassociate adds in their complex address-mode folding logic; however, it is currently visible on WebAssembly, since it currently has very simple address mode folding code that doesn't reassociate anything. This patch fixes this by making SelectionDAG fold offsets into GlobalAddresses at the same times that it folds constants together, so that it doesn't miss any opportunities to perform such folding. Differential Revision: http://reviews.llvm.org/D16090 llvm-svn: 258296
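A toy sketch of the folding described above, using nested tuples as a stand-in for DAG nodes. The node shapes (`("add", a, b)`, `("ga", name, offset)`) and the `fold` function are assumptions made for illustration and do not mirror SelectionDAG's data structures.

```python
def fold(expr):
    """Fold a constant addend into a global-address node where possible."""
    if not (isinstance(expr, tuple) and expr[0] == "add"):
        return expr
    lhs, rhs = fold(expr[1]), fold(expr[2])
    # Canonicalize so that a constant operand, if any, is on the right.
    if isinstance(lhs, int):
        lhs, rhs = rhs, lhs
    if isinstance(rhs, int):
        if isinstance(lhs, int):
            return lhs + rhs                      # constant + constant
        if isinstance(lhs, tuple) and lhs[0] == "ga":
            return ("ga", lhs[1], lhs[2] + rhs)   # GA + c  ->  GA+c
    return ("add", lhs, rhs)


ga = ("ga", "g", 0)
folded = fold(("add", ("add", ga, 8), "y"))        # (add (add GA, c1), y)
reassociated = fold(("add", ("add", ga, "y"), 8))  # (add (add GA, y), c1)
```

In this sketch the first form folds to `(add GA+8, y)`, while the reassociated form leaves the constant outside, which is the missed opportunity the commit message describes.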
*	[WebAssembly] Tighten up some regexes in some tests.	Dan Gohman	2016-01-20	4	-80/+80
	llvm-svn: 258295
*	[WebAssembly] Don't stackify stores across instructions with side effects.	Dan Gohman	2016-01-20	2	-12/+36
	llvm-svn: 258285
*	[Inliner/WinEH] Honor implicit nounwinds	Joseph Tremoulet	2016-01-20	1	-0/+455
	Summary: Funclet EH tables require that a given funclet have only one unwind destination for exceptional exits. The verifier will therefore reject e.g. two cleanuprets with different unwind dests for the same cleanup, or two invokes exiting the same funclet but to different unwind dests. Because catchswitch has no 'nounwind' variant, and because IR producers are not required to annotate calls which will not unwind as 'nounwind', it is legal to nest a call or an "unwind to caller" catchswitch within a funclet pad that has an unwind destination other than caller; it is undefined behavior for such a call or catchswitch to unwind. Normally when inlining an invoke, calls in the inlined sequence are rewritten to invokes that unwind to the callsite invoke's unwind destination, and "unwind to caller" catchswitches in the inlined sequence are rewritten to unwind to the callsite invoke's unwind destination. However, if such a call or "unwind to caller" catchswitch is located in a callee funclet that has another exceptional exit with an unwind destination within the callee, applying the normal transformation would give that callee funclet multiple unwind destinations for its exceptional exits. There would be no way for EH table generation to determine which is the "true" exit, and the verifier would reject the function accordingly. Add logic to the inliner to detect these cases and leave such calls and "unwind to caller" catchswitches as calls and "unwind to caller" catchswitches in the inlined sequence. This fixes PR26147. Reviewers: rnk, andrew.w.kaylor, majnemer Subscribers: alexcrichton, llvm-commits Differential Revision: http://reviews.llvm.org/D16319 llvm-svn: 258273
*	AMDGPU/SI: Prevent the DAGCombiner from creating setcc with i1 inputs	Tom Stellard	2016-01-20	3	-2/+69
	Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D15035 llvm-svn: 258256
*	[MachineSink] Don't break ImplicitNulls	Sanjoy Das	2016-01-20	1	-0/+49
	Summary: This teaches MachineSink to not sink instructions that might break the implicit null check optimization that runs later. This should not affect frontends that do not use implicit null checks. Reviewers: aadg, reames, hfinkel, atrick Subscribers: majnemer, llvm-commits Differential Revision: http://reviews.llvm.org/D14632 llvm-svn: 258254
*	[X86] Do not run shrink-wrapping on function with split-stack attribute or HiPE	Quentin Colombet	2016-01-19	1	-6/+77
	calling convention. The implementation of the related callbacks in the x86 backend for such functions is not ready to deal with a prologue block that is not the entry block of the function. This fixes PR26107, but the longer-term solution would be to fix those callbacks. llvm-svn: 258221
*	add tests to show missing memset/malloc optimizations (PR25892)	Sanjay Patel	2016-01-19	2	-0/+75
	llvm-svn: 258218
*	[MC, COFF] Add .reloc support for WinCOFF	David Majnemer	2016-01-19	2	-0/+46
	This adds rudimentary support for a few relocations that we will use for the CodeView debug format. llvm-svn: 258216
*	[X86][SSE] Add VZEXT_MOVL target shuffle decoding.	Simon Pilgrim	2016-01-19	2	-12/+4
	Add support for decoding VZEXT_MOVL target shuffle masks, allowing it to be used as a source in target shuffle combines. llvm-svn: 258215
*	[X86][SSE] Add INSERTPS target shuffle combines.	Simon Pilgrim	2016-01-19	3	-28/+8
	As vector shuffles can only reference two inputs, many (V)INSERTPS patterns end up being split over two target shuffles. This patch adds combines to attempt to combine (V)INSERTPS nodes with input/output nodes that are just zeroing out these additional vector elements. Differential Revision: http://reviews.llvm.org/D16072 llvm-svn: 258205
*	[SCEV] Fix PR26207	Sanjoy Das	2016-01-19	1	-0/+20
	In some cases, the max backedge taken count can be more conservative than the exact backedge taken count (for instance, because ScalarEvolution::getRange is not control-flow sensitive whereas computeExitLimitFromICmp can be). In these cases, computeExitLimitFromCond (specifically the bit that deals with `and` and `or` instructions) can create an ExitLimit instance with a `SCEVCouldNotCompute` max backedge count expression, but a computable exact backedge count expression. This violates an implicit SCEV assumption: a computable exact BE count should imply a computable max BE count. This change: (1) makes the above implicit invariant explicit by adding an assert to ExitLimit's constructor; (2) changes `computeExitLimitFromCond` to be more robust around conservative max backedge counts. llvm-svn: 258184
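The invariant stated above can be sketched as follows. This is an illustrative model only: the Python class, the `COULD_NOT_COMPUTE` sentinel, and the fallback in `with_conservative_max` (reuse the exact count as the max) are assumptions, not a description of what the patch actually does inside `computeExitLimitFromCond`.

```python
COULD_NOT_COMPUTE = object()  # stand-in for SCEVCouldNotCompute


class ExitLimit:
    """Toy model: a computable exact count implies a computable max count."""

    @staticmethod
    def is_consistent(exact, max_count) -> bool:
        return exact is COULD_NOT_COMPUTE or max_count is not COULD_NOT_COMPUTE

    def __init__(self, exact, max_count):
        if not ExitLimit.is_consistent(exact, max_count):
            raise ValueError("exact count is computable but max count is not")
        self.exact = exact
        self.max_count = max_count


def with_conservative_max(exact, max_count) -> ExitLimit:
    """One possible way to stay robust: fall back to the exact count as max."""
    if exact is not COULD_NOT_COMPUTE and max_count is COULD_NOT_COMPUTE:
        max_count = exact
    return ExitLimit(exact, max_count)
```

Checking the invariant at construction time means any caller that builds an inconsistent pair fails immediately rather than producing a silently wrong result later.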
*	[AVX512] Adding VPERMT2B and VPERMI2B instructions.	Michael Zuckerman	2016-01-19	1	-0/+239
	Differential Revision: http://reviews.llvm.org/D16297 llvm-svn: 258161
*	[LibCallSimplifier] use instruction-level fast-math-flags to shrink calls	Sanjay Patel	2016-01-19	1	-20/+18
	This is a continuation of adding FMF to call instructions: http://reviews.llvm.org/rL255555 llvm-svn: 258158
*	[LibCallSimplifier] use instruction-level fast-math-flags to transform ↵	Sanjay Patel	2016-01-19	1	-21/+20
	pow(x, [small integer]) calls This is a continuation of adding FMF to call instructions: http://reviews.llvm.org/rL255555 As with D15937, the intent of the patch is to preserve the current behavior of the transform except that we use the pow call's 'fast' attribute as a trigger rather than a function-level attribute. The TODO comment notes a potential follow-on patch that would propagate FMF to the new instructions. Differential Revision: http://reviews.llvm.org/D16122 llvm-svn: 258153
*	[AVX512] Adding VPERMB instruction	Michael Zuckerman	2016-01-19	1	-0/+123
	Differential Revision: http://reviews.llvm.org/D16294 llvm-svn: 258144
*	[WebAssembly] Rematerialize constants rather than hold them live in registers.	Dan Gohman	2016-01-19	8	-60/+81
	Teach the register stackifier to rematerialize constants that have multiple uses instead of leaving them in registers. In the WebAssembly encoding, it's the same code size to materialize most constants as it is to read a value from a register. llvm-svn: 258142
*	[WebAssembly] Change a FIXME to a TODO in a comment.	Dan Gohman	2016-01-19	1	-1/+1
	llvm-svn: 258139
*	[WebAssembly] Re-enable this test, now that interactions with the coalescer ↵	Dan Gohman	2016-01-19	1	-3/+8
	are resolved. llvm-svn: 258138
*	[X86] Add support for "xlat m8"	Marina Yatsina	2016-01-19	1	-0/+4
	According to the x86 spec, "xlat m8" is a legal instruction and it is equivalent to "xlatb". Differential Revision: http://reviews.llvm.org/D15150 llvm-svn: 258135
*	Fix constant folding of constant vector GEPs with undef or null as pointer ↵	Manuel Jacob	2016-01-19	1	-0/+4
	argument. Reviewers: eddyb Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D16321 llvm-svn: 258134
*	[X86] Adding support for missing variations of X86 string related instructions	Marina Yatsina	2016-01-19	2	-0/+40
	The following are legal according to the X86 spec: ins mem, DX; outs DX, mem; lods mem; stos mem; scas mem; cmps mem, mem; movs mem, mem. Differential Revision: http://reviews.llvm.org/D14827 llvm-svn: 258132
*	[WebAssembly] Re-enable loop idiom recognition for memcpy et al.	Dan Gohman	2016-01-19	1	-53/+0
	llvm-svn: 258125
*	[X86][AVX512]fix dag & add intrinsics for fixupimm	Asaf Badouh	2016-01-19	2	-0/+346
	Cover all widths and types (pd/ps/sd/ss) of the fixupimm instruction and intrinsics. Differential Revision: http://reviews.llvm.org/D16313 llvm-svn: 258124
*	[LTO] Restore original linkage of externals prior to splitting	Tobias Edler von Koch	2016-01-18	1	-0/+24
	Summary: This is a companion patch for http://reviews.llvm.org/D16124. Internalized symbols increase the size of strongly-connected components in SCC-based module splitting and thus reduce the amount of parallelism. This patch records the original linkage of non-local symbols prior to internalization and then restores it just before splitting/CodeGen. This is also useful for cases where the linker requires symbols to remain external, for instance, so they can be placed according to linker script rules. It's currently under its own flag (-restore-globals) but should eventually share a common flag with D16124. Reviewers: joker.eph, pcc Subscribers: slarin, llvm-commits, joker.eph Differential Revision: http://reviews.llvm.org/D16229 llvm-svn: 258100
*	AMDGPU: Reduce 64-bit SRAs	Matt Arsenault	2016-01-18	3	-20/+30
	llvm-svn: 258096
*	AMDGPU: Split 64-bit and of constant up	Matt Arsenault	2016-01-18	3	-51/+399
	This breaks the tests that were meant for testing 64-bit inline immediates, so move those to shl where they won't be broken up. This should be repeated for the other related bit ops. llvm-svn: 258095
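A short arithmetic sketch of why a 64-bit `and` with a constant can be split into two independent 32-bit operations. The function name `and64_split` is hypothetical and this models only the arithmetic identity, not the AMDGPU lowering.

```python
MASK32 = 0xFFFFFFFF


def and64_split(x: int, c: int) -> int:
    """Compute a 64-bit `x & c` as two 32-bit ands, one per half."""
    lo = (x & MASK32) & (c & MASK32)
    hi = ((x >> 32) & MASK32) & ((c >> 32) & MASK32)
    return (hi << 32) | lo


X = 0x123456789ABCDEF0
C = 0x00FF00FF0F0F0F0F
```

Because the two halves never interact, a half whose constant part is all zeros or all ones can be simplified on its own.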
*	[X86][AVX2] Ensure integer execution domain for integer blend tests	Simon Pilgrim	2016-01-18	1	-2/+6
	llvm-svn: 258094
*	AMDGPU: Generalize shl combine	Matt Arsenault	2016-01-18	1	-0/+47
	Reduce 64-bit shl with constant >= 32. We already special-cased this for the == 32 case, but this also works for any >= 32 constant. llvm-svn: 258092
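The reduction can be checked with a small arithmetic sketch. The helper name `shl64_by_const` is hypothetical; the code models only the identity that, for a shift amount of at least 32, the low half of the result is zero and the high half is a 32-bit shift of the original low half.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def shl64_by_const(x: int, c: int) -> int:
    """64-bit `x << c` for 32 <= c < 64, using one 32-bit shift."""
    assert 32 <= c < 64
    lo = x & MASK32
    hi = (lo << (c - 32)) & MASK32  # 32-bit shift of the low half
    return hi << 32                 # low half of the result is zero


X = 0x0123456789ABCDEF
```

The == 32 case is the instance where the 32-bit shift amount is zero, so the low half simply moves into the high half.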
*	[X86][SSE] Regenerate vector blend commutation tests	Simon Pilgrim	2016-01-18	2	-46/+48
	llvm-svn: 258091
*	AMDGPU: Reduce 64-bit lshr by constant to 32-bit	Matt Arsenault	2016-01-18	2	-2/+64
	64-bit shifts are very slow on some subtargets. llvm-svn: 258090
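A sketch of the corresponding identity for logical right shifts, assuming the reduction applies to shift amounts of at least 32 (the commit message does not state the range). The helper name `lshr64_by_const` is hypothetical.

```python
MASK32 = 0xFFFFFFFF


def lshr64_by_const(x: int, c: int) -> int:
    """64-bit logical `x >> c` for 32 <= c < 64, using one 32-bit shift."""
    assert 32 <= c < 64
    hi = (x >> 32) & MASK32
    lo = hi >> (c - 32)  # 32-bit shift of the original high half
    return lo            # high half of the result is zero


X = 0xFEDCBA9876543210
```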
*	[JIT] Add small-code model test for ELF.	Davide Italiano	2016-01-18	1	-0/+15
	The coverage is almost non-existent; hopefully more will come after this. Differential Revision: http://reviews.llvm.org/D16096 llvm-svn: 258087
*	AMDGPU: Cleanup sra test	Matt Arsenault	2016-01-18	1	-163/+206
	llvm-svn: 258086
*	Add to the split module utility an SCC based method which allows not to ↵	Sergei Larin	2016-01-18	7	-0/+313
	globalize any local variables. Summary: Currently llvm::SplitModule, as the first step, globalizes all local objects, which might not be desirable in some scenarios. This change adds a new flag to llvm::SplitModule that uses an SCC approach to search for a balanced partition without the need to externalize symbols. Such a partition might not be possible or fully balanced for a given number of partitions, and is a function of the module properties (global/local dependencies within the module). Joint development by Tobias Edler von Koch (tobias@codeaurora.org) and Sergei Larin (slarin@codeaurora.org) Subscribers: llvm-commits, joker.eph Differential Revision: http://reviews.llvm.org/D16124 llvm-svn: 258083
*	[X86][AVX2] Broadcast subvectors	Simon Pilgrim	2016-01-18	7	-18/+137
	AVX2 can only broadcast from the zeroth element of a vector, but if the broadcastable element is the zeroth element of a 128-bit subvector it is advantageous to extract the subvector, broadcast from that and avoid the loading of shuffle mask data that would be needed for VPERMPS/VPERMD. The only exception is when the source type is 4f64 or 4i64, which can use the immediate shuffle VPERMPD/VPERMQ directly. Differential Revision: http://reviews.llvm.org/D16050 llvm-svn: 258081
*	AVX512: Masked store intrinsic implementation.	Igor Breger	2016-01-18	4	-12/+400
	Implemented intrinsics for the following store instructions: VMOVDQU8/16/32/64, VMOVDQA32/64, VMOVAPS/PD, VMOVUPS/PD. Differential Revision: http://reviews.llvm.org/D16271 llvm-svn: 258047
*	AVX512 : Change v8i1 bitconvert GR8 pattern, remove unnecessary movzbl ↵	Igor Breger	2016-01-18	11	-1360/+1126
	instruction. Code example, previous implementation: `movzbl %dil, %eax; kmovw %eax, %k0`. New code: `kmovw %edi, %k0`. Differential Revision: http://reviews.llvm.org/D16287 llvm-svn: 258045
*	[ARM] Operands for PKHTB alias should be swapped	Oliver Stannard	2016-01-18	2	-2/+2
	When the shift immediate is zero, PKHTB is an alias for PKHBT, but the order of the input operands needs to be swapped. Differential Revision: http://reviews.llvm.org/D16288 llvm-svn: 258044
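The operand swap can be checked with a small model of the two halfword-packing operations. The Python functions below are illustrative only; they model the unshifted PKHTB form and PKHBT with an optional left shift, and do not cover the instruction encodings.

```python
MASK32 = 0xFFFFFFFF


def pkhbt(rn: int, rm: int, lsl: int = 0) -> int:
    """Bottom halfword from rn, top halfword from (rm << lsl)."""
    return (rn & 0x0000FFFF) | (((rm << lsl) & MASK32) & 0xFFFF0000)


def pkhtb_no_shift(rn: int, rm: int) -> int:
    """Unshifted PKHTB form: top halfword from rn, bottom halfword from rm."""
    return (rn & 0xFFFF0000) | (rm & 0x0000FFFF)


A = 0x11112222
B = 0x33334444
```

With no shift, `pkhtb_no_shift(A, B)` takes the top of `A` and the bottom of `B`, which is what `pkhbt(B, A)` computes, so the alias is only correct with the inputs swapped.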
*	[IndVars] Fix PR25576	Sanjoy Das	2016-01-17	1	-0/+31
	`LCSSASafePhiForRAUW` as computed was incorrect -- in cases like these (this exact example does not actually trigger the bug):
	  define i32 @f(i32 %n, i1* %c) {
	  entry:
	    br label %outer.loop
	  outer.loop:
	    br label %inner.loop
	  inner.loop:
	    %iv = phi i32 [ 0, %outer.loop ], [ %iv.inc, %inner.loop ]
	    %iv.inc = add nuw nsw i32 %iv, 1
	    %tc = udiv i32 %n, 13
	    %be.cond = icmp ult i32 %iv, %tc
	    br i1 %be.cond, label %inner.loop, label %inner.exit
	  inner.exit:
	    %iv.lcssa = phi i32 [ %iv, %inner.loop ]
	    %outer.be.cond = load volatile i1, i1* %c
	    br i1 %outer.be.cond, label %outer.loop, label %leave
	  leave:
	    %iv.lcssa.lcssa = phi i32 [ %iv.lcssa, %inner.exit ]
	    ret i32 %iv.lcssa.lcssa
	  }
	`LCSSASafePhiForRAUW` is true for `%iv.lcssa` when re-rewriting the exit value of `%iv` for `%inner.loop` to `%tc` (this can happen due to `SCEVExpander::findExistingExpansion`), but the RAUW breaks LCSSA. To fix this, instead of computing `SafePhi` with special logic, decide the safety of RAUW directly via `replacementPreservesLCSSAForm`. llvm-svn: 258016
*	[X86][AVX512] Regenerate v1 shuffle tests	Simon Pilgrim	2016-01-17	1	-2/+2
	llvm-svn: 258013
*	Push isDereferenceableAndAlignedPointer down into isSafeToLoadUnconditionally	Artur Pilipenko	2016-01-17	3	-1/+69
	Reviewed By: reames Differential Revision: http://reviews.llvm.org/D16226 llvm-svn: 258010
*	[AVX512] Adding VPERMW/D/Q VPERMPS/D Intrinsics	Michael Zuckerman	2016-01-17	4	-1/+231
	Differential Revision: http://reviews.llvm.org/D16189 llvm-svn: 258008
*	[AVX512] Adding VPERMQ VPERMPD Intrinsics	Michael Zuckerman	2016-01-17	2	-0/+86
	Differential Revision: http://reviews.llvm.org/D16194 llvm-svn: 258006
*	Remove some stale comments and fix a typo as suggested by David Blaikie in his	Lang Hames	2016-01-17	2	-2/+0
	review of r257343. Thanks Dave! llvm-svn: 258002
*	[llvm-readobj][ELF] Teach llvm-readobj to show dynamic relocation in REL format	Simon Atanasyan	2016-01-16	2	-0/+71
	The MIPS 32-bit ABI uses the REL relocation record format to save dynamic relocations. The patch teaches llvm-readobj to show dynamic relocations in this format. Differential Revision: http://reviews.llvm.org/D16114 llvm-svn: 258001
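For context, a minimal sketch of how a 32-bit ELF REL record differs from a RELA record: REL carries only an offset and an info word, with no explicit addend. The parser functions and the sample entry are hypothetical and unrelated to llvm-readobj's implementation.

```python
import struct


def parse_rel32(data: bytes, little_endian: bool = True) -> dict:
    """Parse one Elf32_Rel record: r_offset, r_info (8 bytes, no addend)."""
    prefix = "<" if little_endian else ">"
    offset, info = struct.unpack(prefix + "II", data[:8])
    return {"offset": offset, "sym": info >> 8, "type": info & 0xFF}


def parse_rela32(data: bytes, little_endian: bool = True) -> dict:
    """Parse one Elf32_Rela record: r_offset, r_info, r_addend (12 bytes)."""
    prefix = "<" if little_endian else ">"
    offset, info, addend = struct.unpack(prefix + "IIi", data[:12])
    return {"offset": offset, "sym": info >> 8, "type": info & 0xFF,
            "addend": addend}


# Made-up entry: offset 0x1000, symbol index 5, relocation type 3.
sample_rel = struct.pack("<II", 0x1000, (5 << 8) | 3)
sample_rela = sample_rel + struct.pack("<i", -4)
```

With REL, the addend is stored implicitly at the relocated location, so a dumper has only the offset, type, and symbol to print for each entry.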