bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message (Collapse)	Author	Age	Files	Lines
...
*	[InstCombine] update checks; NFC	Sanjay Patel	2016-11-08	1	-8/+9
\| \| \| \|	llvm-svn: 286255
*	GlobalISel: support selecting fpext/fptrunc instructions on AArch64.	Tim Northover	2016-11-08	1	-0/+51
\| \| \| \|	llvm-svn: 286253
*	Fix PR27500: on MSP430 the branch destination offset is measured in words, ↵	Anton Korobeynikov	2016-11-08	1	-0/+588
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	not bytes. Summary: In addition, the branch instructions will have proper BB destinations, not offsets, like before. Reviewers: asl Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D23718 llvm-svn: 286252
*	[X86][SSE] Regenerate test (just adds missing header)	Simon Pilgrim	2016-11-08	1	-0/+1
\| \| \| \|	llvm-svn: 286241
*	[TargetLowering] Fix undef vector element issue with true/false result handling	Simon Pilgrim	2016-11-08	3	-19/+12
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Fixed an issue with vector usage of TargetLowering::isConstTrueVal / TargetLowering::isConstFalseVal boolean result matching. The comment said we shouldn't handle constant splat vectors with undef elements. But the the actual code was returning false if the build vector contained no undef elements.... This patch now ignores the number of undefs (getConstantSplatNode will return null if the build vector is all undefs). The change has also unearthed a couple of missed opportunities in AVX512 comparison code that will need to be addressed. Differential Revision: https://reviews.llvm.org/D26031 llvm-svn: 286238
*	[JumpThreading] Unfold selects that depend on the same condition	Pablo Barrio	2016-11-08	1	-0/+45
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: These are good candidates for jump threading. This enables later opts (such as InstCombine) to combine instructions from the selects with instructions out of the selects. SimplifyCFG will fold the select again if unfolding wasn't worth it. Patch by James Molloy and Pablo Barrio. Reviewers: rengolin, haicheng, sebpop Subscribers: jojo, jmolloy, llvm-commits Differential Revision: https://reviews.llvm.org/D26391 llvm-svn: 286236
*	[VectorLegalizer] Expansion of CTLZ using CTPOP when possible	Simon Pilgrim	2016-11-08	4	-1275/+672
\| \| \| \| \| \| \| \| \| \|	This patch avoids scalarization of CTLZ by instead expanding to use CTPOP (ref: "Hacker's Delight") when the necessary operations are available. This also adds the necessary cost models for X86 SSE2 targets (the main beneficiary) to ensure vectorization only happens when its useful. Differential Revision: https://reviews.llvm.org/D25910 llvm-svn: 286233
*	[AArch64] Fix incorrect CSEL node created	Roger Ferrer Ibanez	2016-11-08	1	-0/+15
\| \| \| \| \| \| \| \| \| \| \| \|	Under -enable-unsafe-fp-math, SELECT_CC lowering in AArch64 transforms floating point comparisons of the form "a == 0.0 ? 0.0 : x" to "a == 0.0 ? a : x". But it incorrectly assumes that 'x' and 'a' have the same type which can lead to a wrong CSEL node that crashes later due to nonsensical copies. Differential Revision: https://reviews.llvm.org/D26394 llvm-svn: 286231
*	[mips] Renable small data section test.	Simon Dardis	2016-11-08	1	-16/+8
\| \| \| \|	llvm-svn: 286230
*	[AVX-512] Add an avx512f without avx512vl command line to vec_fp_to_int.ll ↵	Craig Topper	2016-11-08	1	-129/+454
\| \| \| \| \| \|	and regenerate. This will make a change in a future patch easier to see. NFC llvm-svn: 286216
*	GlobalISel: support selecting G_SELECT on AArch64.	Tim Northover	2016-11-08	1	-0/+44
\| \| \| \|	llvm-svn: 286185
*	GlobalISel: constrain PHI registers on AArch64.	Tim Northover	2016-11-08	1	-0/+40
\| \| \| \| \| \| \| \| \| \|	Self-referencing PHI nodes need their destination operands to be constrained because nothing else is likely to do so. For now we just pick a register class naively. Patch mostly by Ahmed again. llvm-svn: 286183
*	[AArch64] Remove dead check prefixes after r286110. NFC.	Chad Rosier	2016-11-07	1	-3/+3
\| \| \| \|	llvm-svn: 286174
*	[AArch64] Rename test to reflect changes after r286110. NFC.	Chad Rosier	2016-11-07	1	-0/+0
\| \| \| \|	llvm-svn: 286173
*	[AMDGPU] Allow hoisting of comparisons out of a loop and eliminate condition ↵	Stanislav Mekhanoshin	2016-11-07	2	-3/+49
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	copies Codegen prepare sinks comparisons close to a user is we have only one register for conditions. For AMDGPU we have many SGPRs capable to hold vector conditions. Changed BE to report we have many condition registers. That way IR LICM pass would hoist an invariant comparison out of a loop and codegen prepare will not sink it. With that done a condition is calculated in one block and used in another. Current behavior is to store workitem's condition in a VGPR using v_cndmask and then restore it with yet another v_cmp instruction from that v_cndmask's result. To mitigate the issue a forward propagation of a v_cmp 64 bit result to an user is implemented. Additional side effect of this is that we may consume less VGPRs in a cost of more SGPRs in case if holding of multiple conditions is needed, and that is a clear win in most cases. llvm-svn: 286171
*	[OptDiag, opt-viewer] Save callee's location and display as link	Adam Nemet	2016-11-07	2	-0/+6
\| \| \| \| \| \| \| \| \| \| \| \| \|	With this we get a new field in the YAML record if the value being streamed out has a debug location. For examples, please see the changes to the tests. This is then used in opt-viewer to display a link for the callee function in the inlining remarks. Differential Revision: https://reviews.llvm.org/D26366 llvm-svn: 286169
*	[AArch64] Transfer memory operands when lowering vector load/store intrinsics	Sanjin Sijaric	2016-11-07	2	-2/+62
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Some vector loads and stores generated from AArch64 intrinsics alias each other unnecessarily, preventing better scheduling. We just need to transfer memory operands during lowering. Reviewers: mcrosier, t.p.northover, jmolloy Subscribers: aemerson, rengolin, llvm-commits Differential Revision: https://reviews.llvm.org/D26313 llvm-svn: 286168
*	[WebAssembly] Emit a BasePointer when we have overly-aligned stack objects	Derek Schuff	2016-11-07	1	-0/+137
\| \| \| \| \| \| \| \| \| \| \| \|	Because we shift the stack pointer by an unknown amount, we need an additional pointer. In the case where we have variable-size objects as well, we can't reuse the frame pointer, thus three pointers. Patch by Jacob Gravelle Differential Revision: https://reviews.llvm.org/D26263 llvm-svn: 286160
*	Avoid tail recursion elimination across calls with operand bundles	Sanjoy Das	2016-11-07	1	-0/+57
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: In some specific scenarios with well understood operand bundle types (like `"deopt"`) it may be possible to go ahead and convert recursion to iteration, but TailRecursionElimination does not have that logic today so avoid doing the right thing for now. I need some input on whether `"funclet"` operand bundles should also block tail recursion elimination. If not, I'll allow TRE across calls with `"funclet"` operand bundles and add a test case. Reviewers: rnk, majnemer, nlewycky, ahatanak Subscribers: mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D26270 llvm-svn: 286147
*	[tsan] Cast floating-point types correctly when instrumenting atomic ↵	Kuba Brecka	2016-11-07	1	-0/+51
\| \| \| \| \| \| \| \| \| \|	accesses, LLVM part Although rare, atomic accesses to floating-point types seem to be valid, i.e. `%a = load atomic float ...`. The TSan instrumentation pass however tries to emit inttoptr, which is incorrect, we should use a bitcast here. Anyway, IRBuilder already has a convenient helper function for this. Differential Revision: https://reviews.llvm.org/D26266 llvm-svn: 286135
*	AMDGPU: Remove unnecessary and on conditional branch	Matt Arsenault	2016-11-07	9	-34/+16
\| \| \| \| \| \| \|	The comment explaining why this was necessary is incorrect in its description of v_cmp's behavior for inactive workitems. llvm-svn: 286134
*	AMDGPU: Preserve vcc undef flags when inverting branch	Matt Arsenault	2016-11-07	2	-0/+266
\| \| \| \| \| \| \| \| \| \| \| \| \|	If the branch was on a read-undef of vcc, passes that used analyzeBranch to invert the branch condition wouldn't preserve the undef flag resulting in a verifier error. Fixes verifier failures in a future commit. Also fix verifier error when inserting copy for vccz corruption bug. llvm-svn: 286133
*	[MemCpyOpt] Don't emit IR in an unspecified order	Benjamin Kramer	2016-11-07	1	-28/+28
\| \| \| \| \| \| \| \| \|	Argument evaluation order is one of the edge cases where Clang differs from GCC, yielding different IR depending on which compiler LLVM was built with. Make the order deterministic and tune the test to actually verify the order instead of trying to hide it. llvm-svn: 286126
*	Add -O0 support for @llvm.invariant.group.barrier by discarding it if it ↵	Richard Smith	2016-11-07	1	-0/+7
\| \| \| \| \| \| \| \|	gets to ISel. Differential Revision: https://reviews.llvm.org/D26292 llvm-svn: 286119
*	[InstCombine] allow splat vector folds in adjustMinMax() (retry r285732)	Sanjay Patel	2016-11-07	1	-48/+45
\| \| \| \| \| \| \| \|	This was reverted at r285866 because there was a crash handling a scalar select of vectors. I added a check for that pattern and a test case based on the example provided in the post-commit thread for r285732. llvm-svn: 286113
*	This patch adds support for 16 bit floating point registers to the inline ↵	Amara Emerson	2016-11-07	1	-0/+20
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	asm register selection on AArch64. Without this patch, register allocation for the example below fails. define half @test(half %a1, half %a2) #0 { entry: %0 = tail call half asm "sqrshl ${0:h}, ${1:h}, ${2:h}", "=w,w,w" (half %a1, half %a2) #1 ret half %0 } Patch by Florian Hahn. Differential Revision: https://reviews.llvm.org/D25080 llvm-svn: 286111
*	[AArch64] Removed the narrow load merging code in the ld/st optimizer.	Chad Rosier	2016-11-07	1	-320/+5
\| \| \| \| \| \| \| \|	This feature has been disabled for some time now, so remove cruft. Differential Revision: https://reviews.llvm.org/D26248 llvm-svn: 286110
*	[AliasSetTracker] Make AST smarter about assume intrinsics that don't ↵	Chad Rosier	2016-11-07	1	-0/+19
\| \| \| \| \| \| \| \|	actually affect memory. Differential Revision: https://reviews.llvm.org/D26252 llvm-svn: 286108
*	[Thumb1] Move padding earlier when synthesizing TBBs off of the PC	James Molloy	2016-11-07	1	-1/+1
\| \| \| \| \| \| \| \| \| \|	When the base register (register pointing to the jump table) is the PC, we expect the jump table to directly follow the jump sequence with no intervening padding. If there is intervening padding, the calculated offsets will not be correct. One solution would be to account for any padding in the emitted LDRB instruction, but at the moment we don't support emitting MCExprs for the load offset. In the meantime, it's correct and only a slight amount worse to just move the padding up, from just before the jump table to just before the jump instruction sequence. We can do that by emitting code alignment before the jump sequence, as we know the number of instructions in the sequence is always 4. llvm-svn: 286107
*	[X86][AVX512] Add AVX512VL/AVX512BWVL vector truncation tests	Simon Pilgrim	2016-11-07	1	-65/+313
\| \| \| \|	llvm-svn: 286105
*	[X86][SSE] Drop unnecessary -mcpu argument from trunc tests	Simon Pilgrim	2016-11-07	1	-7/+7
\| \| \| \| \| \|	cpu/triple duplication llvm-svn: 286104
*	[AVX-512] Remove masked pmovzx/pmovsx builtins and autoupgrade them to ↵	Craig Topper	2016-11-07	8	-779/+778
\| \| \| \| \| \| \| \|	selects and native zext/sext. This mostly reuses earlier autoupgrade support for the sse and avx equivalents. Just needed to add the code to add the select. llvm-svn: 286092
*	[AVX-512] Remove 128/256 masked pshufb intrinsics. Autoupgrade them to ↵	Craig Topper	2016-11-07	2	-32/+33
\| \| \| \| \| \|	legacy intrinsics and a select. llvm-svn: 286089
*	ARM: lower fpowi appropriately for Windows ARM	Saleem Abdulrasool	2016-11-06	1	-0/+57
\| \| \| \| \| \| \| \| \| \| \|	This handles the last case of the builtin function calls that we would generate code which differed from Microsoft's ABI. Rather than generating a call to `__pow{d,s}i2` we now promote the parameter to a float or double and invoke `powf` or `pow` instead. Addresses PR30825! llvm-svn: 286082
*	[SelectionDAG] Add support for vector demandedelts in XOR opcodes	Simon Pilgrim	2016-11-06	1	-10/+2
\| \| \| \|	llvm-svn: 286075
*	[X86] Add knownbits vector xor test	Simon Pilgrim	2016-11-06	1	-0/+31
\| \| \| \| \| \|	In preparation for demandedelts support llvm-svn: 286074
*	[AVX-512] Remove intrinsics for 128/256-bit masked variable shift. Instead ↵	Craig Topper	2016-11-06	2	-215/+215
\| \| \| \| \| \|	upgrade them to a select and the older AVX2 intrinsic. llvm-svn: 286073
*	[AVX-512] Remove intrinsics for 128/256-bit masked shift by immediate. ↵	Craig Topper	2016-11-06	4	-325/+244
\| \| \| \| \| \|	Instead upgrade them to a select and the older SSE/AVX2 intrinsic. llvm-svn: 286072
*	[SelectionDAG] Add support for vector demandedelts in OR opcodes	Simon Pilgrim	2016-11-06	1	-10/+2
\| \| \| \|	llvm-svn: 286071
*	[AVX-512] Remove intrinsics for 128/256-bit masked shift by single element ↵	Craig Topper	2016-11-06	4	-303/+304
\| \| \| \| \| \|	in xmm. Instead upgrade them to a select and the older SSE/AVX2 intrinsic. llvm-svn: 286070
*	[AVX-512] Remove a 512-bit test cases from the avx512vl test file. It ↵	Craig Topper	2016-11-06	1	-20/+0
\| \| \| \| \| \|	already exists in the avx512f test file. llvm-svn: 286069
*	[X86] Add knownbits vector or test	Simon Pilgrim	2016-11-06	1	-0/+31
\| \| \| \| \| \|	In preparation for demandedelts support llvm-svn: 286068
*	[X86] Add a few more fptoui test cases to the vec_fp_to_int.ll. The codegen ↵	Craig Topper	2016-11-06	1	-0/+135
\| \| \| \| \| \|	for these test cases will be improved for AVX512 in a future commit. llvm-svn: 286063
*	[AVX-512] Add missing EVEX version of pattern for (v2f64 (extloadv2f32 ↵	Craig Topper	2016-11-06	1	-2/+2
\| \| \| \| \| \|	addr:)) -> VCVTPS2PDZ128rm llvm-svn: 286059
*	[AVX-512] Add avx512vl command line to the fpext test and add ↵	Craig Topper	2016-11-06	1	-80/+146
\| \| \| \| \| \|	-show-mc-encoding to show where we aren't using EVEX instructions. llvm-svn: 286058
*	[AVX-512] Lower AVX cvtpd2ps intrinsic to ISD::FP_ROUND so it can use EVEX ↵	Craig Topper	2016-11-06	1	-1/+1
\| \| \| \| \| \|	instruction when available. llvm-svn: 286057
*	[AVX-512] Lower SSE/AVX cvtdq2ps intrinsics directly to ISD::SINT_TO_FP so ↵	Craig Topper	2016-11-06	2	-12/+27
\| \| \| \| \| \|	they can use EVEX instructions when available. llvm-svn: 286056
*	[AVX-512] Add -show-mc-encoding to legacy vector intrinsic tests so we can ↵	Craig Topper	2016-11-06	8	-4060/+4018
\| \| \| \| \| \|	see when VEX or EVEX encoded instructions are being emitted. Make sure the tests all have an avx2 command line and an skx command line. llvm-svn: 286055
*	[LoopStrengthReduce] Don't use a DenseSet<int64_t> when we might add any ↵	Justin Lebar	2016-11-05	1	-0/+24
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	valid int64_t to the set. Summary: SmallSetVector uses DenseSet, but that means we need to reserve some values for the empty and tombstone keys. It seems to me we should have a general way to let us store full-range ints inside of DenseSets, and furthermore that we probably shouldn't silently let you add ints into DenseSets without explicitly promising that they're in range. But that's a battle for another day; for now, just fix this code, since we currently do something Very Bad when compiling ffmpeg. Fixes PR30914. Reviewers: jeremyhu Subscribers: llvm-commits, mzolotukhin Differential Revision: https://reviews.llvm.org/D26323 llvm-svn: 286038
*	[Hexagon] Account for <def,read-undef> when validating moves for predication	Krzysztof Parzyszek	2016-11-04	1	-0/+41
\| \| \| \|	llvm-svn: 286009