bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
*	[PPC]: Peephole optimize small accesses to aligned globals.	Kyle Butt	2015-12-11	1	-0/+335
	Access to aligned globals gives us a chance to peephole optimize nonzero offsets. If a struct is 4-byte aligned, then accesses to bytes 0-3 won't overflow the available displacement. For example:
	    addis 3, 2, b4v@toc@ha
	    addi 4, 3, b4v@toc@l
	    lbz 5, b4v@toc@l(3) ; This is the result of the current peephole
	    lbz 6, 1(4)         ; optimizer
	    lbz 7, 2(4)
	    lbz 8, 3(4)
	If b4v is 4-byte aligned, we can skip using register 4 because we know that b4v@toc@l+{1,2,3} won't overflow 32K, and instead generate:
	    addis 3, 2, b4v@toc@ha
	    lbz 4, b4v@toc@l(3)
	    lbz 5, b4v@toc@l+1(3)
	    lbz 6, b4v@toc@l+2(3)
	    lbz 7, b4v@toc@l+3(3)
	This saves a register and an addition. Larger alignments allow larger structures/arrays to be optimized.
	llvm-svn: 255319
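The overflow argument in the message above can be checked with a little arithmetic. The following is only a sketch, not code from the patch: it assumes the `@toc@l` part of an address is the sign-extended low 16 bits of the TOC-relative offset, and that this offset is itself a multiple of the global's alignment. The function names are hypothetical.

```python
def low16_signed(offset):
    """Sign-extended low 16 bits of an offset (assumed model of an @l relocation)."""
    low = offset & 0xFFFF
    return low - 0x10000 if low >= 0x8000 else low


def fits_without_carry(offset, delta):
    """True if low16(offset) + delta still fits a signed 16-bit displacement."""
    return -0x8000 <= low16_signed(offset) + delta <= 0x7FFF


def safe_for_alignment(alignment):
    """Check every aligned offset in a 64K window against deltas 0..alignment-1."""
    return all(
        fits_without_carry(offset, delta)
        for offset in range(0, 0x10000, alignment)
        for delta in range(alignment)
    )


# A 4-byte aligned offset can never sit closer than 4 bytes to the 32K limit,
# so adding 1, 2 or 3 stays in range. An unaligned offset gives no such guarantee.
print(safe_for_alignment(4))           # aligned case
print(fits_without_carry(0x7FFE, 3))   # unaligned counterexample
```

Under this model the worst aligned case is a low part of 0x7FFC, where adding 3 lands exactly on 0x7FFF, which is why the extra `addi` becomes unnecessary.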
*	[X86][SSE] Update the cost table for integer-integer conversions on SSE2/SSE4.1.	Cong Hou	2015-12-11	2	-3/+356
	Previously, the conversion cost table had no entries for integer-integer conversions on SSE2, which resulted in imprecise costs for certain vectorized operations. This patch adds those entries for SSE2 and SSE4.1. The cost numbers are counted from the result of running llc on the new test case in this patch.
	Differential revision: http://reviews.llvm.org/D15132
	llvm-svn: 255315
*	Fix (bitcast (fabs x)), (bitcast (fneg x)) and (bitcast (fcopysign cst,	Eric Christopher	2015-12-10	1	-0/+103
	x)) combines for ppc_fp128, since signbit computation is more complicated.
	Discussion thread: http://lists.llvm.org/pipermail/llvm-dev/2015-November/092863.html
	Patch by Tim Shen!
	llvm-svn: 255305
*	PPC: Teach FMA mutate to respect register classes.	Kyle Butt	2015-12-10	1	-0/+89
	This was causing bad code gen and assembly that won't assemble, as mixed altivec and vsx code would end up with a vsx high register assigned to an altivec instruction, which won't work. Constraining the classes allows the optimization to proceed.
	llvm-svn: 255299
*	EarlyCSE: add tests	JF Bastien	2015-12-10	1	-10/+68
	Summary: As a follow-up to rL255054 I wasn't able to convince myself that the code did what I thought, so I wrote more tests.
	Reviewers: reames
	Subscribers: llvm-commits
	Differential Revision: http://reviews.llvm.org/D15371
	llvm-svn: 255295
*	[DAGCombiner] Fix PR25763 - vector comparison constant folding + sign-extension	Simon Pilgrim	2015-12-10	1	-0/+16
	PR25763 demonstrated an issue with D14683 - vector comparison constant folding only works for i1 results, so we need to split off the sign-extension of the result to the required type. Luckily this can be done with the existing type legalization code.
	llvm-svn: 255289
*	[DSE] Disable non-local DSE to see if the bots go green.	Chad Rosier	2015-12-10	4	-4/+4
	I see a few bots timing out, so I'm speculatively disabling r255247.
	llvm-svn: 255286
*	Fix another case where the linkage was not set.	Rafael Espindola	2015-12-10	2	-1/+12
	llvm-svn: 255272
*	[PGO] Use %t as the temporary profdata filename in the test cases.	Rong Xu	2015-12-10	10	-19/+19
	Use %t rather than %T/<specific_name> as the temporary profdata filename.
	llvm-svn: 255271
*	Fix fptosi, fptoui from f16 vectors to i8, i16 vectors	Pirama Arumuga Nainar	2015-12-10	2	-1/+95
	Summary: Convert f16 vectors to corresponding f32 vectors before doing the conversion to int. Add tests for v4f16, v8f16.
	Reviewers: ab, jmolloy
	Subscribers: llvm-commits, srhines
	Differential Revision: http://reviews.llvm.org/D14936
	llvm-svn: 255263
*	[InstCombine] fold bitcasts around an extractelement (3rd try)	Sanjay Patel	2015-12-10	1	-8/+34
	This is a redo of r255137 (reverted at r255227) which was a redo of r255124 (reverted at r255126) with a fixed check for a scalar source type and an added test for the failure that caused the revert.
	Original commit message:
	Example:
	    bitcast (extractelement (bitcast <2 x float> %X to <2 x i32>), 1) to float
	    --->
	    extractelement <2 x float> %X, i32 1
	This is part of fixing PR25543:
	https://llvm.org/bugs/show_bug.cgi?id=25543
	The next step will be to generalize this fold:
	    trunc ( lshr ( bitcast X) ) -> extractelement (X)
	I.e., I'm hoping to replace the existing transform of:
	    bitcast ( trunc ( lshr ( bitcast X)))
	added by:
	http://reviews.llvm.org/rL112232
	with 2 less specific transforms to catch the case in the bug report.
	Differential Revision: http://reviews.llvm.org/D14879
	llvm-svn: 255261
*	[WebAssembly] Tighten up several CHECK tests.	Dan Gohman	2015-12-10	4	-18/+18
	llvm-svn: 255255
*	Split lib/Linker in two.	Rafael Espindola	2015-12-10	2	-8/+29
	A linker normally has two stages: symbol resolution and "moving stuff".
	In lib/Linker there is the complication of lazy linking some globals, but it was still far more mixed than it needed to be.
	This splits the linker into a lower level IRMover and the linker proper. The IRMover just takes a list of globals to move and a callback that lets the user control what is lazy linked.
	The main motivation is that now tools/gold (and soon lld) can use their own symbol resolution to instruct IRMover what to do.
	llvm-svn: 255254
*	[DeadStoreElimination] Add support for non-local DSE.	Chad Rosier	2015-12-10	5	-0/+151
	We extend the search for redundant stores to predecessor blocks that unconditionally lead to the block BB with the current store instruction. That also includes single-block loops that unconditionally lead to BB, and if-then-else blocks where then- and else-blocks unconditionally lead to BB.
	http://reviews.llvm.org/D13363
	Patch by Ivan Baev <ibaev@codeaurora.org>!
	llvm-svn: 255247
*	Bitcasts between FP and INT values using direct moves	Nemanja Ivanovic	2015-12-10	2	-2/+116
	This patch corresponds to review:
	http://reviews.llvm.org/D15286
	LLVM IR frequently contains bitcast operations between floating point and integer values of the same width. Doing this through memory operations is quite expensive on PPC. This patch allows the use of direct register moves between FPRs and GPRs for lowering bitcasts.
	llvm-svn: 255246
*	Macro debug info support in LLVM IR	Amjad Aboud	2015-12-10	2	-12/+22
	Introduced DIMacro and DIMacroFile debug info metadata in the LLVM IR to support macros.
	Differential Revision: http://reviews.llvm.org/D14687
	llvm-svn: 255245
*	Revert r255137.	Akira Hatanaka	2015-12-10	1	-20/+8
	This commit broke Apple's internal bot.
	llvm-svn: 255227
*	[WebAssembly] Implement mixed-type ISD::FCOPYSIGN.	Dan Gohman	2015-12-10	1	-0/+28
	ISD::FCOPYSIGN permits its operands to have differing types, and DAGCombiner uses this. Add some def : Pat rules to expand this out into an explicit conversion and a normal copysign operation.
	llvm-svn: 255220
*	[WebAssembly] Implement fma.	Dan Gohman	2015-12-10	2	-0/+18
	It is lowered to a libcall for now, but this is expected to change in the future.
	llvm-svn: 255219
*	AMDGPU/SI: Emit constant arrays in the .text section	Tom Stellard	2015-12-10	1	-0/+25
	Summary: This allows us to remove the END_OF_TEXT_LABEL hack we had been using and simplifies the fixups used to compute the address of constant arrays.
	Reviewers: arsenm
	Subscribers: arsenm, llvm-commits
	Differential Revision: http://reviews.llvm.org/D15257
	llvm-svn: 255204
*	AMDGPU/SI: Add support for sgpr and vgpr inline assembly constraints	Tom Stellard	2015-12-10	1	-0/+23
	Summary: The 's' constraint represents sgprs and the 'v' constraint represents vgprs.
	Reviewers: arsenm, echristo
	Subscribers: arsenm, llvm-commits
	Differential Revision: http://reviews.llvm.org/D15342
	llvm-svn: 255203
*	[WebAssembly] Fix legalization of f32->f64 EXTLOAD.	Dan Gohman	2015-12-10	1	-0/+20
	llvm-svn: 255202
*	[WebAssembly] Also legalize sign_extend_inreg of i32->i64.	Dan Gohman	2015-12-10	1	-0/+9
	llvm-svn: 255191
*	PeepholeOptimizer: Ignore dead implicit defs	Dan Gohman	2015-12-10	1	-0/+28
	Target-specific instructions may have uninteresting physreg clobbers, for target-specific reasons. The peephole pass doesn't need to concern itself with such defs, as long as they're implicit and marked as dead.
	llvm-svn: 255182
*	[WebAssembly] Fix legalization of shift operators with illegal types.	Dan Gohman	2015-12-10	1	-0/+24
	llvm-svn: 255181
*	[WebAssembly] Implement anyext.	Dan Gohman	2015-12-10	1	-0/+11
	llvm-svn: 255179
*	[X86] Enable shrink-wrapping by default, but keep it disabled for stack frames	Quentin Colombet	2015-12-09	3	-6/+159
	without a frame pointer when unwind may happen.
	This is a workaround for a bug in the way we emit the CFI directives for frameless unwind information. See PR25614.
	llvm-svn: 255175
*	Synchronize the logic for deciding to link a gv.	Rafael Espindola	2015-12-09	2	-0/+14
	We were deciding to not link an available_externally gv over a declaration, but then copying over the body anyway.
	llvm-svn: 255169
*	[PGO] Rename the profdata filename to avoid the conflict b/w tests.	Rong Xu	2015-12-09	1	-2/+2
	Two tests, diag_mismatch.ll and diag_no_funcprofdata.ll, generate the same profdata filename, which can conflict in concurrent test runs. This patch renames them to have different names.
	llvm-svn: 255158
*	IR: Make ConstantDataArray::getFP actually return a ConstantDataArray	Justin Bogner	2015-12-09	1	-0/+10
	The ConstantDataArray::getFP(LLVMContext &, ArrayRef<uint16_t>) overload has had a typo in it since it was written, where it will create a Vector instead of an Array. This obviously doesn't work at all, but it turns out that until r254991 there weren't actually any callers of this overload. Fix the typo and add some test coverage.
	llvm-svn: 255157
*	[Float2Int] Don't operate on vector instructions	Reid Kleckner	2015-12-09	1	-0/+10
	This fixes a crash bug. It's also not clear if we'd want to do this transform for vectors.
	llvm-svn: 255155
*	Use WeakVH to keep track of calls with operand bundles in CloneCodeInfo	Sanjoy Das	2015-12-09	1	-0/+31
	`CloneAndPruneIntoFromInst` can DCE instructions after cloning them into the new function, and so an AssertingVH is too strong. This change switches CloneCodeInfo to use a std::vector<WeakVH>.
	llvm-svn: 255148
*	[InstCombine] fold bitcasts around an extractelement (2nd try)	Sanjay Patel	2015-12-09	1	-8/+20
	This is a redo of r255124 (reverted at r255126) with an added check for a scalar destination type and an added test for the failure seen in Clang's test/CodeGen/vector.c. The extra test shows a different missing optimization.
	Original commit message:
	Example:
	    bitcast (extractelement (bitcast <2 x float> %X to <2 x i32>), 1) to float
	    --->
	    extractelement <2 x float> %X, i32 1
	This is part of fixing PR25543:
	https://llvm.org/bugs/show_bug.cgi?id=25543
	The next step will be to generalize this fold:
	    trunc ( lshr ( bitcast X) ) -> extractelement (X)
	I.e., I'm hoping to replace the existing transform of:
	    bitcast ( trunc ( lshr ( bitcast X)))
	added by:
	http://reviews.llvm.org/rL112232
	with 2 less specific transforms to catch the case in the bug report.
	Differential Revision: http://reviews.llvm.org/D14879
	llvm-svn: 255137
*	[PGO] Resubmit "MST based PGO instrumentation infrastructure" (r254021)	Rong Xu	2015-12-09	19	-0/+572
	This new patch fixes a few bugs that were exposed in the last submission. It also improves the test cases.
	--Original Commit Message--
	This patch implements a minimum spanning tree (MST) based instrumentation for PGO. The use of MST guarantees that a minimum number of CFG edges get instrumented. An additional optimization is to instrument the less executed edges to further reduce the instrumentation overhead. The patch contains both the instrumentation and the use of the profile to set the branch weights.
	Differential Revision: http://reviews.llvm.org/D12781
	llvm-svn: 255132
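The MST idea described in this message can be illustrated with a small sketch. This is not the patch's implementation: it is a Kruskal-style pass over a toy CFG, with hypothetical estimated edge weights, that keeps the heaviest edges in the spanning tree and instruments only the edges left out. The function name and the example CFG are assumptions made for illustration.

```python
def choose_instrumented_edges(num_blocks, edges):
    """edges: list of (src, dst, estimated_weight) tuples.

    Returns the edges left out of the spanning tree; only those need counters.
    """
    parent = list(range(num_blocks))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    instrumented = []
    # Visit heavy edges first so they join the tree and stay uninstrumented.
    for src, dst, weight in sorted(edges, key=lambda e: e[2], reverse=True):
        a, b = find(src), find(dst)
        if a == b:
            # This edge would close a cycle, so it gets a counter.
            instrumented.append((src, dst, weight))
        else:
            parent[a] = b
    return instrumented


# Toy diamond CFG: block 0 branches to 1 (hot) or 2 (cold), both reach block 3.
diamond = [(0, 1, 90), (1, 3, 90), (0, 2, 10), (2, 3, 10)]
print(choose_instrumented_edges(4, diamond))
```

For a connected CFG with E edges and V blocks, the tree holds V - 1 edges, so E - V + 1 edges carry counters. The counts of the tree edges can then be recovered from flow conservation at each block, which is why the cold edges are the cheapest place to put the counters.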
*	Revert "[InstCombine] fold bitcasts around an extractelement"	Mehdi Amini	2015-12-09	1	-5/+8
	This reverts commit r255124.
	Broke http://lab.llvm.org:8011/builders/llvm-clang-lld-x86_64-scei-ps4-ubuntu-fast/builds/4193/steps/test/logs/stdio
	From: Mehdi Amini <mehdi.amini@apple.com>
	llvm-svn: 255126
*	[WebAssembly] Reintroduce ARGUMENT moving logic	Dan Gohman	2015-12-09	6	-7/+7
	Reintroduce the code for moving ARGUMENTS back to the top of the basic block. While the ARGUMENTS physical register prevents sinking and scheduling from moving them, it does not appear to be sufficient to prevent SelectionDAG from moving them down in the initial schedule. This patch introduces a pass that moves them back to the top immediately after SelectionDAG runs.
	This is still hopefully a temporary solution. http://reviews.llvm.org/D14750 is one alternative, though the review has not been favorable, and proposed alternatives are longer-term and have other downsides.
	This fixes the main outstanding -verify-machineinstrs failures, so it adds -verify-machineinstrs to several tests.
	Differential Revision: http://reviews.llvm.org/D15377
	llvm-svn: 255125
*	[InstCombine] fold bitcasts around an extractelement	Sanjay Patel	2015-12-09	1	-8/+5
	Example:
	    bitcast (extractelement (bitcast <2 x float> %X to <2 x i32>), 1) to float
	    --->
	    extractelement <2 x float> %X, i32 1
	This is part of fixing PR25543:
	https://llvm.org/bugs/show_bug.cgi?id=25543
	The next step will be to generalize this fold:
	    trunc ( lshr ( bitcast X) ) -> extractelement (X)
	I.e., I'm hoping to replace the existing transform of:
	    bitcast ( trunc ( lshr ( bitcast X)))
	added by:
	http://reviews.llvm.org/rL112232
	with 2 less specific transforms to catch the case in the bug report.
	Differential Revision: http://reviews.llvm.org/D14879
	llvm-svn: 255124
*	Change hasUniqueInitializer() to call isStrongDefinitionForLinker() instead of !isWeakForLinker()	Mehdi Amini	2015-12-09	1	-0/+22
	Summary: Available_externally global variables with an initializer were considered "hasInitializer()", while obviously they can't match the description:
	    Whether the global variable has an initializer, and any changes made to the initializer will turn up in the final executable.
	since modifying the initializer of an externally available variable does not make sense.
	Reviewers: pcc, rafael
	Subscribers: llvm-commits
	Differential Revision: http://reviews.llvm.org/D15351
	From: Mehdi Amini <mehdi.amini@apple.com>
	llvm-svn: 255123
*	ARM: don't use a deleted node as the BaseReg in complex pattern.	Tim Northover	2015-12-09	1	-0/+15
	We mutated the DAG, which invalidated the node we were trying to use as a base register. Sometimes we got away with it, but other times the node really did get deleted before it was finished with.
	Should fix PR25733
	llvm-svn: 255120
*	Fix cycle in selection DAG introduced by extractelement legalization	Robert Lougher	2015-12-09	1	-0/+21
	During selection DAG legalization, extractelement is replaced with a load instruction. To do this, a temporary store to the stack is used unless an existing store is found that can be re-used.
	If re-using a store, the chain going out of the store must be replaced by the one going out of the new load (this ensures that any stores that must take place after the store happen after the load, else the value might be overwritten before it is loaded).
	The problem is, if the extractelement index is dependent on the store, replacing the chain will introduce a cycle in the selection DAG (the load uses the index, and by replacing the chain we will make the index dependent on the load).
	To fix this, if the index is dependent on the store, the store is skipped. This is conservative as we may end up creating an unnecessary extra store to the stack. However, the situation is not expected to occur very often.
	Differential Revision: http://reviews.llvm.org/D15330
	llvm-svn: 255114
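The dependence test this message relies on amounts to a reachability check over operand edges. The sketch below models it on a toy graph and is not the SelectionDAG code: the node names, the `operands` mapping and both function names are hypothetical.

```python
def depends_on(node, target, operands):
    """True if `target` is reachable from `node` by following operand edges."""
    stack, seen = [node], set()
    while stack:
        current = stack.pop()
        if current == target:
            return True
        if current in seen:
            continue
        seen.add(current)
        stack.extend(operands.get(current, ()))
    return False


def can_reuse_store(index, store, operands):
    """Re-using the store is only safe when the index does not depend on it."""
    return not depends_on(index, store, operands)


# Toy DAG: the index is computed from a value loaded after the store,
# so the index (transitively) depends on the store.
operands = {
    "store": ["vec", "ptr"],
    "load_idx": ["store"],
    "index": ["load_idx"],
}
print(can_reuse_store("index", "store", operands))      # dependent index
print(can_reuse_store("const_idx", "store", operands))  # independent index
```

When the check fails, falling back to a fresh stack store costs an extra store but keeps the graph acyclic, which matches the conservative choice the message describes.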
*	[AArch64] Fix FP16 vector instructions that should only accept low registers	Oliver Stannard	2015-12-09	1	-0/+40
	llvm-svn: 255113
*	[mips][ias] Range check uimm10 operands	Daniel Sanders	2015-12-09	5	-15/+26
	Summary:
	Reviewers: vkalintiris
	Subscribers: dsanders, llvm-commits
	Differential Revision: http://reviews.llvm.org/D15229
	llvm-svn: 255112
*	Revert r254897 "[mips][microMIPS] Implement LH, LHE, LHU and LHUE instructions"	Zlatko Buljan	2015-12-09	9	-76/+0
	The committed patch was intended to implement the LH, LHE, LHU and LHUE instructions. After the commit, the test-suite failed with an error message of the form:
	    fatal error: error in backend: Cannot select: t124: i32,ch = load<LD2[%d](tbaa=<0x94acc48>), sext from i16> t0, t2, undef:i32
	For that reason I decided to revert commit r254897 and make a new patch which, besides the implementation and standard regression tests, will also have dedicated tests (CodeGen) for the above error.
	llvm-svn: 255109
*	Revert "Implement a new pass - LiveDebugValues - to compute the set of live DEBUG_VALUEs at each basic block and insert them. Reviewed and accepted at: http://reviews.llvm.org/D11933"	Mehdi Amini	2015-12-09	8	-723/+5
	This reverts commit r255096.
	Broke the bots: http://lab.llvm.org:8080/green/job/clang-stage1-cmake-RA-incremental_check/16378/
	From: Mehdi Amini <mehdi.amini@apple.com>
	llvm-svn: 255101
*	Implement a new pass - LiveDebugValues - to compute the set of live DEBUG_VALUEs at each basic block and insert them. Reviewed and accepted at: http://reviews.llvm.org/D11933	Vikram TV	2015-12-09	8	-5/+723
	llvm-svn: 255096
*	[AArch64][ARM] Don't base interleaved op legality on type alloc size.	Ahmed Bougacha	2015-12-09	2	-2/+78
	Otherwise, we think that most types that look like they'd fit in a legal vector type are legal (so, basically, any vector type with a size between 33 and 128 bits, I think, since we use pow2 alignment; e.g., v2i25, v3f32, ...).
	DataLayout::getTypeAllocSize rounds up based on alignment. When checking for target intrinsic legality, that's not what we want: if rounding makes a difference, the type isn't legal, and the target intrinsics shouldn't be used, as they are always assumed legal.
	One could make the argument that alloc size is ultimately the most relevant here, since we're dealing with LD/ST intrinsics. That's only true if we did legalize them though; that's a problem for another day.
	Use DataLayout::getTypeSizeInBits instead of getTypeAllocSizeInBits. Type::getSizeInBits can't be used because that'd gratuitously break pointer vector support.
	Some of these uses are currently fine, because we only hit them when the type is already known legal (e.g., r114454). Update them for consistency. It's faster to avoid the rounding anyway!
	llvm-svn: 255089
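The difference between the two size queries can be seen with a rough numeric model. This sketch does not call LLVM: it assumes, as a simplification of the "pow2 alignment" remark above, that the alloc size is the byte size rounded up to the next power of two, and it assumes 64 and 128 bits as the legal vector widths. All function names are hypothetical.

```python
LEGAL_VECTOR_BITS = (64, 128)  # assumed legal widths for this illustration


def size_in_bits(num_elements, element_bits):
    """Exact bit width of the vector, with no rounding."""
    return num_elements * element_bits


def alloc_size_in_bits(num_elements, element_bits):
    """Simplified model: round up to whole bytes, then to a power of two."""
    store_bytes = (size_in_bits(num_elements, element_bits) + 7) // 8
    alloc_bytes = 1 << (store_bytes - 1).bit_length()
    return alloc_bytes * 8


def looks_legal_by_alloc_size(num_elements, element_bits):
    return alloc_size_in_bits(num_elements, element_bits) in LEGAL_VECTOR_BITS


def looks_legal_by_exact_size(num_elements, element_bits):
    return size_in_bits(num_elements, element_bits) in LEGAL_VECTOR_BITS


# v3f32 is 96 bits wide but rounds up to 128 bits of allocation;
# v2i25 is 50 bits wide but rounds up to 64 bits of allocation.
for name, n, bits in [("v3f32", 3, 32), ("v2i25", 2, 25), ("v4f32", 4, 32)]:
    print(name, looks_legal_by_alloc_size(n, bits), looks_legal_by_exact_size(n, bits))
```

In this model the rounded size makes v3f32 and v2i25 look legal, while the exact size accepts only v4f32, which is the distinction the message draws.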
*	Don't drop attributes when inlining through "deopt" operand bundles	Sanjoy Das	2015-12-09	1	-0/+39
	Test case attached (test case also checks that we don't drop the calling convention, but that functionality was correct before this patch).
	llvm-svn: 255088
*	X86-FMA3: Defined the ExeDomain property for Scalar FMA3 opcodes.	Vyacheslav Klochkov	2015-12-09	4	-67/+67
	Reviewer: Simon Pilgrim.
	Differential Revision: http://reviews.llvm.org/D15317
	llvm-svn: 255080
*	[OperandBundles] Have PruneEH work correctly with operand bundles.	Sanjoy Das	2015-12-08	1	-0/+26
	For an invoke with operand bundles, the [op_begin(), op_end()-3] range can contain things other than invoke arguments. This change teaches PruneEH to use arg_begin() and arg_end() explicitly.
	llvm-svn: 255073
*	Define selection for v4f16, v8f16 scalar_to_vector	Pirama Arumuga Nainar	2015-12-08	2	-0/+18
	Summary: This fixes failure when trying to select
	    insertelement <4 x half> undef, half %a, i64 0
	which gets transformed to a scalar_to_vector node. The accompanying v4 and v8 tests fail instruction selection without this patch.
	Reviewers: ab, jmolloy
	Subscribers: srhines, llvm-commits
	Differential Revision: http://reviews.llvm.org/D15322
	llvm-svn: 255072