bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message (Collapse)	Author	Age	Files	Lines
*	[IR] Reformulate LLVM's EH funclet IR	David Majnemer	2015-12-12	59	-3151/+1257
\|	While we have successfully implemented a funclet-oriented EH scheme on top of LLVM IR, our scheme has some notable deficiencies:
\|	- catchendpad and cleanupendpad are necessary in the current design but they are difficult to explain to others, even to seasoned LLVM experts.
\|	- catchendpad and cleanupendpad are optimization barriers. They cannot be split and force all potentially throwing call-sites to be invokes. This has a noticeable effect on the quality of our code generation.
\|	- catchpad, while similar in some aspects to invoke, is fairly awkward. It is unsplittable, starts a funclet, and has control flow to other funclets.
\|	- The nesting relationship between funclets is currently a property of control flow edges. Because of this, we are forced to carefully analyze the flow graph to see if there might potentially exist illegal nesting among funclets. While we have logic to clone funclets when they are illegally nested, it would be nicer if we had a representation which forbade them upfront.
\|	Let's clean this up a bit by doing the following:
\|	- Make catchpad more like cleanuppad and landingpad: no control flow, just a bunch of simple operands; catchpad would be splittable.
\|	- Introduce catchswitch, a control flow instruction designed to model the constraints of funclet oriented EH.
\|	- Make funclet scoping explicit by having funclet instructions consume the token produced by the funclet which contains them.
\|	- Remove catchendpad and cleanupendpad. Their presence can be inferred implicitly using coloring information.
\|	N.B. The state numbering code for the CLR has been updated but the veracity of its output cannot be spoken for. An expert should take a look to make sure the results are reasonable.
\|	Reviewers: rnk, JosephTremoulet, andrew.w.kaylor
\|	Differential Revision: http://reviews.llvm.org/D15139
\|	llvm-svn: 255422
*	[X86ISelLowering] Add additional support for multiplication-to-shift conversion.	Chen Li	2015-12-12	1	-0/+45
\| \| \| \| \| \| \| \| \| \| \| \|	Summary: This patch adds support for the conversions (mul x, 2^N + 1) => (add (shl x, N), x) and (mul x, 2^N - 1) => (sub (shl x, N), x) if the multiplication cannot be converted to LEA + SHL or LEA + LEA. LLVM already supports this on ARM, and it should also be useful on X86. Note the patch currently only applies to cases where the constant operand is positive, and I am planning to add another patch to support negative cases after this. Reviewers: craig.topper, RKSimon Subscribers: aemerson, llvm-commits Differential Revision: http://reviews.llvm.org/D14603 llvm-svn: 255415
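The arithmetic identity behind this patch can be illustrated outside of LLVM. The following is a minimal sketch in Python, not the X86ISelLowering code: the function names are hypothetical, and the patch's additional condition (that the multiply cannot already be lowered to LEA + SHL or LEA + LEA) is not modelled here.

```python
def is_power_of_two(v: int) -> bool:
    """True if v is 2^k for some k >= 0."""
    return v > 0 and (v & (v - 1)) == 0


def mul_by_shift(x: int, c: int):
    """Rewrite x * c as a shift plus one add/sub when c is 2^N + 1 or 2^N - 1.

    Returns None when the constant does not have either form.
    Only positive constants are handled, mirroring the commit message.
    """
    if c <= 0:
        raise ValueError("only positive constants are handled in this sketch")
    if is_power_of_two(c - 1):          # c == 2^N + 1
        n = (c - 1).bit_length() - 1
        return (x << n) + x             # (add (shl x, N), x)
    if is_power_of_two(c + 1):          # c == 2^N - 1
        n = (c + 1).bit_length() - 1
        return (x << n) - x             # (sub (shl x, N), x)
    return None


if __name__ == "__main__":
    for c in (7, 9, 10, 17, 31):
        print(c, mul_by_shift(5, c))
```

Constants such as 10 fall through to `None` because neither 9 nor 11 is a power of two; a real lowering would leave those multiplies to other patterns.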
*	Fix test/CodeGen/PowerPC/ppc-shrink-wrapping.ll after r255398	Hal Finkel	2015-12-12	1	-1/+1
\| \| \| \|	llvm-svn: 255414
*	[InstCombine] allow any pair of bitcasts to be combined	Sanjay Patel	2015-12-12	1	-12/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This change is discussed in D15392 and should allow us to effectively revert: http://llvm.org/viewvc/llvm-project?view=revision&revision=255261 if we canonicalize bitcasts ahead of extracts. It should be safe to convert any pair of bitcasts into a single bitcast, however, it was mentioned here: http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20110829/127089.html that we're not allowed to bitcast from an x86_mmx to some other types, but I'm not seeing any failures from that, and we have regression tests in CodeGen/X86 that appear to cover all of those cases. Some day we'll get to remove that MMX wart from LLVM IR completely? Differential Revision: http://reviews.llvm.org/D15468 llvm-svn: 255399
*	[PowerPC] Add Branch Hints for Highly-Biased Branches	Hal Finkel	2015-12-12	1	-0/+135
\| \| \| \| \| \| \| \| \| \| \|	This branch adds hints for highly biased branches on the PPC architecture. Even in absence of profiling information, LLVM will mark code reaching unreachable terminators and other exceptional control flow constructs as highly unlikely to be reached. Patch by Tom Jablin! llvm-svn: 255398
*	Revert rL255391: [X86ISelLowering] Add additional support for ↵	Chen Li	2015-12-12	1	-45/+0
\| \| \| \| \| \| \| \|	multiplication-to-shift conversion. because it broke buildbot. llvm-svn: 255395
*	use FileCheck for better checking	Sanjay Patel	2015-12-12	1	-3/+22
\| \| \| \|	llvm-svn: 255394
*	[WebAssembly] Implement prolog/epilog insertion and FrameIndex elimination	Derek Schuff	2015-12-11	1	-0/+70
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Use the SP32 physical register as the base for FrameIndex lowering. Update it and the __stack_pointer global var in the prolog and epilog. Extend the mapping of virtual registers to wasm locals to include the physical registers. Rather than modify the target-independent PrologEpilogInserter (which asserts that there are no virtual registers left) include a slightly-modified copy for Wasm that does not have this assertion and only clears the virtual registers if scavenging was needed (which of course it isn't for wasm). Differential Revision: http://reviews.llvm.org/D15344 llvm-svn: 255392
*	[X86ISelLowering] Add additional support for multiplication-to-shift conversion.	Chen Li	2015-12-11	1	-0/+45
\| \| \| \| \| \| \| \| \| \| \| \|	Summary: This patch adds support for the conversions (mul x, 2^N + 1) => (add (shl x, N), x) and (mul x, 2^N - 1) => (sub (shl x, N), x) if the multiplication cannot be converted to LEA + SHL or LEA + LEA. LLVM already supports this on ARM, and it should also be useful on X86. Note the patch currently only applies to cases where the constant operand is positive, and I am planning to add another patch to support negative cases after this. Reviewers: craig.topper, RKSimon Subscribers: aemerson, llvm-commits Differential Revision: http://reviews.llvm.org/D14603 llvm-svn: 255391
*	SelectionDAG: Match min/max if the scalar operation is legal	Matt Arsenault	2015-12-11	5	-77/+319
\| \| \| \|	llvm-svn: 255388
*	Revert r248483, r242546, r242545, and r242409 - absdiff intrinsics	Hal Finkel	2015-12-11	2	-210/+0
\|	After much discussion, ending here: http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151123/315620.html
\|	it has been decided that, instead of having the vectorizer directly generate special absdiff and horizontal-add intrinsics, we'll recognize the relevant reduction patterns during CodeGen. Accordingly, these intrinsics are not needed (the operations they represent can be pattern matched, as is already done in some backends). Thus, we're backing these out in favor of the current development work.
\|	r248483 - Codegen: Fix llvm.*absdiff semantic.
\|	r242546 - [ARM] Use [SU]ABSDIFF nodes instead of intrinsics for VABD/VABA
\|	r242545 - [AArch64] Use [SU]ABSDIFF nodes instead of intrinsics for ABD/ABA
\|	r242409 - [Codegen] Add intrinsics 'absdiff' and corresponding SDNodes for absolute difference operation
\|	llvm-svn: 255387
*	Add tests for bitcast-bitcast sequences for all scalar/vector permutations	Sanjay Patel	2015-12-11	1	-0/+90
\| \| \| \| \| \|	As noted in http://reviews.llvm.org/D15392 , we should be able to improve this. llvm-svn: 255370
*	[PGO] Revert r255365: solution incomplete, not handling lambda yet	Xinliang David Li	2015-12-11	4	-10/+3
\| \| \| \|	llvm-svn: 255369
*	[PGO] Stop using invalid char in instr variable names.	Xinliang David Li	2015-12-11	4	-3/+10
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Before the patch, -fprofile-instr-generate compile will fail if no integrated-as is specified when the file contains any static functions (the -S output is also invalid). This patch fixed the issue. With the change, the index format version will be bumped up by 1. Backward compatibility is preserved with this change. Differential Revision: http://reviews.llvm.org/D15243 llvm-svn: 255365
*	CodeGen: Redo analyzePhysRegs() and computeRegisterLiveness()	Matthias Braun	2015-12-11	2	-47/+10
\|	computeRegisterLiveness() was broken in that it reported dead for a register even if a subregister was alive. I assume this was because the results of analyzePhysRegs() are hard to understand with respect to subregisters.
\|	This commit:
\|	Changes the results of analyzePhysRegs (=struct PhysRegInfo) to be clearly understandable, and also renames the fields to avoid silent breakage of third-party code (and improve the grammar).
\|	Fixes all (two) users of computeRegisterLiveness() in llvm by reenabling it and removing workarounds for the bug.
\|	This fixes http://llvm.org/PR24535 and http://llvm.org/PR25033
\|	Differential Revision: http://reviews.llvm.org/D15320
\|	llvm-svn: 255362
*	Revert r255247, r255265, and r255286 due to serious compile-time regressions.	Chad Rosier	2015-12-11	5	-151/+0
\|	Revert "[DSE] Disable non-local DSE to see if the bots go green."
\|	Revert "[DeadStoreElimination] Use range-based loops. NFC."
\|	Revert "[DeadStoreElimination] Add support for non-local DSE."
\|	llvm-svn: 255354
*	[dsymutil] Ignore absolute symbols in the debug map	Frederic Riss	2015-12-11	3	-0/+16
\|	Quoting from the comment added to the code:
\|	// Objective-C on i386 uses artificial absolute symbols to
\|	// perform some link time checks. Those symbols have a fixed 0
\|	// address that might conflict with real symbols in the object
\|	// file. As I cannot see a way for absolute symbols to find
\|	// their way into the debug information, let's just ignore those.
\|	llvm-svn: 255350
*	[Mem2Reg] Respect optnone	James Molloy	2015-12-11	1	-0/+21
\| \| \| \| \| \| \| \|	Mem2Reg shouldn't be optimizing a function that is marked optnone. There is a test checking this that fails when mem2reg is explicitly added to the standard pass pipeline. llvm-svn: 255336
*	[InstCombine] Make MatchBSwap also match bit reversals	James Molloy	2015-12-11	1	-0/+114
\| \| \| \| \| \|	MatchBSwap has most of the functionality to match bit reversals already. If we switch it from looking at bytes to individual bits and remove a few early exits, we can extend the main recursive function to match any sequence of ORs, ANDs and shifts that assemble a value from different parts of another, base value. Once we have this bit->bit mapping, we can very simply detect if it is appropriate for a bswap or bitreverse. llvm-svn: 255334
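The final classification step described above (once a bit->bit mapping is known, decide whether it is a bswap or a bitreverse) can be sketched as follows. This is an illustrative Python sketch only: the actual InstCombine code derives the mapping by recursively walking OR/AND/shift instructions, whereas the hypothetical `mapping_of` helper here simply probes a Python function one input bit at a time. All names are assumptions made for this example.

```python
def mapping_of(f, width):
    """prov[i] = index of the source bit that ends up in result bit i, or None.

    Probes f with one-hot inputs; this only makes sense for functions
    that purely move bits around (ORs, ANDs with masks, shifts).
    """
    mask = (1 << width) - 1
    prov = [None] * width
    for src in range(width):
        out = f(1 << src) & mask
        if out and (out & (out - 1)) == 0:   # exactly one result bit set
            prov[out.bit_length() - 1] = src
    return prov


def classify(prov):
    """Return 'bitreverse', 'bswap', or None for a bit provenance list."""
    width = len(prov)
    if any(p is None for p in prov):
        return None
    if all(prov[i] == width - 1 - i for i in range(width)):
        return "bitreverse"
    nbytes = width // 8
    # bswap is only considered for widths that are a multiple of 16 bits.
    if width % 16 == 0 and all(
        prov[i] == (nbytes - 1 - i // 8) * 8 + i % 8 for i in range(width)
    ):
        return "bswap"
    return None


def swap16(x):
    """Hand-written 16-bit byte swap built from shifts, ANDs and an OR."""
    return ((x << 8) & 0xFF00) | ((x >> 8) & 0x00FF)


def rev8(x):
    """Hand-written 8-bit bit reversal."""
    r = 0
    for i in range(8):
        r |= ((x >> i) & 1) << (7 - i)
    return r


if __name__ == "__main__":
    print(classify(mapping_of(swap16, 16)))
    print(classify(mapping_of(rev8, 8)))
```

Working at bit granularity is what lets one mapping serve both checks; a byte-granular mapping could only ever recognise the bswap case.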
*	[PPC]: Peephole optimize small accesses to aligned globals.	Kyle Butt	2015-12-11	1	-0/+335
\|	Access to aligned globals gives us a chance to peephole optimize nonzero offsets. If a struct is 4 byte aligned, then accesses to bytes 0-3 won't overflow the available displacement. For example:
\|	addis 3, 2, b4v@toc@ha
\|	addi 4, 3, b4v@toc@l
\|	lbz 5, b4v@toc@l(3) ; This is the result of the current peephole
\|	lbz 6, 1(4) ; optimizer
\|	lbz 7, 2(4)
\|	lbz 8, 3(4)
\|	If b4v is 4-byte aligned, we can skip using register 4 because we know that b4v@toc@l+{1,2,3} won't overflow 32K, and instead generate:
\|	addis 3, 2, b4v@toc@ha
\|	lbz 4, b4v@toc@l(3)
\|	lbz 5, b4v@toc@l+1(3)
\|	lbz 6, b4v@toc@l+2(3)
\|	lbz 7, b4v@toc@l+3(3)
\|	Saving a register and an addition. Larger alignments allow larger structures/arrays to be optimized.
\|	llvm-svn: 255319
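The "won't overflow 32K" claim follows from simple arithmetic on the low 16 bits of the address. The sketch below, in Python, assumes that the @l part is the sign-extended low 16 bits of the TOC-relative offset and that the displacement field is a signed 16-bit value; the function names are hypothetical and nothing here is taken from the PowerPC backend itself.

```python
def low16_signed(addr: int) -> int:
    """Sign-extended low 16 bits of an address (the assumed '@l' part)."""
    lo = addr & 0xFFFF
    return lo - 0x10000 if lo >= 0x8000 else lo


def fits_displacement(value: int) -> bool:
    """True if value fits in a signed 16-bit displacement field."""
    return -0x8000 <= value <= 0x7FFF


def can_fold_offset(addr: int, offset: int) -> bool:
    """True if sym@l + offset can be used directly as the displacement."""
    return fits_displacement(low16_signed(addr) + offset)


if __name__ == "__main__":
    # For every 4-byte-aligned address, the low part is a multiple of 4,
    # so it is at most 32764 and adding 0..3 stays within 32767.
    aligned_ok = all(
        can_fold_offset(addr, k)
        for addr in range(0, 1 << 17, 4)
        for k in range(4)
    )
    print("4-byte aligned, offsets 0-3 always fold:", aligned_ok)
    # An unaligned symbol can sit right below the limit and overflow.
    print("unaligned 0x7FFF + 1 folds:", can_fold_offset(0x7FFF, 1))
```

The same reasoning generalises: with alignment A (a power of two up to 32768), offsets 0 through A - 1 cannot carry out of the low part, which is why larger alignments allow larger structures to be optimized.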
*	[X86][SSE] Update the cost table for integer-integer conversions on SSE2/SSE4.1.	Cong Hou	2015-12-11	2	-3/+356
\| \| \| \| \| \| \| \| \| \| \| \|	Previously in the conversion cost table there are no entries for integer-integer conversions on SSE2. This will result in imprecise costs for certain vectorized operations. This patch adds those entries for SSE2 and SSE4.1. The cost numbers are counted from the result of running llc on the new test case in this patch. Differential revision: http://reviews.llvm.org/D15132 llvm-svn: 255315
*	Fix (bitcast (fabs x)), (bitcast (fneg x)) and (bitcast (fcopysign cst,	Eric Christopher	2015-12-10	1	-0/+103
\| \| \| \| \| \| \| \| \| \| \| \|	x)) combines for ppc_fp128, since signbit computation is more complicated. Discussion thread: http://lists.llvm.org/pipermail/llvm-dev/2015-November/092863.html Patch by Tim Shen! llvm-svn: 255305
*	PPC: Teach FMA mutate to respect register classes.	Kyle Butt	2015-12-10	1	-0/+89
\| \| \| \| \| \| \| \| \|	This was causing bad code gen and assembly that won't assemble, as mixed altivec and vsx code would end up with a vsx high register assigned to an altivec instruction, which won't work. Constraining the classes allows the optimization to proceed. llvm-svn: 255299
*	EarlyCSE: add tests	JF Bastien	2015-12-10	1	-10/+68
\| \| \| \| \| \| \| \| \| \| \| \|	Summary: As a follow-up to rL255054 I wasn't able to convince myself that the code did what I thought, so I wrote more tests. Reviewers: reames Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D15371 llvm-svn: 255295
*	[DAGCombiner] Fix PR25763 - vector comparison constant folding + sign-extension	Simon Pilgrim	2015-12-10	1	-0/+16
\| \| \| \| \| \|	PR25763 demonstrated an issue with D14683 - vector comparison constant folding only works for i1 results, so we need to split off the sign-extension of the result to the required type. Luckily this can be done with the existing type legalization code. llvm-svn: 255289
*	[DSE] Disable non-local DSE to see if the bots go green.	Chad Rosier	2015-12-10	4	-4/+4
\| \| \| \| \| \|	I see a few bots timing out, so I'm speculatively disabling r255247. llvm-svn: 255286
*	Fix another case where the linkage was not set.	Rafael Espindola	2015-12-10	2	-1/+12
\| \| \| \|	llvm-svn: 255272
*	[PGO] Use %t as the temporary profdata filename in the test cases.	Rong Xu	2015-12-10	10	-19/+19
\| \| \| \| \| \|	Using %t rather than %T/<specific_name> as the temporary profdata filename. llvm-svn: 255271
*	Fix fptosi, fptoui from f16 vectors to i8, i16 vectors	Pirama Arumuga Nainar	2015-12-10	2	-1/+95
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: Convert f16 vectors to corresponding f32 vectors before doing the conversion to int. Add tests for v4f16, v8f16. Reviewers: ab, jmolloy Subscribers: llvm-commits, srhines Differential Revision: http://reviews.llvm.org/D14936 llvm-svn: 255263
*	[InstCombine] fold bitcasts around an extractelement (3rd try)	Sanjay Patel	2015-12-10	1	-8/+34
\|	This is a redo of r255137 (reverted at r255227) which was a redo of r255124 (reverted at r255126) with a fixed check for a scalar source type and an added test for the failure that caused the revert.
\|	Original commit message:
\|	Example:
\|	bitcast (extractelement (bitcast <2 x float> %X to <2 x i32>), 1) to float
\|	--->
\|	extractelement <2 x float> %X, i32 1
\|	This is part of fixing PR25543: https://llvm.org/bugs/show_bug.cgi?id=25543
\|	The next step will be to generalize this fold:
\|	trunc ( lshr ( bitcast X) ) -> extractelement (X)
\|	I.e., I'm hoping to replace the existing transform of:
\|	bitcast ( trunc ( lshr ( bitcast X)))
\|	added by: http://reviews.llvm.org/rL112232
\|	with 2 less specific transforms to catch the case in the bug report.
\|	Differential Revision: http://reviews.llvm.org/D14879
\|	llvm-svn: 255261
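The fold in the example is value-preserving because both bitcasts only reinterpret the same 32 bits per lane. A small Python sketch using the standard `struct` module can demonstrate the equivalence on concrete values; the helper names are hypothetical, and little-endian packing is an assumption made for this illustration.

```python
import struct


def bitcast_v2f32_to_v2i32(xs):
    """Reinterpret a list of floats as 32-bit unsigned integers, lane by lane."""
    raw = struct.pack(f"<{len(xs)}f", *xs)
    return list(struct.unpack(f"<{len(xs)}I", raw))


def bitcast_i32_to_f32(v):
    """Reinterpret one 32-bit unsigned integer as a float."""
    return struct.unpack("<f", struct.pack("<I", v))[0]


def before_fold(x, idx):
    """bitcast (extractelement (bitcast X to i32 vector), idx) to float"""
    return bitcast_i32_to_f32(bitcast_v2f32_to_v2i32(x)[idx])


def after_fold(x, idx):
    """extractelement X, idx"""
    return x[idx]


if __name__ == "__main__":
    x = [1.5, -2.25]  # both values are exactly representable as f32
    print(before_fold(x, 1), after_fold(x, 1))
```

The equivalence relies on the intermediate vector having the same lane count and lane width as the source, which matches the shape of the example in the commit message.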
*	[WebAssembly] Tighten up several CHECK tests.	Dan Gohman	2015-12-10	4	-18/+18
\| \| \| \|	llvm-svn: 255255
*	Split lib/Linker in two.	Rafael Espindola	2015-12-10	2	-8/+29
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	A linker normally has two stages: symbol resolution and "moving stuff". In lib/Linker there is the complication of lazy linking some globals, but it was still far more mixed than it needed to be. This splits the linker into a lower level IRMover and the linker proper. The IRMover just takes a list of globals to move and a callback that lets the user control what is lazy linked. The main motivation is that now tools/gold (and soon lld) can use their own symbol resolution to instruct IRMover what to do. llvm-svn: 255254
*	[DeadStoreElimination] Add support for non-local DSE.	Chad Rosier	2015-12-10	5	-0/+151
\| \| \| \| \| \| \| \| \| \| \| \|	We extend the search for redundant stores to predecessor blocks that unconditionally lead to the block BB with the current store instruction. That also includes single-block loops that unconditionally lead to BB, and if-then-else blocks where then- and else-blocks unconditionally lead to BB. http://reviews.llvm.org/D13363 Patch by Ivan Baev <ibaev@codeaurora.org>! llvm-svn: 255247
*	Bitcasts between FP and INT values using direct moves	Nemanja Ivanovic	2015-12-10	2	-2/+116
\| \| \| \| \| \| \| \| \| \| \| \|	This patch corresponds to review: http://reviews.llvm.org/D15286 LLVM IR frequently contains bitcast operations between floating point and integer values of the same width. Doing this through memory operations is quite expensive on PPC. This patch allows the use of direct register moves between FPRs and GPRs for lowering bitcasts. llvm-svn: 255246
*	Macro debug info support in LLVM IR	Amjad Aboud	2015-12-10	2	-12/+22
\| \| \| \| \| \| \| \|	Introduced DIMacro and DIMacroFile debug info metadata in the LLVM IR to support macros. Differential Revision: http://reviews.llvm.org/D14687 llvm-svn: 255245
*	Revert r255137.	Akira Hatanaka	2015-12-10	1	-20/+8
\| \| \| \| \| \|	This commit broke Apple's internal bot. llvm-svn: 255227
*	[WebAssembly] Implement mixed-type ISD::FCOPYSIGN.	Dan Gohman	2015-12-10	1	-0/+28
\| \| \| \| \| \| \| \|	ISD::FCOPYSIGN permits its operands to have differing types, and DAGCombiner uses this. Add some def : Pat rules to expand this out into an explicit conversion and a normal copysign operation. llvm-svn: 255220
*	[WebAssembly] Implement fma.	Dan Gohman	2015-12-10	2	-0/+18
\| \| \| \| \| \|	It is lowered to a libcall for now, but this is expected to change in the future. llvm-svn: 255219
*	AMDGPU/SI: Emit constant arrays in the .text section	Tom Stellard	2015-12-10	1	-0/+25
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This allows us to remove the END_OF_TEXT_LABEL hack we had been using and simplifies the fixups used to compute the address of constant arrays. Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D15257 llvm-svn: 255204
*	AMDGPU/SI: Add support for sgpr and vgpr inline assembly constraints	Tom Stellard	2015-12-10	1	-0/+23
\| \| \| \| \| \| \| \| \| \| \| \|	Summary: The 's' constraint represents sgprs and the 'v' constraint represents vgprs. Reviewers: arsenm, echristo Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D15342 llvm-svn: 255203
*	[WebAssembly] Fix legalization of f32->f64 EXTLOAD.	Dan Gohman	2015-12-10	1	-0/+20
\| \| \| \|	llvm-svn: 255202
*	[WebAssembly] Also legalize sign_extend_inreg of i32->i64.	Dan Gohman	2015-12-10	1	-0/+9
\| \| \| \|	llvm-svn: 255191
*	PeepholeOptimizer: Ignore dead implicit defs	Dan Gohman	2015-12-10	1	-0/+28
\| \| \| \| \| \| \| \|	Target-specific instructions may have uninteresting physreg clobbers, for target-specific reasons. The peephole pass doesn't need to concern itself with such defs, as long as they're implicit and marked as dead. llvm-svn: 255182
*	[WebAssembly] Fix legalization of shift operators with illegal types.	Dan Gohman	2015-12-10	1	-0/+24
\| \| \| \|	llvm-svn: 255181
*	[WebAssembly] Implement anyext.	Dan Gohman	2015-12-10	1	-0/+11
\| \| \| \|	llvm-svn: 255179
*	[X86] Enable shrink-wrapping by default, but keep it disabled for stack frames	Quentin Colombet	2015-12-09	3	-6/+159
\| \| \| \| \| \| \| \|	without a frame pointer when unwind may happen. This is a workaround for a bug in the way we emit the CFI directives for frameless unwind information. See PR25614. llvm-svn: 255175
*	Synchronize the logic for deciding to link a gv.	Rafael Espindola	2015-12-09	2	-0/+14
\| \| \| \| \| \| \|	We were deciding to not link an available_externally gv over a declaration, but then copying over the body anyway. llvm-svn: 255169
*	[PGO] Rename the profdata filename to avoid the conflict b/w tests.	Rong Xu	2015-12-09	1	-2/+2
\| \| \| \| \| \| \| \|	Two tests, diag_mismatch.ll and diag_no_funcprofdata.ll, generate the same profdata filename, which can conflict in concurrent test runs. This patch renames them to have different names. llvm-svn: 255158
*	IR: Make ConstantDataArray::getFP actually return a ConstantDataArray	Justin Bogner	2015-12-09	1	-0/+10
\| \| \| \| \| \| \| \| \| \|	The ConstantDataArray::getFP(LLVMContext &, ArrayRef<uint16_t>) overload has had a typo in it since it was written, where it will create a Vector instead of an Array. This obviously doesn't work at all, but it turns out that until r254991 there weren't actually any callers of this overload. Fix the typo and add some test coverage. llvm-svn: 255157
*	[Float2Int] Don't operate on vector instructions	Reid Kleckner	2015-12-09	1	-0/+10
\| \| \| \| \| \| \|	This fixes a crash bug. It's also not clear if we'd want to do this transform for vectors. llvm-svn: 255155