bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
...
*	[x86] Teach the new vector shuffle lowering to fall back on AVX-512	Chandler Carruth	2014-09-28	2	-0/+2222
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	vectors. Someone will need to build the AVX512 lowering, which should follow AVX1 and AVX2 very closely for AVX512F and AVX512BW resp. I've added a dummy test which is a port of the v8f32 and v8i32 tests from AVX and AVX2 to v8f64 and v8i64 tests for AVX512F and AVX512BW. Hopefully this is enough information for someone to implement proper lowering here. If not, I'll be happy to help, but right now the AVX-512 support isn't a priority for me. llvm-svn: 218583
*	[x86] Fix the new vector shuffle lowering's use of VSELECT for AVX2	Chandler Carruth	2014-09-28	3	-39/+51
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	lowerings. This was hopelessly broken. First, the x86 backend wants '-1' to be the element value representing true in a boolean vector, and second the operand order for VSELECT is backwards from the actual x86 instructions. To make matters worse, the backend is just using '-1' as the true value to get the high bit to be set. It doesn't actually symbolically map the '-1' to anything. But on x86 this isn't quite how it works: there only the high bit is relevant. As a consequence weird non-'-1' values like 0x80 actually "work" once you flip the operands to be backwards. Anyways, thanks to Hal for helping me sort out what these should be. llvm-svn: 218582
*	Add MachineOperand::ChangeToFPImmediate and setFPImm	Matt Arsenault	2014-09-28	2	-7/+37
\| \| \| \|	llvm-svn: 218579
*	[x86] Fix a really silly bug that I introduced fixing another bug in the	Chandler Carruth	2014-09-28	2	-1/+27
\| \| \| \| \| \| \| \| \| \|	new vector shuffle target DAG combines -- it helps to actually test for the value you want rather than just using an integer in a boolean context. Have I mentioned that I loathe implicit conversions recently? :: sigh :: llvm-svn: 218576
*	[x86] Fix yet another bug in the new vector shuffle lowering's handling	Chandler Carruth	2014-09-28	2	-7/+53
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	of widening masks. We can't widen a zeroing mask unless both elements that would be merged are either zeroed or undef. This is the only way to widen a mask if it has a zeroed element. Also clean up the code here by ordering the checks in a more logical way and by using the symbolic values for undef and zero. I'm actually torn on using the symbolic values because the existing code is littered with the assumption that -1 is undef, and moreover that entries '< 0' are the special entries. While that works with the values given to these constants, using the symbolic constants actually makes it a bit more opaque why this is the case. llvm-svn: 218575
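The widening rule described in r218575 can be illustrated with a small standalone sketch. This is not the code from the X86 backend: the function name `widenMask` and the constants `kUndef` and `kZero` are hypothetical, and the sentinel values (-1 for undef, -2 for zero) are assumptions chosen for the example. Each adjacent pair of mask elements merges into one element of twice the width; a pair containing a zeroed element merges only if both elements are zeroed or undef.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical sentinels for this sketch (assumed values, not taken from the patch).
constexpr int kUndef = -1;  // element may take any value
constexpr int kZero = -2;   // element must be zero

// Try to merge each adjacent pair of mask elements into one element of
// twice the width. Returns false if any pair cannot be merged.
bool widenMask(const std::vector<int> &mask, std::vector<int> &widened) {
  assert(mask.size() % 2 == 0 && "mask must have an even number of elements");
  widened.clear();
  for (std::size_t i = 0; i < mask.size(); i += 2) {
    const int lo = mask[i];
    const int hi = mask[i + 1];
    // Two undef elements merge into one undef element.
    if (lo == kUndef && hi == kUndef) {
      widened.push_back(kUndef);
      continue;
    }
    // A zeroed element only merges with another zeroed or undef element.
    if (lo == kZero || hi == kZero) {
      const bool loOk = (lo == kZero || lo == kUndef);
      const bool hiOk = (hi == kZero || hi == kUndef);
      if (!loOk || !hiOk)
        return false;
      widened.push_back(kZero);
      continue;
    }
    // Undef low half: the high half must be the odd half of a wide element.
    if (lo == kUndef && hi % 2 == 1) {
      widened.push_back(hi / 2);
      continue;
    }
    // Even low half followed by undef or by its natural successor.
    if (lo >= 0 && lo % 2 == 0 && (hi == kUndef || hi == lo + 1)) {
      widened.push_back(lo / 2);
      continue;
    }
    return false;
  }
  return true;
}
```

With these assumptions, the mask {kZero, kUndef, 4, 5} widens to {kZero, 2}, while {kZero, 3} is rejected because a zeroed element would be merged with a real one.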
*	WinCOFFObjectWriter.cpp: make write_uint32_le more efficient	Hans Wennborg	2014-09-28	1	-6/+4
\| \| \| \|	llvm-svn: 218574
*	[AArch64] Redundant store instructions should be removed as dead code	James Molloy	2014-09-27	2	-0/+36
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If there is a store followed by a store with the same value to the same location, then the store is dead/noop. It can be removed. This problem is found in spec2006-197.parser. For example, stur w10, [x11, #-4] stur w10, [x11, #-4] Then one of the two stur instructions can be removed. Patch by David Xu! llvm-svn: 218569
*	Fix llvm::huge_valf multiple initializations with Visual C++.	Yaron Keren	2014-09-27	3	-8/+34
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	llvm::huge_valf is defined in a header file, so it is initialized once in every compilation unit, and therefore many times, upon program startup. With non-VC compilers huge_valf is set to HUGE_VALF, which the compiler can probably optimize out. With VC, numeric_limits<float>::infinity() does not return a number but a runtime structure member which theoretically may change between calls, so the compiler does not optimize out the initialization and it happens many times. It can be easily seen by placing a breakpoint on the initialization line. This patch moves llvm::huge_valf initialization to a source file instead of the header. llvm-svn: 218567
*	[x86] Fix yet another issue with widening vector shuffle elements.	Chandler Carruth	2014-09-27	1	-2/+2
\| \| \| \| \| \| \| \|	I spotted this by inspection when debugging something else, so I have no test case what-so-ever, and am not even sure it is possible to realistically trigger the bug. But this is what was intended here. llvm-svn: 218565
*	Update test case to match minor formatting change introduced in r218563.	Craig Topper	2014-09-27	1	-3/+3
\| \| \| \|	llvm-svn: 218564
*	Reduce code duplication a bit.	Craig Topper	2014-09-27	1	-16/+10
\| \| \| \|	llvm-svn: 218563
*	[x86] Fix terrible bugs everywhere in the new vector shuffle lowering	Chandler Carruth	2014-09-27	2	-37/+107
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	and in the target shuffle combining when trying to widen vector elements. Previously only one of these was correct, and we didn't correctly propagate zeroing target shuffle masks (which have a different sentinel value from undef in non- target shuffle masks now). This isn't just a missed optimization, this caused us to drop zeroing shuffles on the floor and miscompile code. The added test case is one example of that. There are other fixes to the test suite as a consequence of this as well as restoring the undef elements in some of the masks that were lost when I brought sanity to the actual value of the undef and zero sentinels. I've also just cleaned up some of the PSHUFD and PSHUFLW and PSHUFHW combining code, but that code really needs to go. It was a nice initial attempt, but it isn't very principled and the recursive shuffle combiner is much more powerful. llvm-svn: 218562
*	[x86] Flip the sentinel values used in the target shuffle mask decoding	Chandler Carruth	2014-09-27	2	-7/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	to significantly more sane sentinels. Notably, everywhere else in the backend's representation of shuffles uses '-1' to represent undef. The target shuffle masks really shouldn't diverge from that, especially as in a few places they are manipulated by shared code. This causes us to lose some undef lanes in various test masks. I want to get these back, but technically it isn't invalid and there are a lot of bugs here so I want to try to establish a saner baseline for fixing some of the bugs by aligning the specific sentinel values used. llvm-svn: 218561
*	Fix TableGen -gen-disassembler output for bit fields with an offset.	Craig Topper	2014-09-27	2	-1/+79
\| \| \| \| \| \| \| \| \|	This fixes bit assignments like this Inst{7-0} = Foo{9-2} Patch by Steve King. llvm-svn: 218560
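The bit assignment Inst{7-0} = Foo{9-2} from r218560 places an operand slice that starts at bit 2 into the low byte of the instruction word. The following is a plain C++ illustration of that bit arithmetic only, not the output of TableGen; `encodeField` and `decodeField` are hypothetical names introduced for the sketch.

```cpp
#include <cassert>
#include <cstdint>

// Encoding direction: bits 9..2 of the operand go to bits 7..0 of the
// instruction word.
std::uint32_t encodeField(std::uint32_t foo) {
  return (foo >> 2) & 0xFFu;
}

// Decoding direction: bits 7..0 of the instruction word are put back at
// offset 2 in the operand. Bits 1..0 of the operand are not encoded, so
// they come back as zero.
std::uint32_t decodeField(std::uint32_t inst) {
  return (inst & 0xFFu) << 2;
}
```

A decoder that ignored the offset would return the field at bits 7..0 of the operand instead of bits 9..2.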
*	Refactor reciprocal and reciprocal square root estimate into ↵	Sanjay Patel	2014-09-26	5	-241/+215
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	target-independent functions (part 2). This is purely refactoring. No functional changes intended. PowerPC is the only target that is currently using this interface. The ultimate goal is to allow targets other than PowerPC (certainly X86 and Aarch64) to turn this: z = y / sqrt(x) into: z = y * rsqrte(x) And: z = y / x into: z = y * rcpe(x) using whatever HW magic they can use. See http://llvm.org/bugs/show_bug.cgi?id=20900 . There is one hook in TargetLowering to get the target-specific opcode for an estimate instruction along with the number of refinement steps needed to make the estimate usable. Differential Revision: http://reviews.llvm.org/D5484 llvm-svn: 218553
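The transformation described in r218553 replaces a division with a multiplication by a hardware estimate, followed by a number of refinement steps. The sketch below shows the idea in standalone C++ using Newton-Raphson refinement. It makes several assumptions: the hardware estimate instructions are simulated by truncating the mantissa of the exact result, and the names `rcpEstimate`, `rsqrtEstimate`, `refinedRcp` and `refinedRsqrt` are hypothetical. It is not the TargetLowering hook from the patch.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>

// Simulate a low-precision hardware estimate by clearing the low 12
// mantissa bits of a float (an assumption made for this sketch).
float truncateMantissa(float v) {
  std::uint32_t bits;
  std::memcpy(&bits, &v, sizeof bits);
  bits &= 0xFFFFF000u;
  std::memcpy(&v, &bits, sizeof v);
  return v;
}

float rcpEstimate(float x) { return truncateMantissa(1.0f / x); }
float rsqrtEstimate(float x) { return truncateMantissa(1.0f / std::sqrt(x)); }

// Newton-Raphson refinement for 1/x: e' = e * (2 - x * e).
float refinedRcp(float x, int steps) {
  assert(x > 0.0f && "sketch only handles positive inputs");
  float e = rcpEstimate(x);
  for (int i = 0; i < steps; ++i)
    e = e * (2.0f - x * e);
  return e;
}

// Newton-Raphson refinement for 1/sqrt(x): e' = e * (1.5 - 0.5 * x * e * e).
float refinedRsqrt(float x, int steps) {
  assert(x > 0.0f && "sketch only handles positive inputs");
  float e = rsqrtEstimate(x);
  for (int i = 0; i < steps; ++i)
    e = e * (1.5f - 0.5f * x * e * e);
  return e;
}
```

With these helpers, z = y / sqrt(x) becomes y * refinedRsqrt(x, steps) and z = y / x becomes y * refinedRcp(x, steps). Each refinement step roughly squares the relative error of the estimate, which is why a target only needs to report a small step count.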
*	Add LLVM_ENABLE_MODULES flag to CMake to enable building with C++ modules.	Richard Smith	2014-09-26	3	-0/+32
\| \| \| \|	llvm-svn: 218551
*	llvm-vtabledump: Further simplification	David Majnemer	2014-09-26	1	-43/+15
\| \| \| \| \| \| \|	Hoist out calls to getSection and getContents. No functional change intended. llvm-svn: 218550
*	Object: BSS/virtual sections don't have contents	David Majnemer	2014-09-26	6	-6/+24
\| \| \| \| \| \| \| \| \| \| \| \|	Users of getSectionContents shouldn't try to pass in BSS or virtual sections. In all instances, this is a bug in the code calling this routine. N.B. Some COFF implementations (like CL) will mark their BSS sections as taking space on disk. This would confuse COFFObjectFile into thinking the section is larger than the file. llvm-svn: 218549
*	clang-format of ChangeStdinToBinary & ChangeStdoutToBinary.	Yaron Keren	2014-09-26	1	-4/+4
\| \| \| \|	llvm-svn: 218547
*	Update llvm-objdump’s Mach-O symbolizer code to print the name of symbol ↵	Kevin Enderby	2014-09-26	2	-1/+97
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	stubs. So in fully linked images when a call is made through a stub it now gets a comment like the following in the disassembly: callq 0x100000f6c ## symbol stub for: _printf indicating the call is to a symbol stub and which symbol it is for. This is done for branch reference types by checking whether the branch target is in a stub section and, if so, using the indirect symbol table entry for that stub and that symbol table entry's symbol name. llvm-svn: 218546
*	Remove definition of LLVM_VERSION_INFO; this macro is not used by any of the	Richard Smith	2014-09-26	2	-6/+0
\| \| \| \| \| \| \| \| \|	files in this directory. If it should be defined anywhere, it should be defined when building lib/LTO/LTOCodeGenerator.cpp, but we've not had it defined there for quite some time, so that doesn't really seem to be very important. (It also would slow down the modules build by creating extra module variants.) llvm-svn: 218544
*	Fix CMake warning CMP0054: don't quote a variable name that is intended to be	Richard Smith	2014-09-26	1	-1/+1
\| \| \| \| \| \|	expanded; future versions of cmake may not expand the variable in this case. llvm-svn: 218543
*	Fix misinterpretation of CMake rule found by a CMake warning (related to ↵	Richard Smith	2014-09-26	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	CMP0054). lldb sets the variable SHARED_LIBRARY to 1, which breaks this conditional, because older versions of CMake interpret if ("${t}" STREQUAL "SHARED_LIBRARY") as meaning if ("${t}" STREQUAL "1") in this case. Change the conditional so it does the right thing with both old and new CMakes. llvm-svn: 218542
*	[x86] Fix a moderately terrifying bug in the new 128-bit shuffle logic	Chandler Carruth	2014-09-26	3	-47/+100
\| \| \| \| \| \| \| \| \| \| \|	that managed to elude all of my fuzz testing historically. =/ Something changed to allow this code path to actually be exercised and it was doing bad things. It is especially heavily exercised by the patterns that emerge when doing AVX shuffles that end up lowered through the 128-bit code path. llvm-svn: 218540
*	[IndVar] Don't widen loop compare unless IV user is sign extended.	Chad Rosier	2014-09-26	2	-2/+32
\| \| \| \| \| \|	PR21030 llvm-svn: 218539
*	R600/SI: Use break instead of continue	Matt Arsenault	2014-09-26	1	-1/+1
\| \| \| \| \| \|	If an instruction doesn't have src1, it doesn't have src2 llvm-svn: 218536
*	R600/SI: Add strict check lines to div_scale tests.	Matt Arsenault	2014-09-26	1	-16/+255
\| \| \| \| \| \| \| \| \|	This has weird operand requirements so it's worthwhile to have very strict checks for its operands. Add different combinations of SGPR operands. llvm-svn: 218535
*	R600/SI: Add a note about the order of the operands to div_scale	Matt Arsenault	2014-09-26	1	-0/+6
\| \| \| \|	llvm-svn: 218534
*	R600/SI: Move finding SGPR operand to move to separate function	Matt Arsenault	2014-09-26	2	-63/+71
\| \| \| \|	llvm-svn: 218533
*	R600/SI Allow same SGPR to be used for multiple operands	Matt Arsenault	2014-09-26	2	-5/+128
\| \| \| \| \| \| \| \| \| \| \|	Instead of moving the first SGPR that is different than the first, legalize the operand that requires the fewest moves if one SGPR is used for multiple operands. This saves extra moves and is also required for some instructions which require that the same operand be used for multiple operands. llvm-svn: 218532
*	R600/SI: Partially move operand legalization to post-isel hook.	Matt Arsenault	2014-09-26	9	-81/+52
\| \| \| \| \| \| \| \| \|	Disable the SGPR usage restriction parts of the DAG legalizeOperands. It now should only be doing immediate folding until it can be replaced later. The real legalization work is now done by the other SIInstrInfo::legalizeOperands llvm-svn: 218531
*	R600/SI: Implement findCommutedOpIndices	Matt Arsenault	2014-09-26	2	-1/+36
\| \| \| \| \| \| \| \| \| \| \|	The base implementation of commuteInstruction is used in some cases, but it turns out this has been broken for a long time since modifiers were inserted between the real operands. The base implementation of commuteInstruction also fails on immediates, which also needs to be fixed. llvm-svn: 218530
*	R600/SI: Don't move operands that are required to be SGPRs	Matt Arsenault	2014-09-26	2	-10/+53
\| \| \| \| \| \| \| \|	e.g. v_cndmask_b32 requires the condition operand be an SGPR. If one of the source operands were an SGPR, that would be considered the one SGPR use and the condition operand would be illegally moved. llvm-svn: 218529
*	R600/SI: Don't assert on exotic operand types	Matt Arsenault	2014-09-26	1	-1/+1
\| \| \| \| \| \| \| \| \|	This needs a test, but I'm not sure if it is currently possible and I originally hit it due to a bug. Right now the only global address operands have no reason to be VALU instructions, although it theoretically could be a problem. llvm-svn: 218528
*	R600/SI: Fix using wrong operand indices when commuting	Matt Arsenault	2014-09-26	2	-13/+22
\| \| \| \| \| \| \| \| \| \| \| \| \|	No test since the current SIISelLowering::legalizeOperands effectively hides this, and the general uses seem to only fire on SALU instructions which don't have modifiers between the operands. When trying to use legalizeOperands immediately after instruction selection, it now sees a lot more patterns it did not see before which break on this. llvm-svn: 218527
*	R600/SI: Remove apparently dead code in legalizeOperands	Matt Arsenault	2014-09-26	1	-8/+0
\| \| \| \| \| \| \| \|	No tests hit this, and I don't see any way a GlobalAddress node would survive beyond lowering on SI. If it did, the move should probably be inserted by selection. llvm-svn: 218526
*	Ignore annotation function calls in cost computation	David Peixotto	2014-09-26	2	-0/+134
\| \| \| \| \| \| \| \| \| \| \|	The annotation instructions are dropped during codegen and have no impact on size. In some cases, the annotations were preventing the unroller from unrolling a loop because the annotation calls were pushing the cost over the unrolling threshold. Differential Revision: http://reviews.llvm.org/D5335 llvm-svn: 218525
*	[x86] The mnemonic is SHUFPS not SHUPFS. =[ I'm very bad at spelling	Chandler Carruth	2014-09-26	1	-3/+3
\| \| \| \| \| \|	sadly. llvm-svn: 218524
*	[x86] In the new vector shuffle lowering, when trying to do another	Chandler Carruth	2014-09-26	2	-10/+35
\| \| \| \| \| \| \| \|	layer of tie-breaking sorting, it really helps to check that you're in a tie first. =] Otherwise the whole thing cycles infinitely. Test case added, another one found through fuzz testing. llvm-svn: 218523
*	[x86] Fix a large collection of bugs that crept in as I fleshed out the	Chandler Carruth	2014-09-26	3	-13/+113
\| \| \| \| \| \| \| \| \| \| \| \|	AVX support. New test cases included. Note that none of the existing test cases covered these buggy code paths. =/ Also, it is clear from this that SHUFPS and SHUFPD are the most bug prone shuffle instructions in x86. =[ These were all detected by fuzz-testing. (I <3 fuzz testing.) llvm-svn: 218522
*	Elide repeated register operand in Thumb1 instructions	Renato Golin	2014-09-26	2	-1/+95
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch makes the ARM backend transform 3 operand instructions such as 'adds/subs' to the 2 operand version of the same instruction if the first two register operands are the same. Example: 'adds r0, r0, #1' is transformed to 'adds r0, #1'. Currently for some instructions such as 'adds' if you try to assemble 'adds r0, r0, #8' for thumb v6m the assembler would throw an error message because the immediate cannot be encoded using 3 bits. The backend should be smart enough to transform the instruction to 'adds r0, #8', which allows for larger immediate constants. Patch by Ranjeet Singh. llvm-svn: 218521
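The choice between the two Thumb1 'adds' immediate forms in r218521 can be sketched as a small decision function. This is an illustration only, not the ARM backend code: `AddsForm` and `chooseAddsForm` are hypothetical names, and the sketch assumes low registers (r0-r7), a 3-bit immediate for the 3 operand form and an 8-bit immediate for the 2 operand form.

```cpp
#include <cassert>

// Possible outcomes for 'adds rd, rn, #imm' in this sketch.
enum class AddsForm { ThreeOperandImm3, TwoOperandImm8, NotEncodable };

AddsForm chooseAddsForm(unsigned rd, unsigned rn, unsigned imm) {
  assert(rd < 8 && rn < 8 && "sketch covers low registers r0-r7 only");
  // Same destination and source register: use the 2 operand form, which
  // has room for an 8-bit immediate.
  if (rd == rn && imm <= 255)
    return AddsForm::TwoOperandImm8;
  // Different registers: only the 3 operand form with a 3-bit immediate fits.
  if (imm <= 7)
    return AddsForm::ThreeOperandImm3;
  return AddsForm::NotEncodable;
}
```

Under these assumptions 'adds r0, r0, #8' maps to the 2 operand form, whereas 'adds r0, r1, #8' has no encoding in this sketch.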
*	[X86][SchedModel] SSE reciprocal square root instruction latencies.	Andrea Di Biagio	2014-09-26	7	-15/+39
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The SSE rsqrt instruction (a fast reciprocal square root estimate) was grouped in the same scheduling IIC_SSE_SQRT* class as the accurate (but very slow) SSE sqrt instruction. For code which uses rsqrt (possibly with Newton-Raphson iterations) this poor scheduling was hurting performance. This patch splits off the rsqrt instruction from the sqrt instruction scheduling classes and creates new IIC_SSE_RSQER* classes with latency values based on Agner's table. Differential Revision: http://reviews.llvm.org/D5370 Patch by Simon Pilgrim. llvm-svn: 218517
*	Revert "Store TypeUnits in a SmallVector<DWARFUnitSection> instead of a ↵	Frederic Riss	2014-09-26	2	-23/+16
\| \| \| \| \| \| \| \| \| \| \|	single DWARFUnitSection." This reverts commit r218513. Buildbots using libstdc++ issue an error when trying to copy SmallVector<std::unique_ptr<>>. Revert the commit until we have a fix. llvm-svn: 218514
*	Store TypeUnits in a SmallVector<DWARFUnitSection> instead of a single ↵	Frederic Riss	2014-09-26	2	-16/+23
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	DWARFUnitSection. Summary: There will be multiple TypeUnits in an unlinked object that will be extracted from different sections. Now that we have DWARFUnitSection that is supposed to represent an input section, we need a DWARFUnitSection<TypeUnit> per input .debug_types section. Once this is done, the interface is homogeneous and we can move the Section parsing code into DWARFUnitSection. Reviewers: samsonov, dblaikie Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D5482 llvm-svn: 218513
*	Fix unused variable warning added in r218509	Daniel Sanders	2014-09-26	1	-1/+0
\| \| \| \|	llvm-svn: 218510
*	[mips] Generalize the handling of f128 return values to support f128 arguments.	Daniel Sanders	2014-09-26	3	-50/+112
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This will allow us to handle f128 arguments without duplicating code from CCState::AnalyzeFormalArguments() or CCState::AnalyzeCallOperands(). No functional change. Reviewers: vmedic Reviewed By: vmedic Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D5292 llvm-svn: 218509
*	[AVX512] Added load/store from BW/VL subsets to Register2Memory opcode tables.	Robert Khasanov	2014-09-26	5	-6/+946
\| \| \| \| \| \|	Added lowering tests for these instructions. llvm-svn: 218508
*	llvm-vtabledump: Small cleanup	David Majnemer	2014-09-26	1	-24/+24
\| \| \| \|	llvm-svn: 218505
*	fix a typo in documentation index.	Jyoti Allur	2014-09-26	1	-1/+1
\| \| \| \|	llvm-svn: 218504
*	llvm-vtabledump: strip trailing NUL bytes	David Majnemer	2014-09-26	2	-3/+5
\| \| \| \|	llvm-svn: 218502