llvm-svn: 176507
llvm-svn: 176505
llvm-svn: 176501
llvm-svn: 176500
Set isMoveImm, isAsCheapAsAMove flags for TFRI instructions.
llvm-svn: 176499
This is a skeleton for a pre-RA MachineInstr scheduler strategy. Currently
it only tries to expose more parallelism for ALU instructions (this also
makes the distribution of GPR channels more uniform and increases the
chances of ALU instructions being packed together in a single VLIW group).
It also tries to reduce clause switching by grouping instructions of the
same kind (ALU/FETCH/CF) together.
Vincent Lejeune:
- Support for VLIW4 slot assignment
- Recomputation of ScheduleDAG to get more parallelism opportunities
Tom Stellard:
- Fix assertion failure when trying to determine an instruction's slot
  based on its destination register's class
- Fix some compiler warnings
Vincent Lejeune: [v2]
- Remove recomputation of ScheduleDAG (will be provided in a later patch)
- Improve estimation of an ALU clause's size so that the heuristic does not
  emit CF instructions at the wrong position.
- Make the scheduling heuristic smarter using SUnit depth
- Take constant read limitations into account
Vincent Lejeune: [v3]
- Fix some uninitialized values in ConstPair
- Add asserts to ensure an ALU slot is always populated
llvm-svn: 176498
Maintaining CONST_COPY instructions until pre-emit may prevent some ifcvt
cases, and taking them into account for scheduling is difficult for no real
benefit.
llvm-svn: 176488
Reviewed-by: Tom Stellard <thomas.stellard at amd.com>
llvm-svn: 176487
Reviewed-by: Tom Stellard <thomas.stellard at amd.com>
mayLoad complicates scheduling and does not bring any useful info,
as the location is not writable at all.
llvm-svn: 176486
Reviewed-by: Tom Stellard <thomas.stellard at amd.com>
llvm-svn: 176485
NOTE: This is a candidate for the Mesa stable branch.
llvm-svn: 176484
one-byte NOPs. If the processor actually executes those NOPs, as it sometimes
does with aligned bundling, this can have a performance impact. From my
micro-benchmarks run on my one machine, a 15-byte NOP followed by twelve
one-byte NOPs is about 20% worse than a 15-byte NOP followed by a 12-byte NOP.
This patch changes NOP emission to emit as many 15-byte NOPs (the maximum) as
possible, followed by at most one shorter NOP.
llvm-svn: 176464
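The padding strategy described above is simple to state as a greedy split. This is an illustrative sketch (not LLVM's actual MCAsmBackend code): cover the padding with as many 15-byte NOPs as possible, then at most one shorter NOP for the remainder.

```python
# Greedy NOP splitting as described in the commit above: as many 15-byte
# NOPs (the x86 maximum) as possible, then at most one shorter NOP.
MAX_NOP_SIZE = 15

def split_nops(count):
    """Return the sizes of the NOPs used to fill `count` bytes of padding."""
    sizes = [MAX_NOP_SIZE] * (count // MAX_NOP_SIZE)
    remainder = count % MAX_NOP_SIZE
    if remainder:
        sizes.append(remainder)  # at most one NOP shorter than 15 bytes
    return sizes
```

For 27 bytes of padding this yields `[15, 12]` (the fast case measured above), where the old behavior produced a 15-byte NOP followed by twelve one-byte NOPs.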
"move $4, $5" is printed instead of "or $4, $5, $zero".
llvm-svn: 176455
'R' An address that can be used in a non-macro load or store.
This patch includes a positive test case.
llvm-svn: 176452
* Only apply the divide bypass optimization when not optimizing for size.
* Fixed a bug caused by a constant for the 0 value of type Int32;
  use the dividend's type to generate the constant instead.
* For Atom x86-64, apply the divide bypass to use 16-bit divides instead of
  64-bit divides when the operand values are small enough.
* Added lit tests for the 64-bit divide bypass.
Patch by Tyler Nowicki!
llvm-svn: 176442
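The bypass idea in the commit above can be sketched at a high level: before issuing an expensive wide divide, test whether both operands fit in a narrow width and take the cheap narrow-divide path if so. The function below is a hypothetical illustration, not LLVM's actual BypassSlowDivision transform.

```python
# Hypothetical sketch of the divide-bypass idea: use a cheap 16-bit divide
# when both operands of an unsigned 64-bit divide fit in 16 bits.
def bypassed_udiv64(dividend, divisor, bypass_bits=16):
    mask = (1 << bypass_bits) - 1
    if (dividend | divisor) <= mask:  # both operands fit in `bypass_bits`
        # fast path: a 16-bit hardware divide would suffice here
        return (dividend & mask) // (divisor & mask)
    # slow path: full 64-bit divide
    return dividend // divisor
```

At the IR level the real transform emits a runtime check plus both divide forms; the sketch only shows the value-level equivalence the check relies on.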
llvm-svn: 176439
llvm-svn: 176426
The VDUP instruction's source register doesn't allow a non-constant lane
index, so make sure we don't construct an ARM::VDUPLANE node asking it to
do so.
rdar://13328063
http://llvm.org/bugs/show_bug.cgi?id=13963
llvm-svn: 176413
llvm-svn: 176412
llvm-svn: 176411
Mark them as Expand; they are not legal, as our backend does not match them.
llvm-svn: 176410
This matters for example in the following matrix multiply:

int **mmult(int rows, int cols, int **m1, int **m2, int **m3) {
  int i, j, k, val;
  for (i = 0; i < rows; i++) {
    for (j = 0; j < cols; j++) {
      val = 0;
      for (k = 0; k < cols; k++) {
        val += m1[i][k] * m2[k][j];
      }
      m3[i][j] = val;
    }
  }
  return m3;
}

Taken from the test-suite benchmark Shootout.
We estimate the cost of the multiply to be 2 while we generate 9 instructions
for it, and we end up being quite a bit slower than the scalar version (48% on
my machine).
Also, properly differentiate between AVX1 and AVX2. On AVX1 we still split the
vector into two 128-bit halves and handle the subvector muls as above with 9
instructions. Only on AVX2 will we have a cost of 9 for v4i64.
I changed the test case in test/Transforms/LoopVectorize/X86/avx1.ll to use an
add instead of a mul, because with a mul we now no longer vectorize. I did
verify that the mul would indeed be more expensive when vectorized, with 3
kernels:

for (i ...)
  r += a[i] * 3;
for (i ...)
  m1[i] = m1[i] * 3; // This matches the test case in avx1.ll

and a matrix multiply.
In each case the vectorized version was considerably slower.
radar://13304919
llvm-svn: 176403
llvm-svn: 176400
This patch eliminates the need to emit a constant move instruction when this
pattern is matched:
(select (setgt a, Constant), T, F)
The pattern above effectively turns into this:
(conditional-move (setlt a, Constant + 1), F, T)
llvm-svn: 176384
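The rewrite in the commit above rests on a simple integer identity: `a > C` holds exactly when `a >= C + 1`, i.e. when `not (a < C + 1)`, so swapping the true/false values lets the comparison reuse the constant `C + 1` (valid as long as `C + 1` does not overflow). A small check of the identity:

```python
# select(a > C, T, F)  ==  select(a < C + 1, F, T)  for integers,
# assuming C + 1 does not overflow the integer type.
def select_setgt(a, c, t, f):
    return t if a > c else f

def select_setlt_plus1(a, c, t, f):
    return f if a < c + 1 else t

# Exhaustively verify the identity over a small integer range.
assert all(
    select_setgt(a, c, "T", "F") == select_setlt_plus1(a, c, "T", "F")
    for a in range(-8, 8)
    for c in range(-8, 8)
)
```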
llvm-svn: 176380
- ISD::SHL/SRL/SRA must have either both scalar or both vector operands,
  but TLI.getShiftAmountTy() so far only returns a scalar type. As a
  result, backend logic assuming that breaks.
- Rename the original TLI.getShiftAmountTy() to
  TLI.getScalarShiftAmountTy() and re-define TLI.getShiftAmountTy() to
  return the target-specified scalar type or the same vector type as the
  1st operand.
- Fix most TICG logic that assumes TLI.getShiftAmountTy() returns a simple
  scalar type.
llvm-svn: 176364
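The new rule above can be sketched with types modeled as strings. This is a hedged illustration of the behavior described, not the real TargetLowering API: scalar shifts get the target-specified scalar shift type, while vector shifts get the first operand's own vector type, so SHL/SRL/SRA never mix scalar and vector operands.

```python
# Sketch of the redefined getShiftAmountTy(): vector operands keep their
# own vector type as the shift-amount type; scalars use the target's
# scalar shift-amount type (here defaulting to "i8", as x86 might).
def get_shift_amount_ty(operand_ty, scalar_shift_ty="i8"):
    if operand_ty.startswith("v"):  # e.g. "v4i32" is a vector type
        return operand_ty           # shift amount is the same vector type
    return scalar_shift_ty          # target-specified scalar type
```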
dispatch code. As far as I can tell the thumb2 code is behaving as expected.
I was able to compile and run the associated test case for both arm and thumb1.
rdar://13066352
llvm-svn: 176363
llvm-svn: 176358
v2: based on Michel's patch, but now allows copying of all register sizes.
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
llvm-svn: 176346
They won't match anyway.
Signed-off-by: Christian König <christian.koenig@amd.com>
llvm-svn: 176345
It's much easier to specify the encoding with tablegen directly.
Signed-off-by: Christian König <christian.koenig@amd.com>
llvm-svn: 176344
Signed-off-by: Christian König <christian.koenig@amd.com>
llvm-svn: 176343
Signed-off-by: Christian König <christian.koenig@amd.com>
llvm-svn: 176342
llvm-svn: 176341
llvm-svn: 176330
successor basic blocks.
Currently this is off by default.
llvm-svn: 176329
terminator.
No functionality change.
llvm-svn: 176326
This function will be used later when the capability to search delay slot
filling instructions in successor blocks is added. No intended functionality
changes.
llvm-svn: 176325
llvm-svn: 176321
can fill the delay slot.
Currently, this is off by default.
llvm-svn: 176320
No functionality change.
llvm-svn: 176318
llvm-svn: 176317
This class tracks dependence between memory instructions using underlying
objects of memory operands.
llvm-svn: 176313
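The dependence-tracking idea in the commit above can be sketched simply: remember which underlying objects have been written, and flag a dependence when a later memory operand touches one of them. The class below is an illustrative simplification with hypothetical names, not the actual MIPS delay-slot-filler code.

```python
# Simplified sketch: track underlying objects written by earlier stores,
# and report a dependence when a new memory operand touches one of them.
class MemDepTracker:
    def __init__(self):
        self.stored_objects = set()

    def update(self, underlying_objects, is_store):
        """Return True if this access depends on an earlier tracked store."""
        conflicts = any(obj in self.stored_objects
                        for obj in underlying_objects)
        if is_store:
            self.stored_objects.update(underlying_objects)
        return conflicts
```

A real implementation must also handle objects whose identity cannot be determined (conservatively treating them as conflicting with everything); the sketch omits that case.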
llvm-svn: 176288
llvm-svn: 176285
llvm-svn: 176270
The work done by the post-encoder (setting architecturally unused bits to 0 as
required) can be done by the existing operand that covers the "#0.0". This
removes at least one of the discouraged PostEncoderMethod uses.
llvm-svn: 176261
If an otherwise weak var is actually defined in this unit, it can't be
undefined at runtime so we can use normal global variable sequences (ADRP/ADD)
to access it.
llvm-svn: 176259
llvm-svn: 176258
llvm-svn: 176253