bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
...
*	R600/SI: Fix incorrect encoding of DS_WRITE_B32 instructions	Tom Stellard	2013-08-16	1	-2/+2
\| \| \| \| \| \| \| \| \|	The SIInsertWaits pass was overwriting the first operand (gds bit) of DS_WRITE_B32 with the second operand (value to write). This meant that any time the value to write was stored in an odd-numbered VGPR, the gds bit would be set, causing the instruction to write to GDS instead of LDS. llvm-svn: 188522
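The failure mode described in this commit message can be illustrated with a minimal sketch. This is not the SIInsertWaits or encoder code: the one-bit field width and the helper names `encode_gds_bit` and `writes_to_gds` are assumptions made for illustration. The only facts taken from the message are that the gds operand was overwritten by the value operand and that odd-numbered VGPRs triggered the bug.

```python
def encode_gds_bit(operand_value: int) -> int:
    """Hypothetical helper: a one-bit field keeps only the lowest bit."""
    return operand_value & 1


def writes_to_gds(gds_operand: int, value_vgpr: int, buggy: bool) -> bool:
    """Return True if the encoded instruction would target GDS.

    In the buggy case the gds operand is clobbered by the index of the
    VGPR holding the value to write (assumed model of the bug).
    """
    effective = value_vgpr if buggy else gds_operand
    return encode_gds_bit(effective) == 1


# gds is meant to be 0 (write to LDS) for every VGPR index tried here.
results = {vgpr: writes_to_gds(0, vgpr, buggy=True) for vgpr in range(4)}
print(results)
```

Under this model, only the odd VGPR indices end up with the gds bit set, which matches the symptom reported in the message.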
*	R600: Add support for global vector loads with element types less than 32-bits	Tom Stellard	2013-08-16	1	-0/+176
\| \| \| \| \|	Tested-by: Aaron Watry <awatry@gmail.com> llvm-svn: 188521
*	R600: Add support for global vector stores with elements less than 32-bits	Tom Stellard	2013-08-16	1	-0/+62
\| \| \| \| \|	Tested-by: Aaron Watry <awatry@gmail.com> llvm-svn: 188520
*	R600: Add support for i16 and i8 global stores	Tom Stellard	2013-08-16	2	-2/+61
\| \| \| \| \|	Tested-by: Aaron Watry <awatry@gmail.com> llvm-svn: 188519
*	R600: Add support for v4i32 stores on Cayman	Tom Stellard	2013-08-16	2	-1/+15
\| \| \| \| \|	Tested-by: Aaron Watry <awatry@gmail.com> llvm-svn: 188518
*	R600: Enable folding of inline literals into REG_SEQUENCE instructions	Tom Stellard	2013-08-16	1	-0/+13
\| \| \| \| \|	Tested-by: Aaron Watry <awatry@gmail.com> llvm-svn: 188517
*	R600: Change the RAT instruction assembly names so they match the docs	Tom Stellard	2013-08-16	5	-25/+25
\| \| \| \| \|	Tested-by: Aaron Watry <awatry@gmail.com> llvm-svn: 188515
*	[tests] Cleanup initialization of test suffixes.	Daniel Dunbar	2013-08-16	1	-11/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	- Instead of setting the suffixes in a bunch of places, just set one master list in the top-level config. We now only modify the suffix list in a few suites that have one particular unique suffix (.ml, .mc, .yaml, .td, .py). - Aside from removing the need for a bunch of lit.local.cfg files, this enables 4 tests that were inadvertently being skipped (one in Transforms/BranchFolding, a .s file each in DebugInfo/AArch64 and CodeGen/PowerPC, and one in CodeGen/SI which is now failing and has been XFAILED). - This commit also fixes a bunch of config files to use config.root instead of older copy-pasted code. llvm-svn: 188513
*	R600/SI: Improve legalization of vector operations	Tom Stellard	2013-08-14	1	-0/+111
\| \| \| \| \| \|	This should fix hangs in the OpenCL piglit tests. llvm-svn: 188431
*	R600/SI: Replace v1i32 type with i32 in imageload and sample intrinsics	Tom Stellard	2013-08-14	1	-0/+17
\| \| \| \|	llvm-svn: 188430
*	R600/SI: Convert v16i8 resource descriptors to i128	Tom Stellard	2013-08-14	2	-34/+34
\| \| \| \| \| \| \| \| \| \| \| \| \|	Now that compute support is better on SI, we can't continue using v16i8 for descriptors since this is also a legal type in OpenCL. This patch fixes numerous hangs with the piglit OpenCL test and since we now use a target specific DAG node for LOAD_CONSTANT with the correct MemOperandFlags, this should also fix: https://bugs.freedesktop.org/show_bug.cgi?id=66805 llvm-svn: 188429
*	R600/SI: Use i8 types for resource descriptors in tests	Tom Stellard	2013-08-14	4	-62/+62
\| \| \| \| \| \| \|	We switched from i32 to i8 types a while ago and the tests were never updated. llvm-svn: 188428
*	R600/SI: Lower BUILD_VECTOR to REG_SEQUENCE v2	Tom Stellard	2013-08-14	2	-1/+51
\| \| \| \| \| \| \| \| \| \| \| \|	Using REG_SEQUENCE for BUILD_VECTOR rather than a series of INSERT_SUBREG instructions should make it easier for the register allocator to coalesce unnecessary copies. v2: - Use an SGPR register class if all the operands of BUILD_VECTOR are SGPRs. llvm-svn: 188427
*	R600/SI: Assign a register class to the $vaddr operand for MIMG instructions	Tom Stellard	2013-08-14	1	-0/+44
\| \| \| \| \| \| \|	The previous code declared the operand as unknown:$vaddr, which made it possible for scalar registers to be used instead of vector registers. llvm-svn: 188425
*	R600/SI: Handle MSAA texture targets	Tom Stellard	2013-08-14	1	-1/+1
\| \| \| \| \| \| \|	Patch by: Marek Olšák Signed-off-by: Marek Olšák <marek.olsak@amd.com> llvm-svn: 188421
*	R600/SI: Allow conversion between v32i8 and v8i32	Tom Stellard	2013-08-14	1	-0/+21
\| \| \| \| \| \| \|	Patch by: Marek Olšák Signed-off-by: Marek Olšák <marek.olsak@amd.com> llvm-svn: 188420
*	R600/SI: Add pattern for fp_to_uint	Tom Stellard	2013-08-14	1	-9/+18
\| \| \| \| \| \| \| \| \|	This fixes the F2U opcode for the Mesa driver. Patch by: Marek Olšák Signed-off-by: Marek Olšák <marek.olsak@amd.com> llvm-svn: 188418
*	R600: Set scheduling preference to Sched::Source	Tom Stellard	2013-08-12	8	-8/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	R600 doesn't need to do any scheduling on the SelectionDAG now that it has a very good MachineScheduler. Also, using the VLIW SelectionDAG scheduler was having a major impact on compile times. For example, with the phatk kernel, here are the LLVM IR to machine code compile times: With Sched::VLIW: Total Compile Time: 1.4890 Seconds (User + System), SelectionDAG Instruction Scheduling: 1.1670 Seconds (User + System). With Sched::Source: Total Compile Time: 0.3330 Seconds (User + System), SelectionDAG Instruction Scheduling: 0.0070 Seconds (User + System). The code output was identical with both schedulers. This may not be true for all programs, but it gives me confidence that there won't be much reduction, if any, in code quality by using Sched::Source. llvm-svn: 188215
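The timings quoted in this commit message can be put side by side with a little arithmetic. The sketch below uses only the four figures from the message; the variable names are illustrative.

```python
# Compile times for the phatk kernel, in seconds (User + System),
# copied from the commit message.
vliw_total, vliw_sched = 1.4890, 1.1670
source_total, source_sched = 0.3330, 0.0070

# Overall speedup from switching Sched::VLIW to Sched::Source.
speedup = vliw_total / source_total

# Fraction of total compile time spent in SelectionDAG scheduling.
vliw_share = vliw_sched / vliw_total
source_share = source_sched / source_total

# Time spent outside SelectionDAG scheduling under each preference.
vliw_rest = vliw_total - vliw_sched
source_rest = source_total - source_sched

print(f"speedup: {speedup:.2f}x")
print(f"scheduling share: VLIW {vliw_share:.1%}, Source {source_share:.1%}")
print(f"non-scheduling time: VLIW {vliw_rest:.3f}s, Source {source_rest:.3f}s")
```

The total drops by roughly 4.5x, and the time spent outside SelectionDAG scheduling is nearly the same in both runs (about 0.32 to 0.33 seconds), so almost all of the saving comes from the scheduler itself.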
*	R600/SI: FMA is faster than fmul and fadd for f64	Niels Ole Salscheider	2013-08-10	1	-0/+31
\| \| \| \|	llvm-svn: 188136
*	R600/SI: Add FMA pattern	Niels Ole Salscheider	2013-08-10	1	-0/+31
\| \| \| \|	llvm-svn: 188135
*	R600/SI: Implement fp32<->fp64 conversions	Niels Ole Salscheider	2013-08-08	2	-0/+18
\| \| \| \|	llvm-svn: 187988
*	R600/SI: Implement sint<->fp64 conversions	Niels Ole Salscheider	2013-08-08	2	-0/+18
\| \| \| \|	llvm-svn: 187987
*	R600/SI: Use VSrc_* register classes as the default classes for types	Tom Stellard	2013-08-06	1	-0/+84
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Since the VSrc_* register classes contain both VGPRs and SGPRs, copies that used to be emitted by isel like this: SGPR = COPY VGPR Will now be emitted like this: VSrc = COPY VGPR This patch also adds a pass that tries to identify and fix situations where a VGPR to SGPR copy may occur. Hopefully, these changes will make it impossible for the compiler to generate illegal VGPR to SGPR copies. llvm-svn: 187831
*	R600/SI: Add more special cases for opcodes to ensureSRegLimit()	Tom Stellard	2013-08-06	6	-45/+45
\| \| \| \| \| \|	Also factor out the register class lookup to its own function. llvm-svn: 187830
*	Factor FlattenCFG out from SimplifyCFG	Tom Stellard	2013-08-06	2	-0/+115
\| \| \| \| \| \|	Patch by: Mei Ye llvm-svn: 187764
*	R600/SI: Add missing test for r187749	Tom Stellard	2013-08-05	1	-0/+48
\| \| \| \|	llvm-svn: 187754
*	R600: Add 64-bit float load/store support	Tom Stellard	2013-08-01	15	-43/+161
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* Added R600_Reg64 class * Added T#Index#.XY registers definition * Added v2i32 register reads from parameter and global space * Added f32 and i32 elements extraction from v2f32 and v2i32 * Added v2i32 -> v2f32 conversions Tom Stellard: - Mark vec2 operations as expand. The addition of a vec2 register class made them all legal. Patch by: Dmitry Cherkassov Signed-off-by: Dmitry Cherkassov <dcherkassov@gmail.com> llvm-svn: 187582
*	R600: Use 64-bit alignment for 64-bit kernel arguments	Tom Stellard	2013-08-01	1	-0/+2
\| \| \| \|	llvm-svn: 187581
*	R600/SI: Custom lower i64 ZERO_EXTEND	Tom Stellard	2013-08-01	1	-0/+18
\| \| \| \|	llvm-svn: 187580
*	Revert "R600: Non vector only instruction can be scheduled on trans unit"	Tom Stellard	2013-07-31	25	-185/+73
\| \| \| \| \| \|	This reverts commit 98ce62780ea7185ba710868bf83c8077e8d7f6d6. llvm-svn: 187526
*	R600: Avoid more than 4 literals in the same instruction group at scheduling	Vincent Lejeune	2013-07-31	1	-0/+68
\| \| \| \|	llvm-svn: 187515
*	R600: Non vector only instruction can be scheduled on trans unit	Vincent Lejeune	2013-07-31	25	-73/+185
\| \| \| \|	llvm-svn: 187514
*	R600/SI: Expand vector fp <-> int conversions	Tom Stellard	2013-07-30	4	-36/+36
\| \| \| \|	llvm-svn: 187421
*	[R600] Replicate old DAGCombiner behavior in target specific DAG combine.	Quentin Colombet	2013-07-30	1	-1/+0
\| \| \| \| \| \| \|	build_vector is lowered to REG_SEQUENCE, which is something the register allocator does a good job at optimizing. llvm-svn: 187397
*	[DAGCombiner] insert_vector_elt: Avoid building a vector twice.	Quentin Colombet	2013-07-30	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch prevents the following combine when the input vector is used more than once. insert_vector_elt (build_vector elt0, ..., eltN), NewEltIdx, idx => build_vector elt0, ..., NewEltIdx, ..., eltN The reasons are: - Building a vector may be expensive, so try to reuse the existing part of a vector instead of creating a new one (think big vectors). - elt0 to eltN now have two users instead of one. This may prevent some other optimizations. llvm-svn: 187396
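The guard described in this commit message can be sketched on a toy model. This is not the DAGCombiner code: the `BuildVector` class, its `num_uses` field, and the function `combine_insert_vector_elt` are hypothetical stand-ins. The only behavior taken from the message is that the fold is skipped when the input vector has more than one use.

```python
from dataclasses import dataclass


@dataclass
class BuildVector:
    """Toy stand-in for a build_vector node (hypothetical)."""
    elts: tuple
    num_uses: int = 1


def combine_insert_vector_elt(vec, new_elt, idx):
    """Fold insert_vector_elt(build_vector, new_elt, idx) if profitable.

    Returns a new BuildVector, or None to leave the insert untouched.
    """
    if not isinstance(vec, BuildVector):
        return None
    # The guard from the commit message: a build_vector with other users
    # would have to be built twice, so keep the existing one instead.
    if vec.num_uses != 1:
        return None
    if not 0 <= idx < len(vec.elts):
        return None
    elts = list(vec.elts)
    elts[idx] = new_elt
    return BuildVector(tuple(elts))


single_use = BuildVector(("elt0", "elt1", "elt2"))
shared = BuildVector(("elt0", "elt1", "elt2"), num_uses=2)

print(combine_insert_vector_elt(single_use, "new", 1))
print(combine_insert_vector_elt(shared, "new", 1))
```

With a single user the insert folds into a fresh build_vector; with two users the function declines, so the original vector is reused rather than rebuilt.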
*	DAGCombiner: Pass the correct type to TargetLowering::isF(Abs\|Neg)Free	Tom Stellard	2013-07-23	3	-186/+38
\| \| \| \| \| \| \|	This commit also implements these functions for R600 and removes a test case that was relying on the buggy behavior. llvm-svn: 187007
*	R600: Treat CONSTANT_ADDRESS loads like GLOBAL_ADDRESS loads when necessary	Tom Stellard	2013-07-23	1	-25/+122
\| \| \| \| \| \| \| \| \| \|	These are really the same address space in hardware. The only difference is that CONSTANT_ADDRESS uses a special cache for faster access. When we are unable to use the constant kcache for some reason (e.g. smaller types or lack of indirect addressing) then the instruction selector must use GLOBAL_ADDRESS loads instead. llvm-svn: 187006
*	R600: Add support for 24-bit MAD instructions	Tom Stellard	2013-07-23	2	-0/+90
\| \| \| \| \|	Reviewed-by: Vincent Lejeune <vljn at ovi.com> llvm-svn: 186923
*	R600: Add support for 24-bit MUL instructions	Tom Stellard	2013-07-23	2	-0/+84
\| \| \| \| \|	Reviewed-by: Vincent Lejeune <vljn at ovi.com> llvm-svn: 186922
*	R600: Improve support for < 32-bit loads	Tom Stellard	2013-07-23	2	-14/+67
\| \| \| \| \|	Reviewed-by: Vincent Lejeune <vljn at ovi.com> llvm-svn: 186921
*	R600: Move CONST_ADDRESS folding into AMDGPUDAGToDAGISel::Select()	Tom Stellard	2013-07-23	1	-1/+1
\| \| \| \| \| \| \| \| \| \|	This increases the number of opportunities we have for folding. With the previous implementation we were unable to fold into any instructions other than the first when multiple instructions were selected from a single SDNode. Reviewed-by: Vincent Lejeune <vljn at ovi.com> llvm-svn: 186919
*	R600: Use KCache for kernel arguments	Tom Stellard	2013-07-23	17	-90/+86
\| \| \| \| \|	Reviewed-by: Vincent Lejeune <vljn at ovi.com> llvm-svn: 186918
*	R600: Use the same compute kernel calling convention for all GPUs	Tom Stellard	2013-07-23	1	-2/+2
\| \| \| \| \| \| \| \|	A side-effect of this is that now the compiler expects kernel arguments to be 4-byte aligned. Reviewed-by: Vincent Lejeune <vljn at ovi.com> llvm-svn: 186916
*	R600: Use correct LoadExtType when lowering kernel arguments	Tom Stellard	2013-07-23	1	-0/+19
\| \| \| \| \|	Reviewed-by: Vincent Lejeune <vljn at ovi.com> llvm-svn: 186915
*	R600: Clean up extended load patterns	Tom Stellard	2013-07-23	2	-1/+2
\| \| \| \| \|	Reviewed-by: Vincent Lejeune <vljn at ovi.com> llvm-svn: 186914
*	R600: Expand vector FNEG	Tom Stellard	2013-07-23	1	-0/+26
\| \| \| \|	llvm-svn: 186913
*	R600: Don't emit empty then clause and use alu_pop_after	Vincent Lejeune	2013-07-19	3	-6/+129
\| \| \| \|	llvm-svn: 186725
*	R600/SI: Fix crash with VSELECT	Tom Stellard	2013-07-18	1	-0/+15
\| \| \| \| \| \|	https://bugs.freedesktop.org/show_bug.cgi?id=66175 llvm-svn: 186616
*	R600/SI: Add support for v2f32 loads	Tom Stellard	2013-07-18	1	-0/+14
\| \| \| \|	llvm-svn: 186615
*	R600/SI: Add support for v2f32 stores	Tom Stellard	2013-07-18	1	-0/+18
\| \| \| \|	llvm-svn: 186614