bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
...
*	R600: Enable folding of inline literals into REG_SEQUENCE instructions	Tom Stellard	2013-08-16	2	-17/+23
\| \| \| \| \|	Tested-by: Aaron Watry <awatry@gmail.com> llvm-svn: 188517
*	R600: Add IsExport bit to TableGen instruction definitions	Tom Stellard	2013-08-16	6	-10/+16
\| \| \| \| \|	Tested-by: Aaron Watry <awatry@gmail.com> llvm-svn: 188516
*	R600: Change the RAT instruction assembly names so they match the docs	Tom Stellard	2013-08-16	2	-32/+35
\| \| \| \| \|	Tested-by: Aaron Watry <awatry@gmail.com> llvm-svn: 188515
*	Fix spelling	Matt Arsenault	2013-08-15	1	-7/+7
\| \| \| \|	llvm-svn: 188506
*	Tentative fix for global-buffer-overflow caused by r188426. Found by AddressSanitizer	Alexey Samsonov	2013-08-15	1	-1/+4
\| \| \| \| \| \|	llvm-svn: 188448
*	R600/SI: Improve legalization of vector operations	Tom Stellard	2013-08-14	4	-5/+56
\| \| \| \| \| \|	This should fix hangs in the OpenCL piglit tests. llvm-svn: 188431
*	R600/SI: Replace v1i32 type with i32 in imageload and sample intrinsics	Tom Stellard	2013-08-14	4	-4/+18
\| \| \| \|	llvm-svn: 188430
*	R600/SI: Convert v16i8 resource descriptors to i128	Tom Stellard	2013-08-14	12	-44/+280
\| \| \| \| \| \| \| \| \| \| \| \| \|	Now that compute support is better on SI, we can't continue using v16i8 for descriptors since this is also a legal type in OpenCL. This patch fixes numerous hangs with the piglit OpenCL test and since we now use a target specific DAG node for LOAD_CONSTANT with the correct MemOperandFlags, this should also fix: https://bugs.freedesktop.org/show_bug.cgi?id=66805 llvm-svn: 188429
*	R600/SI: Lower BUILD_VECTOR to REG_SEQUENCE v2	Tom Stellard	2013-08-14	7	-111/+86
\| \| \| \| \| \| \| \| \| \| \| \|	Using REG_SEQUENCE for BUILD_VECTOR rather than a series of INSERT_SUBREG instructions should make it easier for the register allocator to coalesce unnecessary copies. v2: - Use an SGPR register class if all the operands of BUILD_VECTOR are SGPRs. llvm-svn: 188427
*	R600/SI: Choose the correct MOV instruction for copying immediates	Tom Stellard	2013-08-14	5	-0/+69
\| \| \| \| \| \| \| \|	The instruction selector will now try to infer the destination register so it can decide whether to use V_MOV_B32 or S_MOV_B32 when copying immediates. llvm-svn: 188426
*	R600/SI: Assign a register class to the $vaddr operand for MIMG instructions	Tom Stellard	2013-08-14	7	-60/+109
\| \| \| \| \| \| \|	The previous code declared the operand as unknown:$vaddr, which made it possible for scalar registers to be used instead of vector registers. llvm-svn: 188425
*	R600/SI: Handle MSAA texture targets	Tom Stellard	2013-08-14	2	-2/+33
\| \| \| \| \| \| \|	Patch by: Marek Olšák Signed-off-by: Marek Olšák <marek.olsak@amd.com> llvm-svn: 188421
*	R600/SI: Allow conversion between v32i8 and v8i32	Tom Stellard	2013-08-14	2	-2/+7
\| \| \| \| \| \| \|	Patch by: Marek Olšák Signed-off-by: Marek Olšák <marek.olsak@amd.com> llvm-svn: 188420
*	R600/SI: Fix an obvious typo	Tom Stellard	2013-08-14	1	-1/+1
\| \| \| \| \| \| \|	Patch by: Marek Olšák Signed-off-by: Marek Olšák <marek.olsak@amd.com> llvm-svn: 188419
*	R600/SI: Add pattern for fp_to_uint	Tom Stellard	2013-08-14	1	-1/+3
\| \| \| \| \| \| \| \| \|	This fixes the F2U opcode for the Mesa driver. Patch by: Marek Olšák Signed-off-by: Marek Olšák <marek.olsak@amd.com> llvm-svn: 188418
*	R600: Set scheduling preference to Sched::Source	Tom Stellard	2013-08-12	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	R600 doesn't need to do any scheduling on the SelectionDAG now that it has a very good MachineScheduler. Also, using the VLIW SelectionDAG scheduler was having a major impact on compile times. For example, with the phatk kernel, here are the LLVM IR to machine code compile times. With Sched::VLIW: Total Compile Time: 1.4890 Seconds (User + System); SelectionDAG Instruction Scheduling: 1.1670 Seconds (User + System). With Sched::Source: Total Compile Time: 0.3330 Seconds (User + System); SelectionDAG Instruction Scheduling: 0.0070 Seconds (User + System). The code output was identical with both schedulers. This may not be true for all programs, but it gives me confidence that there won't be much reduction, if any, in code quality by using Sched::Source. llvm-svn: 188215
*	R600/SI: FMA is faster than fmul and fadd for f64	Niels Ole Salscheider	2013-08-10	2	-0/+19
\| \| \| \|	llvm-svn: 188136
*	R600/SI: Add FMA pattern	Niels Ole Salscheider	2013-08-10	1	-2/+6
\| \| \| \|	llvm-svn: 188135
*	R600/SI: Implement fp32<->fp64 conversions	Niels Ole Salscheider	2013-08-08	2	-2/+9
\| \| \| \|	llvm-svn: 187988
*	R600/SI: Implement sint<->fp64 conversions	Niels Ole Salscheider	2013-08-08	2	-2/+12
\| \| \| \|	llvm-svn: 187987
*	Initialize SIInsertWaits::ExpInstrTypesSeen in the pass constructor.	Evgeniy Stepanov	2013-08-07	1	-1/+2
\| \| \| \| \| \| \|	This value may be used uninitialized in SIInsertWaits::insertWait. Found with MemorySanitizer. llvm-svn: 187869
*	R600: Add new file from r187831 to CMakeLists.txt	Tom Stellard	2013-08-06	1	-0/+1
\| \| \| \|	llvm-svn: 187834
*	R600/SI: Use VSrc_* register classes as the default classes for types	Tom Stellard	2013-08-06	5	-44/+163
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Since the VSrc_* register classes contain both VGPRs and SGPRs, copies that used to be emitted by isel like this: SGPR = COPY VGPR will now be emitted like this: VSrc = COPY VGPR. This patch also adds a pass that tries to identify and fix situations where a VGPR to SGPR copy may occur. Hopefully, these changes will make it impossible for the compiler to generate illegal VGPR to SGPR copies. llvm-svn: 187831
*	R600/SI: Add more special cases for opcodes to ensureSRegLimit()	Tom Stellard	2013-08-06	4	-32/+83
\| \| \| \| \| \|	Also factor out the register class lookup to its own function. llvm-svn: 187830
*	Target/*/CMakeLists.txt: Add the dependency to CommonTableGen explicitly for each corresponding CodeGen.	NAKAMURA Takumi	2013-08-06	1	-1/+1
\| \| \| \| \| \| \| \| \|	Without explicit dependencies, both the per-file action and the in-CommonTableGen action could run in parallel and race to emit *.inc files simultaneously. llvm-svn: 187780
*	Factor FlattenCFG out from SimplifyCFG	Tom Stellard	2013-08-06	1	-1/+1
\| \| \| \| \| \|	Patch by: Mei Ye llvm-svn: 187764
*	R600: Implement TargetLowering::getVectorIdxTy()	Tom Stellard	2013-08-05	3	-5/+14
\| \| \| \| \| \| \|	We use MVT::i32 for the vector index type, because we use 32-bit operations to calculate offsets when dynamically indexing vectors. llvm-svn: 187749
*	R600: Add 64-bit float load/store support	Tom Stellard	2013-08-01	8	-23/+139
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* Added R600_Reg64 class * Added T#Index#.XY registers definition * Added v2i32 register reads from parameter and global space * Added f32 and i32 elements extraction from v2f32 and v2i32 * Added v2i32 -> v2f32 conversions Tom Stellard: - Mark vec2 operations as expand. The addition of a vec2 register class made them all legal. Patch by: Dmitry Cherkassov Signed-off-by: Dmitry Cherkassov <dcherkassov@gmail.com> llvm-svn: 187582
*	R600: Use 64-bit alignment for 64-bit kernel arguments	Tom Stellard	2013-08-01	1	-1/+1
\| \| \| \|	llvm-svn: 187581
*	R600/SI: Custom lower i64 ZERO_EXTEND	Tom Stellard	2013-08-01	2	-0/+16
\| \| \| \|	llvm-svn: 187580
*	Revert "R600: Non vector only instruction can be scheduled on trans unit"	Tom Stellard	2013-07-31	4	-60/+19
\| \| \| \| \| \|	This reverts commit 98ce62780ea7185ba710868bf83c8077e8d7f6d6. llvm-svn: 187526
*	Revert "R600: Use SchedModel enum for is{Trans,Vector}Only functions"	Tom Stellard	2013-07-31	4	-19/+23
\| \| \| \| \| \|	This reverts commit 3f1de26cb5cc0543a6a1d71259a7a39d97139051. llvm-svn: 187524
*	R600: Do not merge vectors after a vector reg is used	Vincent Lejeune	2013-07-31	1	-1/+10
\| \| \| \| \| \| \| \| \| \|	If we merge vectors after a vector reg is used, it will generate an artificial antidependency that can prevent 2 tex/vtx instructions from using the same clause and thus generate extra clauses that reduce performance. There is no test case, as such a situation is really hard to predict. llvm-svn: 187516
*	R600: Avoid more than 4 literals in the same instruction group at scheduling	Vincent Lejeune	2013-07-31	1	-0/+5
\| \| \| \|	llvm-svn: 187515
*	R600: Non vector only instruction can be scheduled on trans unit	Vincent Lejeune	2013-07-31	4	-19/+60
\| \| \| \|	llvm-svn: 187514
*	R600: Don't mix LDS and non-LDS instructions in the same group	Vincent Lejeune	2013-07-31	1	-0/+4
\| \| \| \| \| \| \| \|	There are a lot of restrictions on instruction groups that contain LDS instructions, so for now we will be conservative and not packetize anything else with them. llvm-svn: 187513
*	R600: Use SchedModel enum for is{Trans,Vector}Only functions	Vincent Lejeune	2013-07-31	4	-23/+19
\| \| \| \|	llvm-svn: 187512
*	R600: Remove predicated_break inst	Vincent Lejeune	2013-07-31	4	-59/+7
\| \| \| \| \| \| \| \| \| \| \| \| \|	We were using two instructions for a similar purpose: break and predicated break. Only predicated_break was emitted, and it was lowered at R600ControlFlowFinalizer to JUMP;CF_BREAK;POP. This commit simplifies the situation by making AMDILCFGStructurizer emit IF_PREDICATE;BREAK;ENDIF; instead of predicated_break (which is now removed). There is no functionality change. llvm-svn: 187510
*	R600/SI: Expand vector fp <-> int conversions	Tom Stellard	2013-07-30	2	-4/+4
\| \| \| \|	llvm-svn: 187421
*	[R600] Replicate old DAGCombiner behavior in target specific DAG combine.	Quentin Colombet	2013-07-30	1	-0/+56
\| \| \| \| \| \| \|	build_vector is lowered to REG_SEQUENCE, which is something the register allocator does a good job at optimizing. llvm-svn: 187397
*	SimplifyCFG: Use parallel-and and parallel-or mode to consolidate branch conditions	Tom Stellard	2013-07-27	5	-0/+110
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Merge consecutive if-regions if they contain identical statements. Both transformations reduce the number of branches. The transformation is guarded by a target-hook, and is currently enabled only for +R600, but the correctness has been tested on the X86 target using a variety of CPU benchmarks. Patch by: Mei Ye llvm-svn: 187278
*	DAGCombiner: Pass the correct type to TargetLowering::isF(Abs\|Neg)Free	Tom Stellard	2013-07-23	2	-0/+17
\| \| \| \| \| \| \|	This commit also implements these functions for R600 and removes a test case that was relying on the buggy behavior. llvm-svn: 187007
*	R600: Treat CONSTANT_ADDRESS loads like GLOBAL_ADDRESS loads when necessary	Tom Stellard	2013-07-23	2	-19/+7
\| \| \| \| \| \| \| \| \| \|	These are really the same address space in hardware. The only difference is that CONSTANT_ADDRESS uses a special cache for faster access. When we are unable to use the constant kcache for some reason (e.g. smaller types or lack of indirect addressing) then the instruction selector must use GLOBAL_ADDRESS loads instead. llvm-svn: 187006
*	R600: Add support for 24-bit MAD instructions	Tom Stellard	2013-07-23	2	-2/+12
\| \| \| \| \|	Reviewed-by: Vincent Lejeune <vljn at ovi.com> llvm-svn: 186923
*	R600: Add support for 24-bit MUL instructions	Tom Stellard	2013-07-23	4	-5/+75
\| \| \| \| \|	Reviewed-by: Vincent Lejeune <vljn at ovi.com> llvm-svn: 186922
*	R600: Improve support for < 32-bit loads	Tom Stellard	2013-07-23	4	-11/+39
\| \| \| \| \|	Reviewed-by: Vincent Lejeune <vljn at ovi.com> llvm-svn: 186921
*	R600: Rename AMDILISelDAGToDAG.cpp -> AMDGPUISelDAGToDAG.cpp	Tom Stellard	2013-07-23	2	-1/+1
\| \| \| \| \|	Reviewed-by: Vincent Lejeune <vljn at ovi.com> llvm-svn: 186920
*	R600: Move CONST_ADDRESS folding into AMDGPUDAGToDAGISel::Select()	Tom Stellard	2013-07-23	4	-49/+160
\| \| \| \| \| \| \| \| \| \|	This increases the number of opportunities we have for folding. With the previous implementation we were unable to fold into any instructions other than the first when multiple instructions were selected from a single SDNode. Reviewed-by: Vincent Lejeune <vljn at ovi.com> llvm-svn: 186919
*	R600: Use KCache for kernel arguments	Tom Stellard	2013-07-23	4	-49/+22
\| \| \| \| \|	Reviewed-by: Vincent Lejeune <vljn at ovi.com> llvm-svn: 186918
*	R600: Simplify assembly for KCache registers using the TableGen !add operator	Tom Stellard	2013-07-23	1	-4/+4
\| \| \| \| \| \| \| \| \| \| \| \| \|	Before: MOV * T0.W, KC0[131-128].Y After: MOV * T0.W, KC0[3].Y Reviewed-by: Vincent Lejeune <vljn at ovi.com> llvm-svn: 186917