bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
...
*	R600/SI: Define a schedule model and enable the generic machine scheduler	Tom Stellard	2015-01-29	28	-104/+91
\| \| \| \| \| \|	The schedule model is not complete yet, and could be improved. llvm-svn: 227461
*	R600: Move DataLayout to AMDGPUTargetMachine	Tom Stellard	2015-01-28	15	-24/+24
\| \| \| \| \| \| \| \|	This is a follow up to r227113. It is now required to use the amdgcn target for SI and newer GPUs. llvm-svn: 227316
*	R600/SI: Enable all tests that pass on VI without changes	Marek Olsak	2015-01-27	206	-0/+221
\| \| \| \|	llvm-svn: 227214
*	R600: Cleanup or test	Matt Arsenault	2015-01-26	1	-26/+32
\| \| \| \| \| \| \|	Fix broken check lines, use multiple check prefixes, add an additional test for i1 or. llvm-svn: 227137
*	R600/SI: Emit .hsa.version section for amdhsa OS	Tom Stellard	2015-01-23	1	-0/+2
\| \| \| \|	llvm-svn: 226970
*	R600/SI: Move i64 -> v2i32 load promotion into AMDGPUDAGToDAGISel::Select()	Tom Stellard	2015-01-23	1	-0/+18
\| \| \| \| \| \| \| \| \| \| \|	We used to do this promotion during DAG legalization, but this caused an infinite loop in ExpandUnalignedLoad() because it assumed that i64 loads were legal if i64 was a legal type. It also seems better to report i64 loads as legal, since they actually are and we were just promoting them to simplify our tablegen files. llvm-svn: 226945
*	R600: Try to use lower types for 64bit division if possible	Jan Vesely	2015-01-22	2	-2/+354
\| \| \| \| \| \| \| \|	v2: add and enable tests for SI Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Matt Arsenault <Matthew.Arsenault@amd.com> llvm-svn: 226881
*	DAGCombine: fold (or (and X, M), (and X, N)) -> (and X, (or M, N))	Tim Northover	2015-01-21	1	-12/+10
\| \| \| \| \| \| \|	It can help with argument juggling on some targets, and is generally a good idea. llvm-svn: 226740
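The fold named in this commit rests on a plain bitwise identity. A minimal sketch that checks it exhaustively over small bit widths follows; the function names are hypothetical and have nothing to do with the actual DAGCombiner code.

```python
import itertools


def or_of_ands(x, m, n):
    """Original form: (or (and X, M), (and X, N))."""
    return (x & m) | (x & n)


def and_of_or(x, m, n):
    """Folded form: (and X, (or M, N))."""
    return x & (m | n)


def fold_is_sound(bits=4):
    """Exhaustively compare both forms for every bits-wide X, M and N."""
    values = range(1 << bits)
    return all(
        or_of_ands(x, m, n) == and_of_or(x, m, n)
        for x, m, n in itertools.product(values, repeat=3)
    )


if __name__ == "__main__":
    print(fold_is_sound(4))
```

The folded form needs one `and` and one `or` instead of two `and`s and one `or`, and the `or` of the two masks can be evaluated at compile time when both are constants.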
*	R600: Add checks for urem/srem by a constant	Matt Arsenault	2015-01-21	2	-1/+29
\| \| \| \| \| \| \|	Make sure this uses the faster expansion using magic constants to avoid the full division path. llvm-svn: 226734
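The "magic constants" expansion replaces a division by a known constant with a multiply and a shift. The sketch below shows one textbook way to pick the constant for unsigned values; it is an illustration only, not the code LLVM emits, and the helper names are made up. The multiplier chosen here can need one bit more than the operand width, which real code generators handle with an extra add and shift.

```python
def magic_for_udiv(d, bits=32):
    """Return (multiplier, shift) with n // d == (n * multiplier) >> shift
    for every 0 <= n < 2**bits."""
    if d <= 0:
        raise ValueError("divisor must be positive")
    log2_ceil = (d - 1).bit_length()        # ceil(log2(d))
    shift = bits + log2_ceil
    multiplier = -(-(1 << shift) // d)      # ceil(2**shift / d)
    return multiplier, shift


def udiv_by_constant(n, d, bits=32):
    """Quotient computed with a multiply and a shift, no division of n."""
    multiplier, shift = magic_for_udiv(d, bits)
    return (n * multiplier) >> shift


def urem_by_constant(n, d, bits=32):
    """Remainder derived from the multiply-and-shift quotient."""
    return n - udiv_by_constant(n, d, bits) * d


if __name__ == "__main__":
    for n in (0, 1, 6, 7, 8, 2**32 - 1):
        print(n, urem_by_constant(n, 7), n % 7)
```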
*	R600: Add missing tests for i64 srem	Matt Arsenault	2015-01-21	1	-0/+48
\| \| \| \|	llvm-svn: 226713
*	R600/SI: Custom lower fround	Matt Arsenault	2015-01-21	2	-27/+124
\| \| \| \| \| \| \| \| \|	This fixes it for SI. It also removes the pattern used previously for Evergreen for f32. I'm not sure if the new R600 output is better or not, but it uses 1 fewer instructions if BFI is available. llvm-svn: 226682
*	Revert "DAGCombine: fold (or (and X, M), (and X, N)) -> (and X, (or M, N))"	Tim Northover	2015-01-21	1	-10/+12
\| \| \| \| \| \| \| \|	It hadn't gone through review yet, but was still on my local copy. This reverts commit r226663 llvm-svn: 226665
*	DAGCombine: fold (or (and X, M), (and X, N)) -> (and X, (or M, N))	Tim Northover	2015-01-21	1	-12/+10
\| \| \| \|	llvm-svn: 226663
*	R600/SI: Fix simple-loop.ll test	Tom Stellard	2015-01-20	1	-1/+0
\| \| \| \|	llvm-svn: 226596
*	R600/SI: Add kill flag when copying scratch offset to a register	Tom Stellard	2015-01-20	1	-2/+7
\| \| \| \| \| \| \|	This allows us to re-use the same register for the scratch offset when accessing large private arrays. llvm-svn: 226585
*	R600/SI: Don't store scratch buffer frame index in MUBUF offset field	Tom Stellard	2015-01-20	1	-0/+81
\| \| \| \| \| \| \| \|	We don't have a good way of legalizing this if the frame index offset is more than 12 bits, which is the size of MUBUF's offset field, so now we store the frame index in the vaddr field. llvm-svn: 226584
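The limit mentioned here is a simple range check. A minimal sketch, assuming the 12-bit field is unsigned (the commit message gives only the width) and using hypothetical names:

```python
MUBUF_OFFSET_BITS = 12  # field width stated in the commit message


def fits_in_mubuf_offset(offset):
    """True if offset can be encoded in the immediate offset field.

    Assumes an unsigned field, so the valid range is 0..4095.
    """
    return 0 <= offset < (1 << MUBUF_OFFSET_BITS)


if __name__ == "__main__":
    for offset in (0, 4095, 4096, 65536):
        print(offset, fits_in_mubuf_offset(offset))
```

A frame index whose offset falls outside this range cannot go in the immediate field, which is the case the commit avoids by always placing the frame index in vaddr.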
*	R600: Remove redundant test	Matt Arsenault	2015-01-18	3	-15/+2
\| \| \| \| \| \|	This is already covered in ftrunc.ll llvm-svn: 226412
*	R600: Clean up floor tests	Matt Arsenault	2015-01-16	4	-151/+146
\| \| \| \| \| \| \| \|	These were using different naming schemes, not using multiple check prefixes and not using -LABEL. llvm-svn: 226333
*	R600/SI: Add patterns for v_cvt_{flr\|rpi}_i32_f32	Matt Arsenault	2015-01-15	2	-0/+155
\| \| \| \|	llvm-svn: 226230
*	R600/SI: Fix trailing comma with modifiers	Matt Arsenault	2015-01-15	1	-1/+14
\| \| \| \| \| \| \|	Instructions with 1 operand can still use source modifiers, so make sure we don't print an extra comma afterwards. llvm-svn: 226226
*	R600/SI: Improve fpext / fptrunc test coverage	Matt Arsenault	2015-01-15	2	-8/+78
\| \| \| \|	llvm-svn: 226197
*	R600/SI: Use 64-bit encoding by default for opcodes that are VOP3-only on VI	Marek Olsak	2015-01-15	2	-36/+33
\| \| \| \|	llvm-svn: 226190
*	R600/SI: Remove some redundant load testcases.	Matt Arsenault	2015-01-14	4	-76/+9
\| \| \| \| \| \| \|	This reduces coverage for Evergreen, since the more complete tests have those run lines disabled. llvm-svn: 225927
*	R600/SI: Fix bad code with unaligned byte vector loads	Matt Arsenault	2015-01-14	1	-17/+7
\| \| \| \| \| \| \| \| \|	Don't do the v4i8 -> v4f32 combine if the load will need to be expanded due to alignment. This stops adding instructions to repack into a single register that the v_cvt_ubyteN_f32 instructions read. llvm-svn: 225926
*	Implement new way of expanding extloads.	Matt Arsenault	2015-01-14	6	-17/+1415
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Now that the source and destination types can be specified, allow doing an expansion that doesn't use an EXTLOAD of the result type. Try to do a legal extload to an intermediate type and extend that if possible. This generalizes the special case custom lowering of extloads R600 has been using to work around this problem. This also happens to fix a bug that would incorrectly use more aligned loads than should be used. llvm-svn: 225925
*	R600: Implement getRsqrtEstimate	Matt Arsenault	2015-01-13	2	-1/+40
\| \| \| \| \| \| \| \|	Only do this for f32, since I'm unclear on both what this is expecting for the refinement steps in terms of accuracy and what the f64 instruction actually provides. llvm-svn: 225827
*	R600: Make cttz / ctlz cheap to speculate	Matt Arsenault	2015-01-13	1	-0/+224
\| \| \| \| \| \| \| \| \|	Speculating things is generally good. SI+ has instructions for these for 32-bit values. This is still probably better even with the expansion for 64-bit values, although it is odd that this callback doesn't have the size as a parameter. llvm-svn: 225822
*	Combine fcmp + select to fminnum / fmaxnum if no nans and legal	Matt Arsenault	2015-01-13	2	-11/+27
\| \| \| \| \| \| \|	Also require unsafe FP math for now, since there isn't a way to test for signed zeros. llvm-svn: 225744
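The NaN and signed-zero restrictions can be seen by comparing the compare-and-select idiom with a min that ignores a single NaN operand. The sketch below models both with Python floats; `select_min` and `fminnum` are illustrative stand-ins, not LLVM APIs, and the zero handling of `fminnum` here is just whatever Python's `min` does.

```python
import math


def select_min(a, b):
    """Model of: select (fcmp olt a, b), a, b. The compare is false on NaN."""
    return a if a < b else b


def fminnum(a, b):
    """Model of a minimum that returns the non-NaN operand when exactly one
    operand is NaN."""
    if math.isnan(a):
        return b
    if math.isnan(b):
        return a
    return min(a, b)


if __name__ == "__main__":
    nan = float("nan")
    # The two forms disagree when the second operand is NaN.
    print(select_min(1.0, nan), fminnum(1.0, nan))
    # The select form returns a different zero depending on operand order.
    print(select_min(0.0, -0.0), select_min(-0.0, 0.0))
```

With NaNs excluded the two forms can still differ in the sign of a zero result, which matches the commit's reason for additionally requiring unsafe FP math.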
*	R600/SI: Use RegisterOperands to specify which operands can accept immediates	Tom Stellard	2015-01-12	5	-18/+18
\| \| \| \| \| \| \| \| \| \| \| \|	There are some operands which can take either immediates or registers and we were previously using different register class to distinguish between operands that could take immediates and those that could not. This patch switches to using RegisterOperands which should simplify the backend by reducing the number of register classes and also make it easier to implement the assembler. llvm-svn: 225662
*	R600/SI: Remove SIISelLowering::legalizeOperands()	Tom Stellard	2015-01-08	10	-12/+33
\| \| \| \| \| \| \| \| \|	Its functionality has been replaced by calling SIInstrInfo::legalizeOperands() from SIISelLowering::AdjustInstrPostInstrSelection() and running the SIFoldOperands and SIShrinkInstructions passes. llvm-svn: 225445
*	RegisterCoalescer: Fix valuesIdentical() in some subrange merge cases.	Matthias Braun	2015-01-07	1	-0/+45
\| \| \| \| \| \| \| \| \| \| \| \| \|	I got confused and assumed SrcIdx/DstIdx of the CoalescerPair is a subregister index in SrcReg/DstReg, but they are actually subregister indices of the coalesced register that get you back to SrcReg/DstReg when applied. Fixed the bug, improved comments and simplified code accordingly. Testcase by Tom Stellard! llvm-svn: 225415
*	R600/SI: Commute instructions to enable more folding opportunities	Tom Stellard	2015-01-07	3	-4/+4
\| \| \| \|	llvm-svn: 225410
*	R600/SI: Only fold immediates that have one use	Tom Stellard	2015-01-07	1	-0/+35
\| \| \| \| \| \| \|	Folding the same immediate into multiple instructions will increase program size, which can hurt performance. llvm-svn: 225405
*	R600/SI: Add a V_MOV_B64 pseudo instruction	Tom Stellard	2015-01-07	3	-30/+19
\| \| \| \| \| \| \|	This is used to simplify the SIFoldOperands pass and make it easier to fold immediates. llvm-svn: 225373
*	R600/SI: Teach SIFoldOperands to split 64-bit constants when folding	Tom Stellard	2015-01-07	3	-8/+25
	This allows folding of sequences like:

	    s[0:1] = s_mov_b64 4
	    v_add_i32 v0, s0, v0
	    v_addc_u32 v1, s1, v1

	into

	    v_add_i32 v0, 4, v0
	    v_add_i32 v1, 0, v1

	llvm-svn: 225369
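The split itself is ordinary integer arithmetic: a 64-bit immediate is cut into a low and a high 32-bit half, and each half is folded into the instruction that used the matching sub-register. A minimal sketch with hypothetical helper names, modelling the 64-bit add as a low add plus a high add with carry:

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def split_imm64(value):
    """Return the (low, high) 32-bit halves of a 64-bit immediate."""
    value &= MASK64
    return value & MASK32, value >> 32


def add64_via_halves(imm, v_lo, v_hi):
    """Add a 64-bit immediate to the register pair (v_lo, v_hi) using two
    32-bit adds, with the carry from the low add fed into the high add."""
    imm_lo, imm_hi = split_imm64(imm)
    lo_sum = imm_lo + (v_lo & MASK32)
    carry = lo_sum >> 32
    hi_sum = imm_hi + (v_hi & MASK32) + carry
    return lo_sum & MASK32, hi_sum & MASK32


if __name__ == "__main__":
    print(split_imm64(4))
    print(add64_via_halves(4, 0xFFFFFFFE, 0))
```

For the constant 4 used in the commit message the halves are 4 and 0, which are the operands that appear in the folded instructions.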
*	R600/SI: Add combine for isinfinite pattern	Matt Arsenault	2015-01-06	1	-0/+85
\| \| \| \|	llvm-svn: 225310
*	R600/SI: Pattern match isinf to v_cmp_class instructions	Matt Arsenault	2015-01-06	1	-0/+45
\| \| \| \|	llvm-svn: 225307
*	R600/SI: Add basic DAG combines for fp_class	Matt Arsenault	2015-01-06	1	-0/+162
\| \| \| \|	llvm-svn: 225306
*	R600/SI: Add class intrinsic	Matt Arsenault	2015-01-06	1	-0/+335
\| \| \| \|	llvm-svn: 225305
*	R600/SI: Insert s_waitcnt before s_barrier instructions.	Tom Stellard	2015-01-06	2	-0/+5
\| \| \| \| \| \| \|	This ensures that all memory operations are complete when all threads reach the barrier. llvm-svn: 225290
*	R600/SI: Add a stub GCNTargetMachine	Tom Stellard	2015-01-06	277	-308/+308
\| \| \| \| \| \| \| \| \| \| \| \|	This is equivalent to the AMDGPUTargetMachine now, but it is the starting point for separating R600 and GCN functionality into separate targets. It is recommended that users start using the gcn triple for GCN-based GPUs, because using the r600 triple for these GPUs will be deprecated in the future. llvm-svn: 225277
*	Enable (sext x) == C --> x == (trunc C) combine	Matt Arsenault	2014-12-21	2	-7/+394
\| \| \| \| \| \| \| \| \|	Extend the existing code which handles this for zext. This makes this more useful for targets with ZeroOrNegativeOne BooleanContent and obsoletes a custom combine SI uses for i1 setcc (sext(i1), 0, setne) since the constant will now be shrunk to i1. llvm-svn: 224691
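The combine compares in the narrow type instead of the sign-extended one. The sketch below models values as unsigned bit patterns and checks the equivalence for an 8-bit source widened to 32 bits. The "constant round-trips" condition is an assumption made for this illustration (the commit message does not spell out the legality check), and all names are hypothetical.

```python
def sext(x, from_bits, to_bits):
    """Sign-extend the from_bits-wide bit pattern x to to_bits."""
    x &= (1 << from_bits) - 1
    if x & (1 << (from_bits - 1)):
        x |= ((1 << to_bits) - 1) ^ ((1 << from_bits) - 1)
    return x


def trunc(c, bits):
    """Keep only the low bits of c."""
    return c & ((1 << bits) - 1)


def constant_round_trips(c, from_bits, to_bits):
    """Assumed condition: C survives truncation followed by sign extension."""
    return sext(trunc(c, from_bits), from_bits, to_bits) == c


def compare_wide(x, c, from_bits, to_bits):
    """Original form: (sext x) == C."""
    return sext(x, from_bits, to_bits) == c


def compare_narrow(x, c, from_bits):
    """Combined form: x == (trunc C)."""
    return x == trunc(c, from_bits)


if __name__ == "__main__":
    # i1 case from the commit message: setcc (sext i1), 0 becomes a compare with i1 0.
    for x in (0, 1):
        print(x, compare_wide(x, 0, 1, 32), compare_narrow(x, 0, 1))
```

When the constant does not round-trip, no sign-extended value can equal it, so the wide comparison is always false and the narrow form would not be a valid replacement.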
*	R600/SI: Only form min/max with 1 use.	Matt Arsenault	2014-12-19	3	-0/+69
\| \| \| \| \| \| \|	If the condition is used for something else, this increases the number of instructions. llvm-svn: 224646
*	R600/SI: Make sure non-inline constants aren't folded into mubuf soffset operand	Tom Stellard	2014-12-19	1	-0/+39
\| \| \| \| \| \| \| \|	mubuf instructions now define the soffset field using the SCSrc_32 register class which indicates that only SGPRs and inline constants are allowed. llvm-svn: 224622
*	R600/SI: Fix f64 inline immediates	Matt Arsenault	2014-12-17	1	-0/+272
\| \| \| \|	llvm-svn: 224458
*	IR: Make metadata typeless in assembly	Duncan P. N. Exon Smith	2014-12-15	15	-34/+34
	Now that `Metadata` is typeless, reflect that in the assembly. These are the matching assembly changes for the metadata/value split in r223802.

	- Only use the `metadata` type when referencing metadata from a call intrinsic -- i.e., only when it's used as a `Value`.
	- Stop pretending that `ValueAsMetadata` is wrapped in an `MDNode` when referencing it from call intrinsics.

	So, assembly like this:

	    define @foo(i32 %v) {
	      call void @llvm.foo(metadata !{i32 %v}, metadata !0)
	      call void @llvm.foo(metadata !{i32 7}, metadata !0)
	      call void @llvm.foo(metadata !1, metadata !0)
	      call void @llvm.foo(metadata !3, metadata !0)
	      call void @llvm.foo(metadata !{metadata !3}, metadata !0)
	      ret void, !bar !2
	    }
	    !0 = metadata !{metadata !2}
	    !1 = metadata !{i32* @global}
	    !2 = metadata !{metadata !3}
	    !3 = metadata !{}

	turns into this:

	    define @foo(i32 %v) {
	      call void @llvm.foo(metadata i32 %v, metadata !0)
	      call void @llvm.foo(metadata i32 7, metadata !0)
	      call void @llvm.foo(metadata i32* @global, metadata !0)
	      call void @llvm.foo(metadata !3, metadata !0)
	      call void @llvm.foo(metadata !{!3}, metadata !0)
	      ret void, !bar !2
	    }
	    !0 = !{!2}
	    !1 = !{i32* @global}
	    !2 = !{!3}
	    !3 = !{}

	I wrote an upgrade script that handled almost all of the tests in llvm and many of the tests in cfe (even handling many `CHECK` lines). I've attached it (or will attach it in a moment if you're speedy) to PR21532 to help everyone update their out-of-tree testcases.

	This is part of PR21532.

	llvm-svn: 224257
*	R600: Fix min/max matching problems with unordered compares	Matt Arsenault	2014-12-12	4	-4/+148
\| \| \| \| \| \| \| \|	The returned operand needs to be permuted for the unordered compares. Also fix incorrectly producing fmin_legacy / fmax_legacy for f64, which don't exist. llvm-svn: 224094
*	R600/SI: Don't promote f32 select to i32	Matt Arsenault	2014-12-12	4	-4/+4
\| \| \| \| \| \| \| \|	This is nice for the instruction patterns, but it complicates min / max matching. The select doesn't have the correct type and would require looking through the bitcasts for the real float operands. llvm-svn: 224092
*	Add target hook for whether it is profitable to reduce load widths	Matt Arsenault	2014-12-12	2	-15/+213
\| \| \| \| \| \| \| \|	Add an option to disable optimization to shrink truncated larger type loads to smaller type loads. On SI this prevents using scalar load instructions in some cases, since there are no scalar extloads. llvm-svn: 224084
*	R600/SI: Use unordered equal instructions	Matt Arsenault	2014-12-11	3	-10/+5
\| \| \| \|	llvm-svn: 224067