bcm5719-llvm - Project Ortega BCM5719 LLVM

	Commit message	Author	Age	Files	Lines
*	R600 -> AMDGPU rename	Tom Stellard	2015-06-13	1	-307/+0
\| \| \| \|	llvm-svn: 239657
*	R600: Switch to using generic min / max nodes.	Matt Arsenault	2015-06-09	1	-4/+0
\| \| \| \|	llvm-svn: 239377
*	Add target hook to allow merging stores of nonzero constants	Matt Arsenault	2015-05-24	1	-0/+4
\| \| \| \| \| \| \| \| \| \|	On GPU targets, materializing constants is cheap and stores are expensive, so only doing this for zero vectors was silly. Most of the new testcases aren't optimally merged, and are for later improvements. llvm-svn: 238108
*	R600/SI: Remove explicit m0 operand from v_interp instructions	Tom Stellard	2015-05-12	1	-0/+3
\| \| \| \| \| \| \|	Instead add m0 as an implicit operand. This helps avoid spills of the m0 register in some cases. llvm-svn: 237140
*	R600/SI: Remove explicit m0 operand from s_sendmsg	Tom Stellard	2015-05-12	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Instead add m0 as an implicit operand. This allows us to avoid using the M0Reg register class and eliminates a number of unnecessary spills when using s_sendmsg instructions. This impacts one shader in the shader-db: SGPRS: 48 -> 40 (-16.67 %); VGPRS: 112 -> 108 (-3.57 %); Code Size: 40132 -> 38796 (-3.33 %) bytes; LDS: 0 -> 0 (0.00 %) blocks; Scratch: 2048 -> 0 (-100.00 %) bytes per wave. llvm-svn: 237133
*	Change getTargetNodeName() to produce compiler warnings for missing cases, fix them	Matthias Braun	2015-05-07	1	-1/+1
\| \| \| \| \| \|	llvm-svn: 236775
*	Reinstate revisions r234755, r234759, r234760	Jan Vesely	2015-04-30	1	-0/+2
\| \| \| \| \| \| \| \| \|	changes: Don't apply on hexagon and NVPTX since they no longer claim to support UADDO/USUBO; add location to getConstant; drop comment about the ops being turned into expand. llvm-svn: 236240
*	Revert revisions r234755, r234759, r234760	Jan Vesely	2015-04-13	1	-2/+0
\| \| \| \| \| \| \| \| \| \| \|	Revert "Remove default in fully-covered switch (to fix Clang -Werror -Wcovered-switch-default)"; Revert "R600: Add carry and borrow instructions. Use them to implement UADDO/USUBO"; Revert "LegalizeDAG: Try to use Overflow operations when expanding ADD/SUB". Using overflow operations fails CodeGen/Generic/2011-07-07-ScheduleDAGCrash.ll on hexagon, nvptx, and r600. Revert while I investigate. llvm-svn: 234768
*	R600: Add carry and borrow instructions. Use them to implement UADDO/USUBO	Jan Vesely	2015-04-13	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	v2: tighten the sub64 tests. v3: rename to CARRY/BORROW. v4: fixup test cmdline; add known bits computation; use sign extend instead of sub 0,x; better add test. v5: remove redundant break; move lowering to separate functions; fix comments. Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewers: arsenm llvm-svn: 234759
*	R600: Use new fmad node.	Matt Arsenault	2015-02-20	1	-1/+0
\| \| \| \| \| \| \| \| \| \| \|	This enables a few useful combines that used to only use fma. Also since v_mad_f32 apparently does not support denormals, disable the existing cases that are custom handled if they are requested. llvm-svn: 230071
*	Reuse a bunch of cached subtargets and remove getSubtarget calls without a Function argument.	Eric Christopher	2015-01-30	1	-1/+1
\| \| \| \| \| \|	llvm-svn: 227638
*	R600: Try to use lower types for 64bit division if possible	Jan Vesely	2015-01-22	1	-1/+1
\| \| \| \| \| \| \| \|	v2: add and enable tests for SI. Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Matt Arsenault <Matthew.Arsenault@amd.com> llvm-svn: 226881
*	R600/SI: Custom lower fround	Matt Arsenault	2015-01-21	1	-0/+4
\| \| \| \| \| \| \| \| \|	This fixes it for SI. It also removes the pattern used previously for Evergreen for f32. I'm not sure if the new R600 output is better or not, but it uses 1 fewer instruction if BFI is available. llvm-svn: 226682
*	R600: Implement getRecipEstimate	Matt Arsenault	2015-01-13	1	-0/+3
\| \| \| \| \| \| \| \| \|	This requires a new hook to prevent expanding sqrt in terms of rsqrt and reciprocal. v_rcp_f32, v_rsq_f32, and v_sqrt_f32 are all the same rate, so this expansion would just double the number of instructions and cycles. llvm-svn: 225828
*	R600: Implement getRsqrtEstimate	Matt Arsenault	2015-01-13	1	-0/+5
\| \| \| \| \| \| \| \|	Only do this for f32 since I'm unclear on both what this is expecting for the refinement steps in terms of accuracy, and what the f64 instruction actually provides. llvm-svn: 225827
*	R600: Make cttz / ctlz cheap to speculate	Matt Arsenault	2015-01-13	1	-0/+3
\| \| \| \| \| \| \| \| \|	Speculating things is generally good. SI+ has instructions for these for 32-bit values. This is still probably better even with the expansion for 64-bit values, although it is odd that this callback doesn't have the size as a parameter. llvm-svn: 225822
*	R600/SI: Add class intrinsic	Matt Arsenault	2015-01-06	1	-0/+1
\| \| \| \|	llvm-svn: 225305
*	R600: Fix min/max matching problems with unordered compares	Matt Arsenault	2014-12-12	1	-8/+8
\| \| \| \| \| \| \| \|	The returned operand needs to be permuted for the unordered compares. Also fix incorrectly producing fmin_legacy / fmax_legacy for f64, which don't exist. llvm-svn: 224094
*	Add target hook for whether it is profitable to reduce load widths	Matt Arsenault	2014-12-12	1	-0/+3
\| \| \| \| \| \| \| \|	Add an option to disable optimization to shrink truncated larger type loads to smaller type loads. On SI this prevents using scalar load instructions in some cases, since there are no scalar extloads. llvm-svn: 224084
*	R600: Factor i64 UDIVREM lowering into its own function	Tom Stellard	2014-11-15	1	-0/+2
\| \| \| \| \| \| \| \|	This is so it could potentially be used by SI. However, the current implementation does not always produce correct results, so the IntegerDivisionPass is being used instead. llvm-svn: 222072
*	R600/SI: Combine min3/max3 instructions	Matt Arsenault	2014-11-14	1	-0/+6
\| \| \| \|	llvm-svn: 222032
*	R600/SI: Match integer min / max instructions	Matt Arsenault	2014-11-14	1	-8/+17
\| \| \| \|	llvm-svn: 222015
*	R600/SI: Fix fmin_legacy / fmax_legacy matching for SI	Matt Arsenault	2014-11-13	1	-3/+10
\| \| \| \| \| \|	select_cc is expanded on SI, so this was never matched. llvm-svn: 221941
*	R600: Remove dead function	Matt Arsenault	2014-10-16	1	-3/+0
\| \| \| \|	llvm-svn: 219879
*	R600/SI: Custom lower f64 -> i64 conversions	Matt Arsenault	2014-10-03	1	-0/+4
\| \| \| \|	llvm-svn: 219038
*	R600: Custom lower [s\|u]int_to_fp for i64 -> f64	Matt Arsenault	2014-10-03	1	-0/+2
\| \| \| \|	llvm-svn: 219037
*	R600: Custom lower frem	Matt Arsenault	2014-09-10	1	-0/+1
\| \| \| \|	llvm-svn: 217553
*	Add override to overridden virtual methods, remove virtual keywords.	Benjamin Kramer	2014-09-03	1	-4/+2
\| \| \| \| \| \|	No functionality change. Changes made by clang-tidy + some manual cleanup. llvm-svn: 217028
*	R600/SI: Use mad for fsub + fmul	Matt Arsenault	2014-08-29	1	-0/+1
\| \| \| \| \| \| \|	We can use a negate source modifier to match this for fsub. llvm-svn: 216735
*	R600/SI: Add intrinsic for ldexp	Matt Arsenault	2014-08-15	1	-0/+1
\| \| \| \|	llvm-svn: 215734
*	Canonicalize header guards into a common format.	Benjamin Kramer	2014-08-13	1	-3/+3
\| \| \| \| \| \| \| \| \| \|	Add header guards to files that were missing guards. Remove #endif comments as they don't seem common in LLVM (we can easily add them back if we decide they're useful). Changes made by clang-tidy with minor tweaks. llvm-svn: 215558
*	R600: Use optimized 24bit path in udivrem	Jan Vesely	2014-08-12	1	-1/+1
\| \| \| \| \| \| \| \| \|	v2: drop enum keyword; use correct extension mode; don't bother computing the sign in unsigned case. Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 215462
*	R600: Remove unused code.	Jan Vesely	2014-08-12	1	-6/+0
\| \| \| \| \|	Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 215461
*	R600: Use i24 optimized path for SREM	Jan Vesely	2014-08-12	1	-1/+1
\| \| \| \| \| \| \| \| \|	v2: add tests; rename LowerSDIV24 to LowerSDIVREM24; handle the rem part in this function. Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 215460
*	R600: Add new functions for splitting vector loads and stores.	Matt Arsenault	2014-07-24	1	-2/+12
\| \| \| \| \| \|	These will be used in future patches and shouldn't change anything yet. llvm-svn: 213877
*	R600/SI: Use scratch memory for large private arrays	Tom Stellard	2014-07-21	1	-7/+8
\| \| \| \|	llvm-svn: 213551
*	R600/SI: Store constant initializer data in constant memory	Tom Stellard	2014-07-21	1	-2/+4
\| \| \| \| \| \| \| \| \| \| \| \|	This implements a solution for constant initializers suggested by Vadim Girlin, where we store the data after the shader code and then use the S_GETPC instruction to compute its address. This saves us the trouble of creating a new buffer for constant data and then having to pass the pointer to the kernel via user SGPRs or the input buffer. llvm-svn: 213530
*	R600: Add dag combine for copy of an illegal type.	Matt Arsenault	2014-07-15	1	-0/+1
\| \| \| \| \| \| \| \| \|	This helps avoid redundant instructions to unpack and repack the vectors. Ideally we could recognize that pattern and eliminate it. Currently v4i8 and other small element type vectors are scalarized, so this has the added bonus of avoiding that. llvm-svn: 213031
*	R600: Move mul combine to separate function	Matt Arsenault	2014-06-30	1	-0/+2
\| \| \| \|	llvm-svn: 212052
*	Silencing a warning about isZExtFree hiding an inherited virtual function.	Aaron Ballman	2014-06-26	1	-0/+1
\| \| \| \| \| \|	No functional change intended. llvm-svn: 211783
*	R600: Fix inconsistency in rsq instructions.	Matt Arsenault	2014-06-24	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \| \|	R600 was using a clamped version of rsq, but SI was not. Add a new rsq_clamped intrinsic and use them consistently. It's unclear to me from the documentation what behavior the R600 instructions have, so I assume they have the legacy behavior described by the SI documents. For R600, use RECIPSQRT_IEEE for both llvm.AMDGPU.rsq.legacy and llvm.AMDGPU.rsq. R600 also has RECIPSQRT_FF; I'm not sure how it fits in here. llvm-svn: 211637
*	R600: Remove DIV_INF	Matt Arsenault	2014-06-24	1	-1/+0
\| \| \| \| \| \| \|	This corresponded to an AMDIL instruction for which there is a 2-instruction equivalent. llvm-svn: 211616
*	R600: Remove AMDILISelLowering	Matt Arsenault	2014-06-23	1	-5/+1
\| \| \| \|	llvm-svn: 211519
*	R600: Use LowerSDIVREM for i64 node replace	Jan Vesely	2014-06-22	1	-1/+1
\| \| \| \| \| \| \| \|	v2: move div/rem node replacement to R600ISelLowering; make lowerSDIVREM protected. Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 211478
*	R600: Implement custom SDIVREM.	Jan Vesely	2014-06-22	1	-0/+1
\| \| \| \| \| \| \| \| \| \|	Instead of separate SDIV/SREM. SDIV used UDIV, which in turn used UDIVREM anyway. SREM used SDIV(UDIV->UDIVREM)+MUL+SUB; using UDIVREM directly is more efficient. v2: Don't use all caps names. Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 211477
*	R600/SI: Add intrinsics for various math instructions.	Matt Arsenault	2014-06-19	1	-0/+12
\| \| \| \| \| \| \| \|	These will be used for custom lowering and for library implementations of various math functions, so it's useful to expose these as builtins. llvm-svn: 211247
*	R600: Handle fnearbyint	Matt Arsenault	2014-06-18	1	-0/+1
\| \| \| \| \| \| \| \|	The difference from rint isn't really relevant here, so treat them as equivalent. OpenCL doesn't have nearbyint, so this is sort of pointless other than for completeness. llvm-svn: 211229
*	R600/SI: Add intrinsics for brev instructions	Matt Arsenault	2014-06-18	1	-0/+1
\| \| \| \|	llvm-svn: 211187
*	R600: Implement f64 ftrunc, ffloor and fceil.	Matt Arsenault	2014-06-18	1	-0/+4
\| \| \| \| \| \|	CI has instructions for these, so this fixes them for older hardware. llvm-svn: 211183
*	R600: Custom lower f64 frint for pre-CI	Matt Arsenault	2014-06-18	1	-0/+1
\| \| \| \|	llvm-svn: 211182