path: root/llvm/lib/Target/R600/AMDGPUISelLowering.h
Commit message | Author | Age | Files | Lines
* R600 -> AMDGPU renameTom Stellard2015-06-131-307/+0
| | | | llvm-svn: 239657
* R600: Switch to using generic min / max nodes.Matt Arsenault2015-06-091-4/+0
| | | | llvm-svn: 239377
* Add target hook to allow merging stores of nonzero constantsMatt Arsenault2015-05-241-0/+4
| | | | | | | | | | On GPU targets, materializing constants is cheap and stores are expensive, so only doing this for zero vectors was silly. Most of the new testcases aren't optimally merged, and are for later improvements. llvm-svn: 238108
* R600/SI: Remove explicit m0 operand from v_interp instructionsTom Stellard2015-05-121-0/+3
| | | | | | | Instead add m0 as an implicit operand. This helps avoid spills of the m0 register in some cases. llvm-svn: 237140
* R600/SI: Remove explicit m0 operand from s_sendmsgTom Stellard2015-05-121-0/+1
| | | | | | | | | | | | | | | Instead add m0 as an implicit operand. This allows us to avoid using the M0Reg register class and eliminates a number of unnecessary spills when using s_sendmsg instructions. This impacts one shader in the shader-db: SGPRS: 48 -> 40 (-16.67 %) VGPRS: 112 -> 108 (-3.57 %) Code Size: 40132 -> 38796 (-3.33 %) bytes LDS: 0 -> 0 (0.00 %) blocks Scratch: 2048 -> 0 (-100.00 %) bytes per wave llvm-svn: 237133
* Change getTargetNodeName() to produce compiler warnings for missing cases, ↵Matthias Braun2015-05-071-1/+1
| | | | | | fix them llvm-svn: 236775
* Reinstate revisions r234755, r234759, r234760Jan Vesely2015-04-301-0/+2
| | | | | | | | | changes: Don't apply on hexagon and NVPTX since they no longer claim to support UADDO/USUBO Add location to getConstant Drop comment about the ops being turned into expand llvm-svn: 236240
* Revert revisions r234755, r234759, r234760Jan Vesely2015-04-131-2/+0
| | | | | | | | | | | Revert "Remove default in fully-covered switch (to fix Clang -Werror -Wcovered-switch-default)" Revert "R600: Add carry and borrow instructions. Use them to implement UADDO/USUBO" Revert "LegalizeDAG: Try to use Overflow operations when expanding ADD/SUB" Using overflow operations fails CodeGen/Generic/2011-07-07-ScheduleDAGCrash.ll on hexagon, nvptx, and r600. Revert while I investigate. llvm-svn: 234768
* R600: Add carry and borrow instructions. Use them to implement UADDO/USUBOJan Vesely2015-04-131-0/+2
| | | | | | | | | | | | | | | | v2: tighten the sub64 tests v3: rename to CARRY/BORROW v4: fixup test cmdline add known bits computation use sign extend instead of sub 0,x better add test v5: remove redundant break move lowering to separate functions fix comments Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewers: arsenm llvm-svn: 234759
* R600: Use new fmad node.Matt Arsenault2015-02-201-1/+0
| | | | | | | | | | | This enables a few useful combines that used to only use fma. Also since v_mad_f32 apparently does not support denormals, disable the existing cases that are custom handled if they are requested. llvm-svn: 230071
* Reuse a bunch of cached subtargets and remove getSubtarget callsEric Christopher2015-01-301-1/+1
| | | | | | without a Function argument. llvm-svn: 227638
* R600: Try to use lower types for 64bit division if possibleJan Vesely2015-01-221-1/+1
| | | | | | | | v2: add and enable tests for SI Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Matt Arsenault <Matthew.Arsenault@amd.com> llvm-svn: 226881
* R600/SI: Custom lower froundMatt Arsenault2015-01-211-0/+4
| | | | | | | This fixes it for SI. It also removes the pattern used previously for Evergreen for f32. I'm not sure if the new R600 output is better or not, but it uses 1 fewer instruction if BFI is available. llvm-svn: 226682
* R600: Implement getRecipEstimateMatt Arsenault2015-01-131-0/+3
| | | | | | | | | This requires a new hook to prevent expanding sqrt in terms of rsqrt and reciprocal. v_rcp_f32, v_rsq_f32, and v_sqrt_f32 are all the same rate, so this expansion would just double the number of instructions and cycles. llvm-svn: 225828
* R600: Implement getRsqrtEstimateMatt Arsenault2015-01-131-0/+5
| | | | | | | | Only do for f32 since I'm unclear on both what this is expecting for the refinement steps in terms of accuracy, and what f64 instruction actually provides. llvm-svn: 225827
* R600: Make cttz / ctlz cheap to speculateMatt Arsenault2015-01-131-0/+3
| | | | | | | | | Speculating things is generally good. SI+ has instructions for these for 32-bit values. This is still probably better even with the expansion for 64-bit values, although it is odd that this callback doesn't have the size as a parameter. llvm-svn: 225822
* R600/SI: Add class intrinsicMatt Arsenault2015-01-061-0/+1
| | | | llvm-svn: 225305
* R600: Fix min/max matching problems with unordered comparesMatt Arsenault2014-12-121-8/+8
| | | | | | | | The returned operand needs to be permuted for the unordered compares. Also fix incorrectly producing fmin_legacy / fmax_legacy for f64, which don't exist. llvm-svn: 224094
* Add target hook for whether it is profitable to reduce load widthsMatt Arsenault2014-12-121-0/+3
| | | | | | | | Add an option to disable optimization to shrink truncated larger type loads to smaller type loads. On SI this prevents using scalar load instructions in some cases, since there are no scalar extloads. llvm-svn: 224084
* R600: Factor i64 UDIVREM lowering into its own functionTom Stellard2014-11-151-0/+2
| | | | | | | | This is so it could potentially be used by SI. However, the current implementation does not always produce correct results, so the IntegerDivisionPass is being used instead. llvm-svn: 222072
* R600/SI: Combine min3/max3 instructionsMatt Arsenault2014-11-141-0/+6
| | | | llvm-svn: 222032
* R600/SI: Match integer min / max instructionsMatt Arsenault2014-11-141-8/+17
| | | | llvm-svn: 222015
* R600/SI: Fix fmin_legacy / fmax_legacy matching for SIMatt Arsenault2014-11-131-3/+10
| | | | | | select_cc is expanded on SI, so this was never matched. llvm-svn: 221941
* R600: Remove dead functionMatt Arsenault2014-10-161-3/+0
| | | | llvm-svn: 219879
* R600/SI: Custom lower f64 -> i64 conversionsMatt Arsenault2014-10-031-0/+4
| | | | llvm-svn: 219038
* R600: Custom lower [s|u]int_to_fp for i64 -> f64Matt Arsenault2014-10-031-0/+2
| | | | llvm-svn: 219037
* R600: Custom lower fremMatt Arsenault2014-09-101-0/+1
| | | | llvm-svn: 217553
* Add override to overridden virtual methods, remove virtual keywords.Benjamin Kramer2014-09-031-4/+2
| | | | | | No functionality change. Changes made by clang-tidy + some manual cleanup. llvm-svn: 217028
* R600/SI: Use mad for fsub + fmulMatt Arsenault2014-08-291-0/+1
| | | | | | | We can use a negate source modifier to match this for fsub. llvm-svn: 216735
* R600/SI: Add intrinsic for ldexpMatt Arsenault2014-08-151-0/+1
| | | | llvm-svn: 215734
* Canonicalize header guards into a common format.Benjamin Kramer2014-08-131-3/+3
| | | | | | | | | | Add header guards to files that were missing guards. Remove #endif comments as they don't seem common in LLVM (we can easily add them back if we decide they're useful) Changes made by clang-tidy with minor tweaks. llvm-svn: 215558
* R600: Use optimized 24bit path in udivremJan Vesely2014-08-121-1/+1
| | | | | | | v2: drop enum keyword use correct extension mode don't bother computing the sign in unsigned case Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 215462
* R600: Remove unused code.Jan Vesely2014-08-121-6/+0
| | | | | Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 215461
* R600: Use i24 optimized path for SREMJan Vesely2014-08-121-1/+1
| | | | | | | | | v2: add tests rename LowerSDIV24 to LowerSDIVREM24 handle the rem part in this function Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 215460
* R600: Add new functions for splitting vector loads and stores.Matt Arsenault2014-07-241-2/+12
| | | | | | These will be used in future patches and shouldn't change anything yet. llvm-svn: 213877
* R600/SI: Use scratch memory for large private arraysTom Stellard2014-07-211-7/+8
| | | | llvm-svn: 213551
* R600/SI: Store constant initializer data in constant memoryTom Stellard2014-07-211-2/+4
| | | | | | | | | | This implements a solution for constant initializers suggested by Vadim Girlin, where we store the data after the shader code and then use the S_GETPC instruction to compute its address. This saves us the trouble of creating a new buffer for constant data and then having to pass the pointer to the kernel via user SGPRs or the input buffer. llvm-svn: 213530
* R600: Add dag combine for copy of an illegal type.Matt Arsenault2014-07-151-0/+1
| | | | | | | | | This helps avoid redundant instructions to unpack, and repack the vectors. Ideally we could recognize that pattern and eliminate it. Currently v4i8 and other small element type vectors are scalarized, so this has the added bonus of avoiding that. llvm-svn: 213031
* R600: Move mul combine to separate functionMatt Arsenault2014-06-301-0/+2
| | | | llvm-svn: 212052
* Silencing a warning about isZExtFree hiding an inherited virtual function. ↵Aaron Ballman2014-06-261-0/+1
| | | | | | No functional change intended. llvm-svn: 211783
* R600: Fix inconsistency in rsq instructions.Matt Arsenault2014-06-241-0/+2
| | | | | | | | | | | | | R600 was using a clamped version of rsq, but SI was not. Add a new rsq_clamped intrinsic and use them consistently. It's unclear to me from the documentation what behavior the R600 instructions have, so I assume they have the legacy behavior described by the SI documents. For R600, use RECIPSQRT_IEEE for both llvm.AMDGPU.rsq.legacy and llvm.AMDGPU.rsq. R600 also has RECIPSQRT_FF, which I'm not sure how it fits in here. llvm-svn: 211637
* R600: Remove DIV_INFMatt Arsenault2014-06-241-1/+0
| | | | | | | This corresponded to an amdil instruction which there is a 2 instruction equivalent for. llvm-svn: 211616
* R600: Remove AMDILISelLoweringMatt Arsenault2014-06-231-5/+1
| | | | llvm-svn: 211519
* R600: Use LowerSDIVREM for i64 node replaceJan Vesely2014-06-221-1/+1
| | | | | | | | v2: move div/rem node replacement to R600ISelLowering make lowerSDIVREM protected Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 211478
* R600: Implement custom SDIVREM.Jan Vesely2014-06-221-0/+1
| | | | | | | | | | Instead of separate SDIV/SREM. SDIV used UDIV which in turn used UDIVREM anyway. SREM used SDIV(UDIV->UDIVREM)+MUL+SUB, using UDIVREM directly is more efficient. v2: Don't use all caps names Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 211477
* R600/SI: Add intrinsics for various math instructions.Matt Arsenault2014-06-191-0/+12
| | | | | | | | These will be used for custom lowering and for library implementations of various math functions, so it's useful to expose these as builtins. llvm-svn: 211247
* R600: Handle fnearbyintMatt Arsenault2014-06-181-0/+1
| | | | | | | | The difference from rint isn't really relevant here, so treat them as equivalent. OpenCL doesn't have nearbyint, so this is sort of pointless other than for completeness. llvm-svn: 211229
* R600/SI: Add intrinsics for brev instructionsMatt Arsenault2014-06-181-0/+1
| | | | llvm-svn: 211187
* R600: Implement f64 ftrunc, ffloor and fceil.Matt Arsenault2014-06-181-0/+4
| | | | | | CI has instructions for these, so this fixes them for older hardware. llvm-svn: 211183
* R600: Custom lower f64 frint for pre-CIMatt Arsenault2014-06-181-0/+1
| | | | llvm-svn: 211182