summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Target/R600/R600ISelLowering.cpp
Commit message (Collapse)AuthorAgeFilesLines
* R600 -> AMDGPU renameTom Stellard2015-06-131-2286/+0
| | | | llvm-svn: 239657
* R600: Rely on TypeLegalizer to use divrem instead of div/remJan Vesely2015-05-271-43/+0
| | | | | reviewer: tstellardAMD llvm-svn: 238337
* R600: Use SIGN_EXTEND_INREG for SEXT loadsJan Vesely2015-05-261-6/+3
| | | | | | Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Matt Arsenault <Matthew.Arsenault@amd.com> llvm-svn: 238229
* Reinstate revisions r234755, r234759, r234760Jan Vesely2015-04-301-2/+28
| | | | | | | | | changes: Don't apply on hexagon and NVPTX since they no longer claim to support UADDO/USUBO Add location to getConstant Drop comment about the ops being turned into expand llvm-svn: 236240
* Reapply r235977 "[DebugInfo] Add debug locations to constant SD nodes"Sergey Dmitrouk2015-04-281-96/+106
| | | | | | | | | | | | | | | | | | | | | | | | | [DebugInfo] Add debug locations to constant SD nodes This adds debug location to constant nodes of Selection DAG and updates all places that create constants to pass debug locations (see PR13269). Can't guarantee that all locations are correct, but in a lot of cases choice is obvious, so most of them should be. At least all tests pass. Tests for these changes do not cover everything, instead just check it for SDNodes, ARM and AArch64 where it's easy to get incorrect locations on constants. This is not complete fix as FastISel contains workaround for wrong debug locations, which drops locations from instructions on processing constants, but there isn't currently a way to use debug locations from constants there as llvm::Constant doesn't cache it (yet). Although this is a bit different issue, not directly related to these changes. Differential Revision: http://reviews.llvm.org/D9084 llvm-svn: 235989
* Revert "[DebugInfo] Add debug locations to constant SD nodes"Daniel Jasper2015-04-281-106/+96
| | | | | | | This breaks a test: http://bb.pgr.jp/builders/cmake-llvm-x86_64-linux/builds/23870 llvm-svn: 235987
* [DebugInfo] Add debug locations to constant SD nodesSergey Dmitrouk2015-04-281-96/+106
| | | | | | | | | | | | | | | | | | | | | | | This adds debug location to constant nodes of Selection DAG and updates all places that create constants to pass debug locations (see PR13269). Can't guarantee that all locations are correct, but in a lot of cases choice is obvious, so most of them should be. At least all tests pass. Tests for these changes do not cover everything, instead just check it for SDNodes, ARM and AArch64 where it's easy to get incorrect locations on constants. This is not complete fix as FastISel contains workaround for wrong debug locations, which drops locations from instructions on processing constants, but there isn't currently a way to use debug locations from constants there as llvm::Constant doesn't cache it (yet). Although this is a bit different issue, not directly related to these changes. Differential Revision: http://reviews.llvm.org/D9084 llvm-svn: 235977
* Revert revisions r234755, r234759, r234760Jan Vesely2015-04-131-29/+2
| | | | | | | | | | | Revert "Remove default in fully-covered switch (to fix Clang -Werror -Wcovered-switch-default)" Revert "R600: Add carry and borrow instructions. Use them to implement UADDO/USUBO" Revert "LegalizeDAG: Try to use Overflow operations when expanding ADD/SUB" Using overflow operations fails CodeGen/Generic/2011-07-07-ScheduleDAGCrash.ll on hexagon, nvptx, and r600. Revert while I investigate. llvm-svn: 234768
* R600: Add carry and borrow instructions. Use them to implement UADDO/USUBOJan Vesely2015-04-131-2/+29
| | | | | | | | | | | | | | | | v2: tighten the sub64 tests v3: rename to CARRY/BORROW v4: fixup test cmdline add known bits computation use sign extend instead of sub 0,x better add test v5: remove redundant break move lowering to separate functions fix comments Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewers: arsenm llvm-svn: 234759
* Reduce dyn_cast<> to isa<> or cast<> where possible.Benjamin Kramer2015-04-101-2/+2
| | | | | | No functional change intended. llvm-svn: 234586
* Fix typo in comment.Nico Weber2015-03-251-1/+1
| | | | llvm-svn: 233226
* R600/SI: Expand fract to floor, then only select V_FRACT on CIMarek Olsak2015-03-241-0/+4
| | | | | | | | | V_FRACT is buggy on SI. R600-specific code is left intact. v2: drop the multiclass, use complex VOP3 patterns llvm-svn: 233075
* DataLayout is mandatory, update the API to reflect it with references.Mehdi Amini2015-03-101-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Now that the DataLayout is a mandatory part of the module, let's start cleaning the codebase. This patch is a first attempt at doing that. This patch is not exactly NFC as for instance some places were passing a nullptr instead of the DataLayout, possibly just because there was a default value on the DataLayout argument to many functions in the API. Even though it is not purely NFC, there is no change in the validation. I turned as many pointer to DataLayout to references, this helped figuring out all the places where a nullptr could come up. I had initially a local version of this patch broken into over 30 independant, commits but some later commit were cleaning the API and touching part of the code modified in the previous commits, so it seemed cleaner without the intermediate state. Test Plan: Reviewers: echristo Subscribers: llvm-commits From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 231740
* Remove an argument-less call to getSubtargetImpl from TargetLoweringBase.Eric Christopher2015-02-261-1/+1
| | | | | | | | | This required plumbing a TargetRegisterInfo through computeRegisterProperties and into findRepresentativeClass which uses it for register class iteration. This required passing a subtarget into a few target specific initializations of TargetLowering. llvm-svn: 230583
* Prefer SmallVector::append/insert over push_back loops.Benjamin Kramer2015-02-171-7/+2
| | | | | | Same functionality, but hoists the vector growth out of the loop. llvm-svn: 229500
* AArch64: Safely handle the incoming sret call argument.Andrew Trick2015-02-161-1/+1
| | | | | | | | | | | | | | | | | | This adds a safe interface to the machine independent InputArg struct for accessing the index of the original (IR-level) argument. When a non-native return type is lowered, we generate the hidden machine-level sret argument on-the-fly. Before this fix, we were representing this argument as OrigArgIndex == 0, which is an outright lie. In particular this crashed in the AArch64 backend where we actually try to access the type of the original argument. Now we use a sentinel value for machine arguments that have no original argument index. AArch64, ARM, Mips, and PPC now check for this case before accessing the original argument. Fixes <rdar://19792160> Null pointer assertion in AArch64TargetLowering llvm-svn: 229413
* Reuse a bunch of cached subtargets and remove getSubtarget callsEric Christopher2015-01-301-11/+10
| | | | | | without a Function argument. llvm-svn: 227638
* [SelectionDAG] Allow targets to specify legality of extloads' resultAhmed Bougacha2015-01-081-9/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | type (in addition to the memory type). The *LoadExt* legalization handling used to only have one type, the memory type. This forced users to assume that as long as the extload for the memory type was declared legal, and the result type was legal, the whole extload was legal. However, this isn't always the case. For instance, on X86, with AVX, this is legal: v4i32 load, zext from v4i8 but this isn't: v4i64 load, zext from v4i8 Whereas v4i64 is (arguably) legal, even without AVX2. Note that the same thing was done a while ago for truncstores (r46140), but I assume no one needed it yet for extloads, so here we go. Calls to getLoadExtAction were changed to add the value type, found manually in the surrounding code. Calls to setLoadExtAction were mechanically changed, by wrapping the call in a loop, to match previous behavior. The loop iterates over the MVT subrange corresponding to the memory type (FP vectors, etc...). I also pulled neighboring setTruncStoreActions into some of the loops; those shouldn't make a difference, as the additional types are illegal. (e.g., i128->i1 truncstores on PPC.) No functional change intended. Differential Revision: http://reviews.llvm.org/D6532 llvm-svn: 225421
* R600: Fix min/max matching problems with unordered comparesMatt Arsenault2014-12-121-0/+7
| | | | | | | | The returned operand needs to be permuted for the unordered compares. Also fix incorrectly producing fmin_legacy / fmax_legacy for f64, which don't exist. llvm-svn: 224094
* R600/SI: Use ZeroOrNegativeOneBooleanContentMatt Arsenault2014-11-261-2/+0
| | | | | | | | | | | | | | This sort of doesn't matter since the setcc type is i1, but this previously was using the default UndefinedBooleanContent. This makes it more consistent with R600. This enables more optimizations which typically give up on UndefinedBooleanContent. For example, there is already a special case target DAG combine for setcc + sext which can be eliminated in favor of what the generic DAG combiner can do if it assumes boolean values are sign extended. Since -1 is an inline immediate, using it is basically free and the backend already uses it when a boolean value is needed in a wider type. llvm-svn: 222850
* R600: Fix extloads of i1 on R600/EvergreenMatt Arsenault2014-11-231-0/+5
| | | | llvm-svn: 222631
* R600: Factor i64 UDIVREM lowering into its own fuctionTom Stellard2014-11-151-68/+1
| | | | | | | | This is so it could potentially be used by SI. However, the current implementation does not always produce correct results, so the IntegerDivisionPass is being used instead. llvm-svn: 222072
* Reapply "R600: Add new intrinsic to read work dimensions"Jan Vesely2014-10-141-3/+8
| | | | | | | | | This effectively reverts revert 219707. After fixing the test to work with new function name format and renamed intrinsic. Reviewed-by: Tom Stellard <tom@stellard.net> Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 219710
* Revert "R600: Add new intrinsic to read work dimensions"Rafael Espindola2014-10-141-8/+3
| | | | | | | | This reverts commit r219705. CodeGen/R600/work-item-intrinsics.ll was failing on linux. llvm-svn: 219707
* R600: Add new intrinsic to read work dimensionsJan Vesely2014-10-141-3/+8
| | | | | | | | | | | | | | v2: Add SI lowering Add test v3: Place work dimensions after the kernel arguments. v4: Calculate offset while lowering arguments v5: rebase v6: change prefix to AMDGPU Reviewed-by: Tom Stellard <tom@stellard.net> Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 219705
* R600: Add cmpxchg instruction for evergreenAaron Watry2014-09-111-1/+4
| | | | | | | | | | | | | | | | | Refactored the R600_LDS_1A2D class a bit to get it to actually work. It seemed to be previously unused and broken. We also have to disable the conversion to the noret variant for now in R600ISelLowering because the getLDSNoRetOp method only handles 1A1D LDS ops. Someone can feel free to modify the AMDGPU::getLDSNoRetOp method to work for more than 1A1D variants of LDS operations. It's being left as a future TODO for now. Signed-off-by: Aaron Watry <awatry at gmail.com> Reviewed-by: Matt Arsenault <matthew.arsenault@amd.com> llvm-svn: 217596
* R600: Correctly set the src value offset for scalarized kernel argsMatt Arsenault2014-08-131-11/+29
| | | | | | | | | | This for some reason fixes v1i64 kernel arguments on pre-SI. This currently breaks some other cases in the kernel-args.ll test for R600, but I'm not particularly confident in the new output. VTX_READ_* are not used for some of the scalarized cases, and the code reading from the constant buffer doesn't make much sense to me. llvm-svn: 215564
* Remove the target machine from CCState. Previously it was only usedEric Christopher2014-08-061-2/+2
| | | | | | | | | to get the subtarget and that's accessible from the MachineFunction now. This helps clear the way for smaller changes where we getting a subtarget will require passing in a MachineFunction/Function as well. llvm-svn: 214988
* Have MachineFunction cache a pointer to the subtarget to make lookupsEric Christopher2014-08-051-7/+7
| | | | | | | | | | | shorter/easier and have the DAG use that to do the same lookup. This can be used in the future for TargetMachine based caching lookups from the MachineFunction easily. Update the MIPS subtarget switching machinery to update this pointer at the same time it runs. llvm-svn: 214838
* Remove the TargetMachine forwards for TargetSubtargetInfo basedEric Christopher2014-08-041-12/+12
| | | | | | information and update all callers. No functional change. llvm-svn: 214781
* Revert "R600: Move code for generating REGISTER_LOAD into R600ISelLowering.cpp"Tom Stellard2014-08-011-41/+0
| | | | | | | | This reverts commit r214566. I did not mean to commit this yet. llvm-svn: 214572
* R600: Move code for generating REGISTER_LOAD into R600ISelLowering.cppTom Stellard2014-08-011-0/+41
| | | | | | | SI doesn't use REGISTER_LOAD anymore, but it was still hitting this code path for 8-bit and 16-bit private loads. llvm-svn: 214566
* Make sure no loads resulting from load->switch DAGCombine are marked invariantLouis Gerbarg2014-07-311-1/+2
| | | | | | | | | | | | | | Currently when DAGCombine converts loads feeding a switch into a switch of addresses feeding a load the new load inherits the isInvariant flag of the left side. This is incorrect since invariant loads can be reordered in cases where it is illegal to reoarder normal loads. This patch adds an isInvariant parameter to getExtLoad() and updates all call sites to pass in the data if they have it or false if they don't. It also changes the DAGCombine to use that data to make the right decision when creating the new load. llvm-svn: 214449
* R600: Add new functions for splitting vector loads and stores.Matt Arsenault2014-07-241-1/+1
| | | | | | These will be used in future patches and shouldn't change anything yet. llvm-svn: 213877
* R600/SI: Store constant initializer data in constant memoryTom Stellard2014-07-211-0/+14
| | | | | | | | | | | | This implements a solution for constant initializers suggested by Vadim Girlin, where we store the data after the shader code and then use the S_GETPC instruction to compute its address. This saves use the trouble of creating a new buffer for constant data and then having to pass the pointer to the kernel via user SGPRs or the input buffer. llvm-svn: 213530
* R600: Make ShaderType privateMatt Arsenault2014-07-131-1/+1
| | | | llvm-svn: 212896
* R600: Implement float to long/ulongJan Vesely2014-07-101-1/+15
| | | | | | | | | | | | | | Use alg. from LegalizeDAG.cpp Move Expand setting to SIISellowering v2: Extend existing tests instead of creating new ones v3: use separate LowerFPTOSINT function v4: use TargetLowering::expandFP_TO_SINT add comment about using FP_TO_SINT for uints Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Tom Stellard <tom@stellard.net> llvm-svn: 212773
* R600: Fix mishandling of load / store chains.Matt Arsenault2014-07-071-1/+8
| | | | | | | | Fixes various bugs with reordering loads and stores. Scalarized vector loads weren't collecting the chains at all. llvm-svn: 212473
* Add ops() method to SDNode that returns an ArrayRef<SDUse>. Use it to ↵Craig Topper2014-06-291-3/+2
| | | | | | simplify some code. llvm-svn: 211993
* R600: Move load/store ReplaceNodeResults to common code.Matt Arsenault2014-06-271-14/+0
| | | | | | Future patches will want to custom lower loads on SI. llvm-svn: 211848
* R600: Fix inconsistency in rsq instructions.Matt Arsenault2014-06-241-0/+3
| | | | | | | | | | | | | R600 was using a clamped version of rsq, but SI was not. Add a new rsq_clamped intrinsic and use them consistently. It's unclear to me from the documentation what behavior the R600 instructions have, so I assume they have the legacy behavior described by the SI documents. For R600, use RECIPSQRT_IEEE for both llvm.AMDGPU.rsq.legacy and llvm.AMDGPU.rsq. R600 also has RECIPSQRT_FF, which I'm not sure how it fits in here. llvm-svn: 211637
* R600: Remove AMDILISelLoweringMatt Arsenault2014-06-231-0/+11
| | | | llvm-svn: 211519
* R600: Move add/sub with overflow out of AMDILISelLoweringMatt Arsenault2014-06-231-0/+8
| | | | | | Add more tests for these. llvm-svn: 211517
* R600/SI: Handle i64 sub.Matt Arsenault2014-06-231-0/+2
| | | | | | We can handle it the same way as add llvm-svn: 211514
* R600: Rename AMDIL fileMatt Arsenault2014-06-231-1/+1
| | | | llvm-svn: 211512
* R600: Use LowerSDIVREM for i64 node replaceJan Vesely2014-06-221-1/+119
| | | | | | | | v2: move div/rem node replacement to R600ISelLowering make lowerSDIVREM protected Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 211478
* R600: Implement 64bit SRAJan Vesely2014-06-181-5/+7
| | | | | | | v2: Use capitalized variable name Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 211159
* R600: Implement 64bit SRLJan Vesely2014-06-181-0/+40
| | | | | | | v2: use C++ style comment Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 211158
* R600: Implement 64bit SHLJan Vesely2014-06-181-0/+41
| | | | | | | v2: Use c++ style comment Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 211157
* R600: Use LDS and vectors for private memoryTom Stellard2014-06-171-0/+62
| | | | llvm-svn: 211110
OpenPOWER on IntegriCloud