path: root/llvm/lib/Target/AMDGPU/R600ISelLowering.cpp
Commit message, Author, Age, Files, Lines
...
* AMDGPU: Refactor exp instructionsMatt Arsenault2016-12-051-3/+3
| | | | | | | | | | | | | | | Structure the definitions a bit more like the other classes. The main change here is to split EXP with the done bit set to a separate opcode, so we can set mayLoad = 1 so that it won't be reordered before the other exp stores, since this has the special constraint that if the done bit is set then this should be the last exp in the shader. Previously all exp instructions were inferred to have unmodeled side effects. llvm-svn: 288695
* AMDGPU: Refactor kernel argument loweringTom Stellard2016-09-161-3/+5
| | | | | | | | | | | | | | | | | | | Summary: The main challenge in lowering kernel arguments for AMDGPU is determining the memory type of the argument. The generic calling convention code assumes that only legal register types can be stored in memory, but this is not the case for AMDGPU. This consolidates all the logic AMDGPU uses for deducing memory types into a single function. This will make it much easier to support different ABIs in the future. Reviewers: arsenm Subscribers: arsenm, wdng, nhaehnle, llvm-commits, yaxunl Differential Revision: https://reviews.llvm.org/D24614 llvm-svn: 281781
* [CodeGen] Split out the notions of MI invariance and MI dereferenceability.Justin Lebar2016-09-111-2/+3
| | | | | | | | | | | | | | | | | | | Summary: An IR load can be invariant, dereferenceable, neither, or both. But currently, MI's notion of invariance is IR-invariant && IR-dereferenceable. This patch splits up the notions of invariance and dereferenceability at the MI level. It's NFC, so adds some probably-unnecessary "is-dereferenceable" checks, which we can remove later if desired. Reviewers: chandlerc, tstellarAMD Subscribers: jholewinski, arsenm, nemanjai, llvm-commits Differential Revision: https://reviews.llvm.org/D23371 llvm-svn: 281151
* AMDGPU/R600: EXTRACT_VECT_ELT should only bypass BUILD_VECTOR if the vectors ↵Jan Vesely2016-09-021-1/+3
| | | | | | | | | | have the same number of elements. Fixes R600 piglit regressions since r280298 Differential Revision: https://reviews.llvm.org/D24174 llvm-svn: 280535
* AMDGPU/R600: Expand unaligned writes to local and global ASJan Vesely2016-09-021-2/+11
| | | | | | | | | LOCAL and GLOBAL AS only; PRIVATE needs special treatment. Differential Revision: https://reviews.llvm.org/D23971 llvm-svn: 280526
* AMDGPU/R600: Cleanup DAGCombineJan Vesely2016-08-291-15/+12
| | | | | | | | | Move SDLoc initialization to common place. Fall back to AMDGPU version in one place. Differential Revision: https://reviews.llvm.org/D23900 llvm-svn: 280030
* AMDGPU/R600: Remove MergeVectorStores from legalizationJan Vesely2016-08-291-3/+0
| | | | | | | | This is handled by DAGCombiner in a more generic way Differential Revision: https://reviews.llvm.org/D23970 llvm-svn: 280019
* AMDGPU/R600: Enable Load combineJan Vesely2016-08-271-0/+1
| | | | | | | | Fix and improve tests Differential Revision: https://reviews.llvm.org/D23899 llvm-svn: 279925
* Replace "fallthrough" comments with LLVM_FALLTHROUGHJustin Bogner2016-08-171-3/+4
| | | | | | | This is a mechanical change of comments in switches like fallthrough, fall-through, or fall-thru to use the LLVM_FALLTHROUGH macro instead. llvm-svn: 278902
* AMDGPU/R600: Remove macrosMatt Arsenault2016-08-131-2/+2
| | | | llvm-svn: 278588
* Fix more dereferenced end() iterators after r278532Hans Wennborg2016-08-131-0/+2
| | | | llvm-svn: 278587
* AMDGPU/R600: Remove dead custom insertersMatt Arsenault2016-07-261-209/+1
| | | | | | The intrinsics for these were removed, so this is dead. llvm-svn: 276805
* AMDGPU: Make AMDGPUMachineFunction fields privateMatt Arsenault2016-07-261-2/+2
| | | | | | | | | ABIArgOffset is a problem because properly setting the KernArgSize requires that the reserved area before the real kernel arguments be correctly aligned, which requires fixing clover. llvm-svn: 276766
* AMDGPU: Remove read_workdim intrinsicJan Vesely2016-07-251-6/+0
| | | | | | Differential revision: https://reviews.llvm.org/D22732 llvm-svn: 276682
* AMDGPU: Delete more dead codeMatt Arsenault2016-07-221-39/+1
| | | | | | | Remove dead code from r600 intrinsic removal. Remove unset members, rename StackSize to be less ambiguous. llvm-svn: 276436
* AMDGPU: Fix i1 fp_to_intMatt Arsenault2016-07-221-5/+20
| | | | | | | R600's i1 fp_to_uint selected but was incorrect according to what instcombine constant folds to. llvm-svn: 276435
* AMDGPU: Fix TargetPrefix for remaining r600 intrinsicsMatt Arsenault2016-07-151-1/+1
| | | | llvm-svn: 275619
* AMDGPU: Remove legacy rsq.clamped intrinsicMatt Arsenault2016-07-151-7/+5
| | | | | | | | Mesa still has a use of llvm.AMDGPU.rsq.f64 remaining. Also fix mismatch with non-IEEE rsq selecting to IEEE rsq. llvm-svn: 275617
* [SelectionDAG] Get rid of bool parameters in SelectionDAG::getLoad, ↵Justin Lebar2016-07-151-13/+9
| | | | | | | | | | | | | | | | | | | | | | | getStore, and friends. Summary: Instead, we take a single flags arg (a bitset). Also add a default 0 alignment, and change the order of arguments so the alignment comes before the flags. This greatly simplifies many callsites, and fixes a bug in AMDGPUISelLowering, wherein the order of the args to getLoad was inverted. It also greatly simplifies the process of adding another flag to getLoad. Reviewers: chandlerc, tstellarAMD Subscribers: jholewinski, arsenm, jyknight, dsanders, nemanjai, llvm-commits Differential Revision: http://reviews.llvm.org/D22249 llvm-svn: 275592
* AMDGPU/R600: Delete/rename intrinsics no longer used by mesaMatt Arsenault2016-07-141-1/+1
| | | | | | Use the replacement pass to update the tests, and delete old names. llvm-svn: 275375
* AMDGPU/R600: Remove intrinsics with no tests and no usersMatt Arsenault2016-07-141-23/+1
| | | | | | Mesa removed this path, so nothing is using these anymore. llvm-svn: 275372
* AMDGPU/R600: Add implicitarg.ptr intrinsicJan Vesely2016-07-101-0/+5
| | | | | | Differential Revision: http://reviews.llvm.org/D21622 llvm-svn: 275024
* CodeGen: Use MachineInstr& in TargetLowering, NFCDuncan P. N. Exon Smith2016-06-301-197/+200
| | | | | | | | | | | | | This is a mechanical change to make TargetLowering API take MachineInstr& (instead of MachineInstr*), since the argument is expected to be a valid MachineInstr. In one case, changed a parameter from MachineInstr* to MachineBasicBlock::iterator, since it was used as an insertion point. As a side effect, this removes a bunch of MachineInstr* to MachineBasicBlock::iterator implicit conversions, a necessary step toward fixing PR26753. llvm-svn: 274287
* CodeGen: Use MachineInstr& in TargetInstrInfo, NFCDuncan P. N. Exon Smith2016-06-301-8/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is mostly a mechanical change to make TargetInstrInfo API take MachineInstr& (instead of MachineInstr* or MachineBasicBlock::iterator) when the argument is expected to be a valid MachineInstr. This is a general API improvement. Although it would be possible to do this one function at a time, that would demand a quadratic amount of churn since many of these functions call each other. Instead I've done everything as a block and just updated what was necessary. This is mostly mechanical fixes: adding and removing `*` and `&` operators. The only non-mechanical change is to split ARMBaseInstrInfo::getOperandLatencyImpl out from ARMBaseInstrInfo::getOperandLatency. Previously, the latter took a `MachineInstr*` which it updated to the instruction bundle leader; now, the latter calls the former either with the same `MachineInstr&` or the bundle leader. As a side effect, this removes a bunch of MachineInstr* to MachineBasicBlock::iterator implicit conversions, a necessary step toward fixing PR26753. Note: I updated WebAssembly, Lanai, and AVR (despite being off-by-default) since it turned out to be easy. I couldn't run tests for AVR since llc doesn't link with it turned on. llvm-svn: 274189
* AMDGPU: Cleanup subtarget handling.Matt Arsenault2016-06-241-18/+19
| | | | | | | | | Split AMDGPUSubtarget into amdgcn/r600 specific subclasses. This removes most of the static_casting of the basic codegen classes everywhere, and tries to restrict the features visible on the wrong target. llvm-svn: 273652
* Pass DebugLoc and SDLoc by const ref.Benjamin Kramer2016-06-121-11/+7
| | | | | | | | This used to be free, copying and moving DebugLocs became expensive after the metadata rewrite. Passing by reference eliminates a ton of track/untrack operations. No functionality change intended. llvm-svn: 272512
* AMDGPU: Cleanup load testsMatt Arsenault2016-06-021-0/+14
| | | | | | | | | There are a lot of different kinds of loads to test for, and these were scattered around inconsistently with some redundancy. Try to comprehensively test all loads in a consistent way. llvm-svn: 271571
* AMDGPU: Cleanup lowering actionsMatt Arsenault2016-05-211-43/+38
| | | | | | | | These are kind of a mess and hard to follow, particularly for loads and stores. Fix various redundant, unnecessary and dead settings. llvm-svn: 270307
* AMDGPU/R600: Use correct number of vector elements when lowering private loadsJan Vesely2016-05-161-5/+3
| | | | | | | | | | Reviewer: tstellardAMD, arsenm Subscribers: arsenm, kzhuravl, llvm-commits Differential Revision: http://reviews.llvm.org/D20032 llvm-svn: 269725
* AMDGPU/R600: Fold global address operandJan Vesely2016-05-131-0/+7
| | | | | | | | | | Reviewers: tstellard Subscribers: arsenm Differential Revision: http://reviews.llvm.org/D19793 llvm-svn: 269480
* AMDGPU/R600: Implement memory loads from constant ASJan Vesely2016-05-131-51/+12
| | | | | | | | | | Reviewers: tstellard Subscribers: arsenm Differential Revision: http://reviews.llvm.org/D19792 llvm-svn: 269479
* AMDGPU: Move R600 specific code out of AMDGPUISelLowering.cppTom Stellard2016-05-021-0/+49
| | | | | | | | | | Reviewers: arsenm Subscribers: jvesely, arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D19736 llvm-svn: 268267
* [CodeGen] Add getBuildVector and getSplatBuildVector helpers. NFCI.Ahmed Bougacha2016-04-261-11/+10
| | | | | | Differential Revision: http://reviews.llvm.org/D17176 llvm-svn: 267606
* AMDGPU: Remove custom load/store scalarizationMatt Arsenault2016-04-141-1/+1
| | | | llvm-svn: 266385
* AMDGPU: Add a shader calling conventionNicolai Haehnle2016-04-061-1/+1
| | | | | | | | | | | This makes it possible to distinguish between mesa shaders and other kernels even in the presence of compute shaders. Patch By: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Differential Revision: http://reviews.llvm.org/D18559 llvm-svn: 265589
* [DAG] use isUndef() ; NFCISanjay Patel2016-03-141-4/+4
| | | | llvm-svn: 263448
* AMDGPU: R600 code splitting cleanupMatt Arsenault2016-03-111-0/+14
| | | | | | | Move a few functions only used by R600 to R600 specific code, fix header macros to stop using R600, mark classes as final. llvm-svn: 263204
* AMDGPU: Move function only used by R600Matt Arsenault2016-03-071-0/+16
| | | | llvm-svn: 262853
* AMDGPU/R600: Implement allowsMisalignedMemoryAccessMatt Arsenault2016-02-221-0/+20
| | | | | | | | This avoids some test regressions in a future commit when unaligned operations are expanded when they have custom lowering. llvm-svn: 261570
* AMDGPU: Rename intrinsic to better match instruction nameMatt Arsenault2016-02-131-1/+1
| | | | | | Also fixes missing f32 test. llvm-svn: 260780
* AMDGPU: Split R600 and SI store loweringMatt Arsenault2016-02-111-13/+68
| | | | | | | These were only sharing some somewhat incorrect logic for when to scalarize or split vectors. llvm-svn: 260490
* AMDGPU: Split R600 and SI load loweringMatt Arsenault2016-02-101-8/+70
| | | | | | | These weren't actually sharing anything in the common LowerLOAD. llvm-svn: 260398
* [CodeGen] Prefer "if (SDValue R = ...)" to "if (R.getNode())". NFCI.Ahmed Bougacha2016-02-091-11/+5
| | | | llvm-svn: 260316
* AMDGPU: Restore AMDGPU prefixed rsq intrinsic for nowMatt Arsenault2016-01-261-1/+2
| | | | | | Also move into backend intrinsics to discourage use of the old name. llvm-svn: 258783
* AMDGPU: Remove more unused intrinsicsMatt Arsenault2016-01-231-6/+0
| | | | | | Replace tests with lrp with basic IR expansion llvm-svn: 258612
* AMDGPU: Rename intrinsics to use amdgcn prefixMatt Arsenault2016-01-221-2/+8
| | | | | | | | | | | The intrinsic target prefix should match the target name as it appears in the triple. This is not yet complete, but gets most of the important ones. llvm.AMDGPU.* intrinsics used by mesa and libclc are still handled for compatibility for now. llvm-svn: 258557
* AMDGPU: Rename some r600 intrinsics to use correct TargetPrefixMatt Arsenault2016-01-221-20/+20
| | | | | | These ones aren't directly emitted by mesa and inserted by a pass. llvm-svn: 258523
* AMDGPU: Remove unused R600 intrinsicsMatt Arsenault2016-01-221-44/+0
| | | | llvm-svn: 258522
* AMDGPU: Remove AMDGPU.fract intrinsicMatt Arsenault2016-01-221-3/+0
| | | | | | | Mesa doesn't use this, and this is pattern matched already from fsub x, (ffloor x) llvm-svn: 258513
* AMDGPU: Remove AMDIL.fraction intrinsicMatt Arsenault2016-01-201-1/+0
| | | | llvm-svn: 258347