summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
Commit message (Collapse)AuthorAgeFilesLines
* AMDGPU: R600 code splitting cleanupMatt Arsenault2016-03-111-2/+2
| | | | | | | Move a few functions only used by R600 to R600 specific code, fix header macros to stop using R600, mark classes as final. llvm-svn: 263204
* AMDGPU: Insert two S_NOP instructions for every high level source statement.Tom Stellard2016-03-031-0/+11
| | | | | | | | | | | | | | Patch by: Konstantin Zhuravlyov Summary: Tools, such as debugger, need to pause execution based on user input (i.e. breakpoint). In order to do this, two S_NOP instructions are inserted for each high level source statement: one before first isa instruction of high level source statement, and one after last isa instruction of high level source statement. Further, debugger may replace S_NOP instructions with S_TRAP instructions based on user input. Reviewers: tstellarAMD, arsenm Subscribers: echristo, dblaikie, arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D17454 llvm-svn: 262579
* AMDGPU/SI: Detect uniform branches and emit s_cbranch instructionsTom Stellard2016-02-121-5/+5
| | | | | | | | | | Reviewers: arsenm Subscribers: mareko, MatzeB, qcolombet, arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D16603 llvm-svn: 260765
* AMDGPU: Initialize SILowerControlFlowMatt Arsenault2016-02-121-1/+2
| | | | llvm-svn: 260645
* AMDGPU: Fix ordering of CPU and FS parameters in TargetMachine constructorsTom Stellard2016-02-051-4/+4
| | | | | | | | | | Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D16863 llvm-svn: 259897
* AMDGPU/SI: Correctly initialize SIInsertWaits passTom Stellard2016-02-051-1/+2
| | | | | | | | | | Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D16724 llvm-svn: 259894
* AMDGPU: Skip promote alloca with no optimizationsMatt Arsenault2016-02-021-1/+1
| | | | llvm-svn: 259551
* AMDGPU: Fix emitting invalid workitem intrinsics for HSAMatt Arsenault2016-01-301-2/+4
| | | | | | | | | | | | | | | | | | The AMDGPUPromoteAlloca pass was emitting the read.local.size calls, which with HSA was incorrectly selected to reading from the offset mesa uses off of the kernarg pointer. Error on intrinsics which aren't supported by HSA, and start emitting the correct IR to read the workgroup size out of the dispatch pointer. Also initialize the pass so it can be tested with opt, and start moving towards not depending on the subtarget as an argument. Start emitting errors for the intrinsics not handled with HSA. llvm-svn: 259297
* AMDGPU: Fix default device handlingMatt Arsenault2016-01-271-2/+16
| | | | | | | | | | | | | | | | | | | When no device name is specified, default to kaveri for HSA since SI is not supported and it woud fail. Default to "tahiti" instead of "SI" since these are effectively the same, and tahiti is an actual device. Move default device handling to the TargetMachine rather than the AMDGPUSubtarget. The module ISA version is computed from the device name provided with the target machine, so the attributes printed by the AsmPrinter were inconsistent with those computed in the subtarget. Also remove DevName field from subtarget since it's redundant with getCPU() in the superclass. llvm-svn: 258901
* AMDGPU/SI: Pass whether to use the SI scheduler via Target AttributeTom Stellard2016-01-211-0/+2
| | | | | | | | | | | | | | | | Summary: Currently the SI scheduler can be selected via command line option, but it turned out it would be better if it was selectable via a Target Attribute. This patch adds "si-scheduler" attribute to the backend. Reviewers: tstellarAMD, echristo Subscribers: echristo, arsenm Differential Revision: http://reviews.llvm.org/D16192 llvm-svn: 258386
* Correctly initialize SIAnnotateControlFlowTom Stellard2016-01-201-0/+1
| | | | | | | | | | Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D16304 llvm-svn: 258319
* AMDGPU/SI: Add SI Machine SchedulerNicolai Haehnle2016-01-131-2/+6
| | | | | | | | | | | | | | | | Summary: It is off by default, but can be used with --misched=si Patch by: Axel Davy Reviewers: arsenm, tstellarAMD, nhaehnle Subscribers: nhaehnle, solenskiner, arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D11885 llvm-svn: 257609
* AMDGPU/SI: Select constant loads with non-uniform addresses to MUBUF ↵Tom Stellard2015-12-151-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | instructions Summary: We were previously selecting all constant loads to SMRD instructions and legalizing the SMRDs with non-uniform addresses during the SIFixSGPRCopesPass. This new solution is more simple and also generates much better code, because the instruction selector is able to take advantage of all the MUBUF addressing modes that are legalization pass wasn't able to. We also no longer need to generate v_add_* instructions when we have a uniform pointer and a non-uniform offset, as this is now folded into the MUBUF instruction during instruction selection. Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D15425 llvm-svn: 255672
* AMDGPU/SI: Emit constant arrays in the .text sectionTom Stellard2015-12-101-2/+2
| | | | | | | | | | | | | | | Summary: This allows us to remove the END_OF_TEXT_LABEL hack we had been using and simplifies the fixups used to compute the address of constant arrays. Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D15257 llvm-svn: 255204
* AMDGPU: Remove SIPrepareScratchRegsMatt Arsenault2015-11-301-1/+0
| | | | | | | | | | | | | | | | | | | | | | It does not work because of emergency stack slots. This pass was supposed to eliminate dummy registers for the spill instructions, but the register scavenger can introduce more during PrologEpilogInserter, so some would end up left behind if they were needed. The potential for spilling the scratch resource descriptor and offset register makes doing something like this overly complicated. Reserve registers to use for the resource descriptor and use them directly in eliminateFrameIndex. Also removes creating another scratch resource descriptor when directly selecting scratch MUBUF instructions. The choice of which registers are reserved is temporary. For now it attempts to pick the next available registers after the user and system SGPRs. llvm-svn: 254329
* AMDGPU: Add pass to detect used kernel featuresMatt Arsenault2015-11-061-0/+8
| | | | | | | | | | | Mark kernels that use certain features that require user SGPRs to support with kernel attributes. We need to know before instruction selection begins because it impacts the kernel calling convention lowering. For now this only detects the workitem intrinsics. llvm-svn: 252323
* AMDGPU: Initialize SIFixSGPRCopies so -print-after worksMatt Arsenault2015-11-031-1/+2
| | | | llvm-svn: 251995
* AMDGPU: Register some more passes so -print-before worksMatt Arsenault2015-10-121-0/+2
| | | | llvm-svn: 250071
* CodeGen: print and verify after TargetPassConfig::insertPass by defaultJustin Bogner2015-10-081-1/+3
| | | | | | | | | | | | | | In r224059, we started verifying after addPass, but missed doing so on insertPass. There isn't a good reason for the discrepancy, and skipping the verifier in these cases causes bugs. This also exposes a verifier error that was introduced in r249087, but the verifier doesn't run until after the register coalescer, when the issue happens to have been resolved. I've skipped the verifier after SIFixSGPRLiveRangesID to avoid the failures for now and will follow up with Matt for a proper fix. llvm-svn: 249643
* AMDGPU: Properly register passesMatt Arsenault2015-10-071-2/+2
| | | | llvm-svn: 249495
* AMDGPU: Move SIFixSGPRLiveRanges to be a regalloc passMatt Arsenault2015-10-011-1/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Replace LiveInterval usage with LiveVariables. LiveIntervals computes far more information than is needed for this pass which just needs to find if an SGPR is live out of the defining block. LiveIntervals are not usually available that early, requiring computing them twice which is very expensive. The extra run of LiveIntervals/LiveVariables/SlotIndexes was costing in total about 5% of compile time. Continuing to use LiveIntervals is problematic. It seems there is an option (early-live-intervals) to run the analysis about where it should go to avoid recomputing LiveVariables, but it seems to be completely broken with subreg liveness enabled. There are also problems from trying to recompute LiveIntervals since this seems to undo LiveVariables and clearing kill flags, causing TwoAddressInstructions to make bad decisions. Insert the pass right after live variables and preserve it. The tricky case to worry about might be phis since LiveVariables doesn't count a register as live out if in the successor block it is only used in a phi, but I don't think this is a concern right now because SIFixSGPRCopies replaces SGPR phis. llvm-svn: 249087
* AMDGPU/SI: Use .hsatext section instead of .text for HSATom Stellard2015-09-251-4/+10
| | | | | | | | | | Reviewers: arsenm, grosbach, rafael Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D12424 llvm-svn: 248619
* AMDGPU: Disable some passes that are not meaningfulMatt Arsenault2015-09-251-3/+15
| | | | | | | | | | | Don't run passes related to stack maps, garbage collection, exceptions since these aren't useful for GPUs. There might be a few more to turn off that I'm less sure about (e.g. ShrinkWrapping) or I'm not sure how to disable (SafeStack and StackProtector) llvm-svn: 248591
* constify the Function parameter to the TTI creation callback andEric Christopher2015-09-161-1/+1
| | | | | | propagate to all callers/users/etc. llvm-svn: 247864
* AMDGPU: Make sure to run verifier after SIFixSGPRLiveRangesMatt Arsenault2015-08-221-1/+1
| | | | llvm-svn: 245769
* AMDGPU: Add pass to lower OpenCL image and sampler arguments.Tom Stellard2015-08-071-0/+2
| | | | | | | | | The pass adds new kernel arguments for image attributes, and resolves calls to dummy attribute and resource id getter functions. Patch by: Zoltan Gilian llvm-svn: 244372
* AMDGPU/SI: Fix read2 merging into a super register.Matt Arsenault2015-07-141-0/+1
| | | | | | | | | | | | | | | | If the read2 produced was supposed to be writing into a super register, it would use the wrong subregister indices. Fix this by inserting copies, so we only ever write to a vreg_64. Run the register coalescer again to clean this up, although this isn't ideal and often does result in an extra move. Also remove the assert that offset1 > offset0. There isn't a real reason to not allow this other than a minor convenience in the compiler, and it doesn't seem worth the effort of avoiding it. llvm-svn: 242174
* Make TargetTransformInfo keeping a reference to the Module DataLayoutMehdi Amini2015-07-091-2/+4
| | | | | | | | | | | | | | | | | | | | DataLayout is no longer optional. It was initialized with or without a DataLayout, and the DataLayout when supplied could have been the one from the TargetMachine. Summary: This change is part of a series of commits dedicated to have a single DataLayout during compilation by using always the one owned by the module. Reviewers: echristo Subscribers: jholewinski, llvm-commits, rafael, yaron.keren Differential Revision: http://reviews.llvm.org/D11021 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 241774
* AMDGPU: Run SIInsertWaits as pre-emit passMatt Arsenault2015-07-061-1/+1
| | | | | | | | | | | Running this after the scheduler enables scheduling waits later so other ALU instructions can run while this would be waiting. When combined with enabling the post-RA scheduler, this gives about a ~20% improvement on sgemm. llvm-svn: 241473
* R600 -> AMDGPU renameTom Stellard2015-06-131-0/+292
| | | | llvm-svn: 239657
* Revert "AMDGPU: Add core backend files for R600/SI codegen v6"Tom Stellard2012-07-161-162/+0
| | | | | | This reverts commit 4ea70107c5e51230e9e60f0bf58a0f74aa4885ea. llvm-svn: 160303
* AMDGPU: Add core backend files for R600/SI codegen v6Tom Stellard2012-07-161-0/+162
llvm-svn: 160270
OpenPOWER on IntegriCloud