path: root/llvm/lib/Target/R600
Commit message | Author | Age | Files | Lines
...
* R600/SI: Remove explicit m0 operand from s_sendmsg | Tom Stellard | 2015-05-12 | 6 | -8/+36
    Instead add m0 as an implicit operand. This allows us to avoid using the M0Reg register class and eliminates a number of unnecessary spills when using s_sendmsg instructions.
    This impacts one shader in the shader-db:
      SGPRS: 48 -> 40 (-16.67 %)
      VGPRS: 112 -> 108 (-3.57 %)
      Code Size: 40132 -> 38796 (-3.33 %) bytes
      LDS: 0 -> 0 (0.00 %) blocks
      Scratch: 2048 -> 0 (-100.00 %) bytes per wave
    llvm-svn: 237133
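The percentages in the shader-db report above are plain relative changes. A minimal sketch that recomputes them follows; the helper name `percent_change` is hypothetical and not part of shader-db, and only the before/after numbers come from the commit message.

```python
def percent_change(before: int, after: int) -> float:
    """Relative change from `before` to `after`, in percent, rounded to 2 places."""
    if before == 0:
        # The report prints 0 -> 0 as 0.00 %; growth from zero has no finite ratio.
        return 0.0 if after == 0 else float("inf")
    return round((after - before) / before * 100.0, 2)


# Before/after values copied from the commit message.
stats = {
    "SGPRS": (48, 40),
    "VGPRS": (112, 108),
    "Code Size": (40132, 38796),
    "LDS": (0, 0),
    "Scratch": (2048, 0),
}

for name, (before, after) in stats.items():
    print(f"{name}: {before} -> {after} ({percent_change(before, after):.2f} %)")
```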
* R600/SI: Replace TRI->getRegClass(Reg) with TRI->getPhysRegClass(Reg) | Tom Stellard | 2015-05-12 | 3 | -7/+11
    TRI->getRegClass() takes a register class ID, not a register. We were using this incorrectly in a few places.
    llvm-svn: 237132
* MachineCSE: Add a target query for the LookAheadLimit heuristic | Tom Stellard | 2015-05-09 | 1 | -0/+2
    This is used to determine whether or not to CSE physical register defs.
    Differential Revision: http://reviews.llvm.org/D9472
    llvm-svn: 236923
* Change getTargetNodeName() to produce compiler warnings for missing cases, fix them | Matthias Braun | 2015-05-07 | 2 | -3/+11
    llvm-svn: 236775
* R600: Fix comment that mentions AMDIL | Matt Arsenault | 2015-05-07 | 1 | -2/+2
    llvm-svn: 236745
* [X86] Disable loop unrolling in loop vectorization pass when VF is 1. | Wei Mi | 2015-05-06 | 2 | -2/+2
    The patch disables unrolling in the loop vectorization pass when VF==1 on the x86 architecture by setting MaxInterleaveFactor to 1. Unrolling in the loop vectorization pass may introduce the cost of an overflow check, a memory boundary check and extra prologue/epilogue code when the regular unroller will unroll the loop another time. Disabling it when VF==1 removes the unnecessary cost on x86. The same can be done for other platforms after verifying that interleaving/memory bound checking is not perf critical on those platforms.
    Differential Revision: http://reviews.llvm.org/D9515
    llvm-svn: 236613
* [ShrinkWrap] Add (a simplified version) of shrink-wrapping. | Quentin Colombet | 2015-05-05 | 2 | -4/+3
    This patch introduces a new pass that computes the safe point to insert the prologue and epilogue of the function. The interest is to find safe points that are cheaper than the entry and exit blocks.

    As an example and to avoid introducing regressions, this patch also implements the required bits to enable the shrink-wrapping pass for AArch64.

    ** Context **

    Currently we insert the prologue and epilogue of the method/function in the entry and exit blocks. Although this is correct, we can do a better job when those are not immediately required and insert them at less frequently executed places. The job of the shrink-wrapping pass is to identify such places.

    ** Motivating example **

    Let us consider the following function that performs a call in only one branch of an if:

      define i32 @f(i32 %a, i32 %b) {
        %tmp = alloca i32, align 4
        %tmp2 = icmp slt i32 %a, %b
        br i1 %tmp2, label %true, label %false

      true:
        store i32 %a, i32* %tmp, align 4
        %tmp4 = call i32 @doSomething(i32 0, i32* %tmp)
        br label %false

      false:
        %tmp.0 = phi i32 [ %tmp4, %true ], [ %a, %0 ]
        ret i32 %tmp.0
      }

    On AArch64 this code generates (removing the cfi directives for readability):

      _f: ; @f
      ; BB#0:
        stp x29, x30, [sp, #-16]!
        mov x29, sp
        sub sp, sp, #16 ; =16
        cmp w0, w1
        b.ge LBB0_2
      ; BB#1: ; %true
        stur w0, [x29, #-4]
        sub x1, x29, #4 ; =4
        mov w0, wzr
        bl _doSomething
      LBB0_2: ; %false
        mov sp, x29
        ldp x29, x30, [sp], #16
        ret

    With shrink-wrapping we could generate:

      _f: ; @f
      ; BB#0:
        cmp w0, w1
        b.ge LBB0_2
      ; BB#1: ; %true
        stp x29, x30, [sp, #-16]!
        mov x29, sp
        sub sp, sp, #16 ; =16
        stur w0, [x29, #-4]
        sub x1, x29, #4 ; =4
        mov w0, wzr
        bl _doSomething
        add sp, x29, #16 ; =16
        ldp x29, x30, [sp], #16
      LBB0_2: ; %false
        ret

    Therefore, we would pay the overhead of setting up/destroying the frame only if we actually do the call.

    ** Proposed Solution **

    This patch introduces a new machine pass that performs the shrink-wrapping analysis (see the comments at the beginning of ShrinkWrap.cpp for more details). It then stores the safe save and restore points into the MachineFrameInfo attached to the MachineFunction. This information is then used by the PrologEpilogInserter (PEI) to place the related code at the right place. This pass runs right before the PEI.

    Unlike the original paper of Chow from PLDI’88, this implementation of shrink-wrapping does not use expensive data-flow analysis and does not need a hack to properly avoid frequently executed points. Instead, it relies on dominance and loop properties.

    The pass is off by default and each target can opt in by setting the EnableShrinkWrap boolean to true in its derived class of TargetPassConfig. This setting can also be overridden on the command line by using -enable-shrink-wrap.

    Before you try out the pass for your target, make sure you properly fix your emitProlog/emitEpilog/adjustForXXX methods to cope with basic blocks that are not necessarily the entry block.

    ** Design Decisions **

    1. ShrinkWrap is its own pass right now. It could frankly be merged into PEI but for debugging and clarity I thought it was best to have its own file.
    2. Right now, we only support one save point and one restore point. At some point we can expand this to several save points and restore points; the impacted components would then be:
       - The pass itself: New algorithm needed.
       - MachineFrameInfo: Hold a list or set of Save/Restore points instead of one pointer.
       - PEI: Should loop over the save points and restore points.
    Anyhow, at least for this first iteration, I do not believe it is interesting to support the complex cases. We should revisit that when we have motivating examples.

    Differential Revision: http://reviews.llvm.org/D9210
    <rdar://problem/3201744>
    llvm-svn: 236507
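The commit message says the pass relies on dominance rather than data-flow analysis. The sketch below is not the algorithm in ShrinkWrap.cpp; it is a simplified illustration that assumes the save point is the nearest common dominator of the blocks that need a frame and the restore point is their nearest common post-dominator. It also assumes a single exit block and ignores the loop handling the real pass performs. All function names are hypothetical; the CFG is the one from the motivating example.

```python
def dominators(cfg, entry):
    """Map each block to the set of blocks dominating it (iterative fixed point)."""
    blocks = set(cfg)
    preds = {b: set() for b in blocks}
    for b, succs in cfg.items():
        for s in succs:
            preds[s].add(b)
    dom = {b: set(blocks) for b in blocks}
    dom[entry] = {entry}
    changed = True
    while changed:
        changed = False
        for b in blocks - {entry}:
            new = set(blocks)
            for p in preds[b]:
                new &= dom[p]
            new |= {b}
            if new != dom[b]:
                dom[b] = new
                changed = True
    return dom


def reverse(cfg):
    """Flip every edge so that dominators of the result are post-dominators."""
    rev = {b: [] for b in cfg}
    for b, succs in cfg.items():
        for s in succs:
            rev[s].append(b)
    return rev


def nearest_common(dom, blocks):
    """Common dominators form a chain; the nearest has the largest dominator set."""
    common = set.intersection(*(dom[b] for b in blocks))
    return max(common, key=lambda b: len(dom[b]))


def place_save_restore(cfg, entry, exit_block, needs_frame):
    """Return (save_block, restore_block) for a non-empty set of blocks needing a frame."""
    dom = dominators(cfg, entry)
    pdom = dominators(reverse(cfg), exit_block)
    return nearest_common(dom, needs_frame), nearest_common(pdom, needs_frame)


# CFG of @f from the motivating example: only %true contains the call.
cfg = {"entry": ["true", "false"], "true": ["false"], "false": []}
print(place_save_restore(cfg, "entry", "false", {"true"}))
```

For the example CFG both points land in `%true`, which matches the shrink-wrapped assembly in the commit message. If `%false` also needed the frame, the sketch would fall back to the entry and exit blocks.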
* R600/SI: Code cleanup | Tom Stellard | 2015-05-04 | 1 | -3/+2
    This is a follow-up to r236004
    llvm-svn: 236427
* R600/SI: Add VCC as an implicit def of SI_KILL | Tom Stellard | 2015-05-01 | 1 | -3/+6
    When SI_KILL has a register operand, its lowered form writes to vcc.
    llvm-svn: 236307
* R600/SI: Fix verifier errors from the SIAnnotateControlFlow pass | Tom Stellard | 2015-05-01 | 1 | -1/+9
    This pass was generating 'Instruction does not dominate all uses!' errors for programs which had loops with a condition variable that depended on the result of a phi instruction from outside of the loop. The pass was inserting new phi nodes outside of the loop which used values defined inside the loop.
    http://bugs.freedesktop.org/show_bug.cgi?id=90056
    llvm-svn: 236306
* Reinstate revisions r234755, r234759, r234760 | Jan Vesely | 2015-04-30 | 7 | -2/+60
    changes:
      Don't apply on hexagon and NVPTX since they no longer claim to support UADDO/USUBO
      Add location to getConstant
      Drop comment about the ops being turned into expand
    llvm-svn: 236240
* R600: Fix up for AsmPrinter's OutStreamer being a unique_ptr | Tom Stellard | 2015-04-28 | 1 | -2/+3
    Fixes a crash with basically any OpenGL application using the radeonsi driver.
    Patch by: Michel Dänzer
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90176
    Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
    llvm-svn: 236004
* R600/SI: Add a lower case alias for subtarget feature: +DumpCode | Tom Stellard | 2015-04-28 | 1 | -0/+5
    llc converts all feature strings to lower case, while the LLVM C API does not, so we need a lower case alias in order to test this with llc.
    llvm-svn: 236003
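The reason an alias is needed can be illustrated with a toy lookup. This is only a sketch of the behaviour the commit message describes, not LLVM's subtarget feature code; the `feature_enabled` helper and the alias spelling `dumpcode` are assumptions made for illustration.

```python
def feature_enabled(known_features, feature_string, lowercases_input):
    """Toy model: look up a '+Name' feature string in a set of known feature names."""
    name = feature_string.lstrip("+")
    if lowercases_input:
        # Per the commit message, llc lower-cases feature strings before lookup.
        name = name.lower()
    return name in known_features


before_patch = {"DumpCode"}
after_patch = {"DumpCode", "dumpcode"}  # hypothetical lower case alias added

# C API path: the string is passed through unchanged, so it already matched.
print(feature_enabled(before_patch, "+DumpCode", lowercases_input=False))
# llc path: the lower-cased string only matches once the alias exists.
print(feature_enabled(before_patch, "+DumpCode", lowercases_input=True))
print(feature_enabled(after_patch, "+DumpCode", lowercases_input=True))
```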
* Reapply r235977 "[DebugInfo] Add debug locations to constant SD nodes" | Sergey Dmitrouk | 2015-04-28 | 7 | -343/+386
    [DebugInfo] Add debug locations to constant SD nodes
    This adds debug location to constant nodes of Selection DAG and updates all places that create constants to pass debug locations (see PR13269).
    Can't guarantee that all locations are correct, but in a lot of cases choice is obvious, so most of them should be. At least all tests pass.
    Tests for these changes do not cover everything, instead just check it for SDNodes, ARM and AArch64 where it's easy to get incorrect locations on constants.
    This is not complete fix as FastISel contains workaround for wrong debug locations, which drops locations from instructions on processing constants, but there isn't currently a way to use debug locations from constants there as llvm::Constant doesn't cache it (yet). Although this is a bit different issue, not directly related to these changes.
    Differential Revision: http://reviews.llvm.org/D9084
    llvm-svn: 235989
* Revert "[DebugInfo] Add debug locations to constant SD nodes" | Daniel Jasper | 2015-04-28 | 7 | -386/+343
    This breaks a test: http://bb.pgr.jp/builders/cmake-llvm-x86_64-linux/builds/23870
    llvm-svn: 235987
* [DebugInfo] Add debug locations to constant SD nodes | Sergey Dmitrouk | 2015-04-28 | 7 | -343/+386
    This adds debug location to constant nodes of Selection DAG and updates all places that create constants to pass debug locations (see PR13269).
    Can't guarantee that all locations are correct, but in a lot of cases choice is obvious, so most of them should be. At least all tests pass.
    Tests for these changes do not cover everything, instead just check it for SDNodes, ARM and AArch64 where it's easy to get incorrect locations on constants.
    This is not complete fix as FastISel contains workaround for wrong debug locations, which drops locations from instructions on processing constants, but there isn't currently a way to use debug locations from constants there as llvm::Constant doesn't cache it (yet). Although this is a bit different issue, not directly related to these changes.
    Differential Revision: http://reviews.llvm.org/D9084
    llvm-svn: 235977
* [AsmPrinter] Make AsmPrinter's OutStreamer member a unique_ptr. | Lang Hames | 2015-04-24 | 2 | -97/+99
    AsmPrinter owns the OutStreamer, so an owning pointer makes sense here. Using a reference for this is crufty.
    llvm-svn: 235752
* R600/SI: Fix verifier error when producing v_madmk_f32 | Matt Arsenault | 2015-04-24 | 1 | -0/+3
    Copy the kill flags when swapping the operands.
    llvm-svn: 235687
* R600/RegisterCoalescer: Enable more rematerialization/add missing testcase | Matthias Braun | 2015-04-24 | 1 | -2/+2
    This enables the rematerialization of some R600 MOV instructions in the RegisterCoalescer and adds a testcase for r235668.
    llvm-svn: 235675
* R600/SI: Special case v_mov_b32 as really rematerializable | Matt Arsenault | 2015-04-23 | 2 | -0/+17
    This should be fixed to properly understand all rematerializable instructions while ignoring implicit reads of exec.
    llvm-svn: 235671
* R600: Correctly lower CONCAT_VECTOR nodes with more than 2 operands | Tom Stellard | 2015-04-23 | 1 | -4/+2
    llvm-svn: 235662
* R600/SI: Fix indirect addressing with a negative constant offset | Tom Stellard | 2015-04-23 | 1 | -16/+55
    When the base register index of the vector plus the constant offset was less than zero, we were passing the wrong base register to the indirect addressing instruction.
    In this case, we need to set the base register to v0 and then add the computed (negative) index to m0.
    llvm-svn: 235641
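The arithmetic behind this fix can be sketched as follows. This is a reading of the commit message only, not the code in the patch: the function names are hypothetical, registers are modelled as plain integer indices, and the addressed register is assumed to be the base register index plus the value in m0.

```python
def select_indirect_operands(vec_base, const_offset):
    """Return (base_register_index, m0_adjustment) for an indirect access.

    vec_base is the index of the vector's first register and const_offset is
    the constant part of the index; the dynamic part is added to m0 at run time.
    """
    reg = vec_base + const_offset
    if reg >= 0:
        # Common case: fold the constant offset into the base register.
        return reg, 0
    # There is no register below v0, so use v0 as the base and carry the
    # (negative) remainder in m0 instead.
    return 0, reg


def effective_register(vec_base, const_offset, dynamic_index):
    """Register index actually addressed: base register plus the value in m0."""
    base, m0_adjust = select_indirect_operands(vec_base, const_offset)
    m0 = dynamic_index + m0_adjust
    return base + m0


print(select_indirect_operands(4, -1))
print(select_indirect_operands(2, -5))
print(effective_register(2, -5, 6))
```

In both branches the addressed register equals `vec_base + const_offset + dynamic_index`; only the split between the base register and m0 differs.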
* R600/SI: Add assembler support for all CI and VI VOP1 instructions | Tom Stellard | 2015-04-23 | 6 | -11/+71
    llvm-svn: 235629
* R600/SI: v_mov_fed_b32 does not exist on VI | Tom Stellard | 2015-04-23 | 1 | -1/+1
    llvm-svn: 235628
* R600/SI: Use a better error message for unsupported instructions in the assembler | Tom Stellard | 2015-04-23 | 1 | -1/+1
    llvm-svn: 235627
* R600/SI: Improve AsmParser support for forced e64 encoding | Tom Stellard | 2015-04-23 | 1 | -5/+45
    We can now force e64 encoding even when the operands would be legal for e32 encoding.
    llvm-svn: 235626
* R600: Fix always inline pass breaking noinline functions | Matt Arsenault | 2015-04-22 | 1 | -2/+3
    No test since calls are not actually supported yet.
    llvm-svn: 235524
* [mc] Clean up emission of byte sequences | Benjamin Kramer | 2015-04-17 | 1 | -2/+1
    No functional change intended.
    llvm-svn: 235178
* Use raw_pwrite_stream in the object writer/streamer. | Rafael Espindola | 2015-04-14 | 3 | -4/+5
    The ELF object writer will take advantage of that in the next commit.
    llvm-svn: 234950
* R600/SI: Fix verifier error caused by SIAnnotateControlFlow | Tom Stellard | 2015-04-14 | 1 | -6/+13
    This pass will always try to insert llvm.SI.ifbreak intrinsics in the same block in which their conditional value is computed. This is a problem when conditions for breaks or continues are computed outside of the loop, because the llvm.SI.ifbreak intrinsic ends up being inserted outside of the loop.
    This patch fixes this problem by inserting the llvm.SI.ifbreak intrinsics in the loop header when the condition is computed outside the loop.
    llvm-svn: 234891
* Revert revisions r234755, r234759, r234760 | Jan Vesely | 2015-04-13 | 7 | -61/+2
    Revert "Remove default in fully-covered switch (to fix Clang -Werror -Wcovered-switch-default)"
    Revert "R600: Add carry and borrow instructions. Use them to implement UADDO/USUBO"
    Revert "LegalizeDAG: Try to use Overflow operations when expanding ADD/SUB"
    Using overflow operations fails CodeGen/Generic/2011-07-07-ScheduleDAGCrash.ll on hexagon, nvptx, and r600. Revert while I investigate.
    llvm-svn: 234768
* R600: Add carry and borrow instructions. Use them to implement UADDO/USUBO | Jan Vesely | 2015-04-13 | 7 | -2/+61
    v2: tighten the sub64 tests
    v3: rename to CARRY/BORROW
    v4: fixup test cmdline
        add known bits computation
        use sign extend instead of sub 0,x
        better add test
    v5: remove redundant break
        move lowering to separate functions
        fix comments
    Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
    Reviewers: arsenm
    llvm-svn: 234759
* R600: Make FMIN/MAXNUM legal on all asics | Jan Vesely | 2015-04-12 | 3 | -2/+7
    v2: Add tests
    Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
    reviewer: arsenm
    llvm-svn: 234716
* R600: remove manual BFE optimization | Jan Vesely | 2015-04-12 | 1 | -8/+2
    Fixed since r233079
    Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
    reviewer: arsenm
    llvm-svn: 234715
* Use 'override/final' instead of 'virtual' for overridden methods | Alexander Kornienko | 2015-04-11 | 1 | -1/+1
    The patch is generated using clang-tidy misc-use-override check.
    This command was used:
      tools/clang/tools/extra/clang-tidy/tool/run-clang-tidy.py \
        -checks='-*,misc-use-override' -header-filter='llvm|clang' \
        -j=32 -fix -format
    http://reviews.llvm.org/D8925
    llvm-svn: 234679
* Reduce dyn_cast<> to isa<> or cast<> where possible. | Benjamin Kramer | 2015-04-10 | 4 | -7/+7
    No functional change intended.
    llvm-svn: 234586
* R600/SI: Add some missing overrides | Tom Stellard | 2015-04-08 | 2 | -2/+2
    llvm-svn: 234384
* R600/SI: Initial support for assembler and inline assembly | Tom Stellard | 2015-04-08 | 14 | -133/+1369
    This is currently considered experimental, but most of the more commonly used instructions should work.
    So far only SI has been extensively tested. CI and VI probably work too, but may be buggy. The current set of test cases does not give complete coverage, but I think it is sufficient for an experimental assembler.
    See the documentation in R600Usage for more information.
    llvm-svn: 234381
* R600/SI: Add missing SOPK instructions | Tom Stellard | 2015-04-08 | 3 | -13/+72
    llvm-svn: 234380
* R600/SI: Don't print offset0/offset1 DS operands when they are 0 | Tom Stellard | 2015-04-08 | 1 | -4/+8
    llvm-svn: 234379
* Replace the MCSubtargetInfo parameter with a Triple when creating | Eric Christopher | 2015-03-31 | 1 | -3/+3
    an MCInstPrinter. Update all callers and use where we wanted a Triple previously.
    llvm-svn: 233648
* Remove unused Target argument from MCInstPrinter ctor functions. | Eric Christopher | 2015-03-30 | 1 | -2/+1
    llvm-svn: 233607
* CodeGen: Use the new DebugLoc API, NFC | Duncan P. N. Exon Smith | 2015-03-30 | 1 | -1/+1
    Update lib/CodeGen (and lib/Target) to use the new `DebugLoc` API.
    llvm-svn: 233582
* Remove more superfluous .str() and replace std::string concatenation with Twine. | Yaron Keren | 2015-03-30 | 1 | -1/+1
    Following r233392, http://llvm.org/viewvc/llvm-project?rev=233392&view=rev.
    llvm-svn: 233555
* [MCInstPrinter] Enable MCInstPrinter to change its behavior based on the | Akira Hatanaka | 2015-03-27 | 3 | -3/+5
    per-function subtarget.
    Currently, code-gen passes the default or generic subtarget to the constructors of MCInstPrinter subclasses (see LLVMTargetMachine::addPassesToEmitFile), which enables some targets (AArch64, ARM, and X86) to change their instprinter's behavior based on the subtarget feature bits. Since the backend can now use different subtargets for each function, instprinter has to be changed to use the per-function subtarget rather than the default subtarget.
    This patch takes the first step towards enabling instprinter to change its behavior based on the per-function subtarget. It adds a bit "PassSubtarget" to AsmWriter which tells table-gen to pass a reference to MCSubtargetInfo to the various print methods table-gen auto-generates.
    I will follow up with changes to instprinters of AArch64, ARM, and X86.
    llvm-svn: 233411
* R600/SI: Fix VOP2 VI encoding | Marek Olsak | 2015-03-27 | 1 | -1/+1
    Broken by "R600/SI: Refactor VOP2 instruction defs".
    llvm-svn: 233399
* Remove superfluous .str() and replace std::string concatenation with Twine. | Yaron Keren | 2015-03-27 | 1 | -2/+2
    llvm-svn: 233392
* Fix typo in comment. | Nico Weber | 2015-03-25 | 1 | -1/+1
    llvm-svn: 233226
* Opaque Pointer Types: GEP API migrations to specify the gep type explicitly | David Blaikie | 2015-03-24 | 1 | -3/+3
    The changes to InstCombine (& SCEV) do seem a bit silly - it doesn't make anything obviously better to have the caller access the pointer's element type (the thing I'm trying to remove) than the GEP itself, but it's a helpful migration step.
    This will allow me to more obviously lock down GEP (& Load, etc) API usage, then fix all the code that accesses pointer element types except the places that need to be removed (most of the InstCombines) anyway - at which point I'll need to just remove all that code because it won't be meaningful anymore (there will be no pointer types, so no bitcasts to combine).
    SCEV looks like it'll need some restructuring - we'll have to do a bit more work for GEP canonicalization, since it'll depend on how it's used if we can even manage to canonicalize it to a non-ugly GEP. I guess we can do some fun stuff like voting (do 2 out of 3 load from the GEP with a certain type that gives a pretty GEP? Does every typed use of the GEP use either a specific type or a generic type (i8*, etc)?)
    llvm-svn: 233131
* R600/SI: Insert more NOPs after READLANE on VI, don't use NOPs on CI | Marek Olsak | 2015-03-24 | 1 | -1/+16
    This is a candidate for stable.
    llvm-svn: 233080