path: root/llvm/lib/Target/R600/SIISelLowering.cpp
Commit message | Author | Age | Files | Lines
* R600 -> AMDGPU rename | Tom Stellard | 2015-06-13 | 1 | -2241/+0
  llvm-svn: 239657
* R600: Switch to using generic min / max nodes. | Matt Arsenault | 2015-06-09 | 1 | -8/+12
  llvm-svn: 239377
* R600/SI: Reimplement isLegalAddressingMode | Matt Arsenault | 2015-06-04 | 1 | -30/+66
  Now that we sometimes know the address space, this can theoretically do a better job.
  This needs better test coverage, but this mostly depends on first updating the loop optimizations to provide the address space.
  llvm-svn: 239053
* R600/SI: Fix some cases for load / store of half | Matt Arsenault | 2015-06-04 | 1 | -3/+29
  Mostly argument loads were producing broken zextloads from an FP type.
  llvm-svn: 239049
* R600/SI: Don't hardcode pointer type | Matt Arsenault | 2015-06-01 | 1 | -4/+5
  llvm-svn: 238789
* Add address space argument to isLegalAddressingMode | Matt Arsenault | 2015-06-01 | 1 | -1/+1
  This is important because of different addressing modes depending on the address space for GPU targets.
  This only adds the argument, and does not update any of the uses to provide the correct address space.
  llvm-svn: 238723
* R600/SI: Remove explicit m0 operand from v_interp instructions | Tom Stellard | 2015-05-12 | 1 | -1/+22
  Instead add m0 as an implicit operand. This helps avoid spills of the m0 register in some cases.
  llvm-svn: 237140
* R600/SI: Remove explicit m0 operand from s_sendmsg | Tom Stellard | 2015-05-12 | 1 | -1/+24
  Instead add m0 as an implicit operand. This allows us to avoid using the M0Reg register class and eliminates a number of unnecessary spills when using s_sendmsg instructions.
  This impacts one shader in the shader-db:
    SGPRS: 48 -> 40 (-16.67 %)
    VGPRS: 112 -> 108 (-3.57 %)
    Code Size: 40132 -> 38796 (-3.33 %) bytes
    LDS: 0 -> 0 (0.00 %) blocks
    Scratch: 2048 -> 0 (-100.00 %) bytes per wave
  llvm-svn: 237133
* Reapply r235977 "[DebugInfo] Add debug locations to constant SD nodes" | Sergey Dmitrouk | 2015-04-28 | 1 | -42/+48
  [DebugInfo] Add debug locations to constant SD nodes
  This adds debug location to constant nodes of Selection DAG and updates all places that create constants to pass debug locations (see PR13269).
  Can't guarantee that all locations are correct, but in a lot of cases the choice is obvious, so most of them should be. At least all tests pass.
  Tests for these changes do not cover everything, instead just check it for SDNodes, ARM and AArch64 where it's easy to get incorrect locations on constants.
  This is not a complete fix, as FastISel contains a workaround for wrong debug locations, which drops locations from instructions on processing constants, but there isn't currently a way to use debug locations from constants there as llvm::Constant doesn't cache it (yet). That is a slightly different issue, though, not directly related to these changes.
  Differential Revision: http://reviews.llvm.org/D9084
  llvm-svn: 235989
* Revert "[DebugInfo] Add debug locations to constant SD nodes" | Daniel Jasper | 2015-04-28 | 1 | -48/+42
  This breaks a test: http://bb.pgr.jp/builders/cmake-llvm-x86_64-linux/builds/23870
  llvm-svn: 235987
* [DebugInfo] Add debug locations to constant SD nodes | Sergey Dmitrouk | 2015-04-28 | 1 | -42/+48
  This adds debug location to constant nodes of Selection DAG and updates all places that create constants to pass debug locations (see PR13269).
  Can't guarantee that all locations are correct, but in a lot of cases the choice is obvious, so most of them should be. At least all tests pass.
  Tests for these changes do not cover everything, instead just check it for SDNodes, ARM and AArch64 where it's easy to get incorrect locations on constants.
  This is not a complete fix, as FastISel contains a workaround for wrong debug locations, which drops locations from instructions on processing constants, but there isn't currently a way to use debug locations from constants there as llvm::Constant doesn't cache it (yet). That is a slightly different issue, though, not directly related to these changes.
  Differential Revision: http://reviews.llvm.org/D9084
  llvm-svn: 235977
* R600: Make FMIN/MAXNUM legal on all asics | Jan Vesely | 2015-04-12 | 1 | -2/+0
  v2: Add tests
  Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
  reviewer: arsenm
  llvm-svn: 234716
* R600/SI: Initial support for assembler and inline assembly | Tom Stellard | 2015-04-08 | 1 | -0/+35
  This is currently considered experimental, but most of the more commonly used instructions should work.
  So far only SI has been extensively tested. CI and VI probably work too, but may be buggy. The current set of test cases does not give complete coverage, but I think it is sufficient for an experimental assembler.
  See the documentation in R600Usage for more information.
  llvm-svn: 234381
* R600/SI: Use V_FRACT_F64 for faster 64-bit floor on SI | Marek Olsak | 2015-03-24 | 1 | -1/+1
  Other f64 opcodes not supported on SI can be lowered in a similar way.
  v2: use complex VOP3 patterns
  llvm-svn: 233076
* R600/SI: Expand fract to floor, then only select V_FRACT on CI | Marek Olsak | 2015-03-24 | 1 | -0/+6
  V_FRACT is buggy on SI.
  R600-specific code is left intact.
  v2: drop the multiclass, use complex VOP3 patterns
  llvm-svn: 233075
* R600/SI: don't try min3/max3/med3 with f64 | Tom Stellard | 2015-03-16 | 1 | -0/+1
  There are no opcodes for this. This also adds a test case.
  v2: make test more robust
  Patch by: Grigori Goronzy
  llvm-svn: 232386
* Remove the need to cache the subtarget in the R600 TargetRegisterInfo classes. | Eric Christopher | 2015-03-11 | 1 | -4/+30
  llvm-svn: 231954
* Make constant arrays that are passed to functions as const. | Benjamin Kramer | 2015-03-07 | 1 | -7/+3
  In theory this allows the compiler to skip materializing the array on the stack. In practice clang often fails to do that, but that's a different story. NFC.
  llvm-svn: 231571
* Remove an argument-less call to getSubtargetImpl from TargetLoweringBase. | Eric Christopher | 2015-02-26 | 1 | -1/+1
  This required plumbing a TargetRegisterInfo through computeRegisterProperties and into findRepresentativeClass which uses it for register class iteration. This required passing a subtarget into a few target specific initializations of TargetLowering.
  llvm-svn: 230583
* R600/SI: Remove isel mubuf legalization | Tom Stellard | 2015-02-24 | 1 | -124/+0
  We legalize mubuf instructions post-instruction selection, so this code is no longer needed.
  llvm-svn: 230352
* CodeGen: convert CCState interface to using ArrayRefs | Tim Northover | 2015-02-21 | 1 | -2/+2
  Everyone except R600 was manually passing the length of a static array at each callsite, calculated in a variety of interesting ways. Far easier to let ArrayRef handle that.
  There should be no functional change, but out of tree targets may have to tweak their calls as with these examples.
  llvm-svn: 230118
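  The idea behind the ArrayRef change above can be sketched outside of LLVM. The block below is only an illustration: ArrayView is a hypothetical, minimal stand-in for llvm::ArrayRef, and sumWithLength, sumWithView and demoArrayView are invented names, not part of the CCState interface.

```cpp
#include <cassert>
#include <cstddef>

// ArrayView is a hypothetical, minimal stand-in for llvm::ArrayRef.
template <typename T> class ArrayView {
  const T *Data;
  std::size_t Length;

public:
  // Deduces N from the array type, so callers never spell out the length.
  template <std::size_t N>
  constexpr ArrayView(const T (&Arr)[N]) : Data(Arr), Length(N) {}

  constexpr const T *begin() const { return Data; }
  constexpr const T *end() const { return Data + Length; }
  constexpr std::size_t size() const { return Length; }
};

// Old style: pointer plus a length computed by hand at every call site.
int sumWithLength(const int *Regs, std::size_t NumRegs) {
  int Sum = 0;
  for (std::size_t I = 0; I != NumRegs; ++I)
    Sum += Regs[I];
  return Sum;
}

// New style: the view carries its own length.
int sumWithView(ArrayView<int> Regs) {
  int Sum = 0;
  for (int R : Regs)
    Sum += R;
  return Sum;
}

// Usage: both calls see the same three elements.
void demoArrayView() {
  static const int Regs[] = {4, 5, 6};
  assert(sumWithLength(Regs, sizeof(Regs) / sizeof(Regs[0])) == 15);
  assert(sumWithView(Regs) == 15);
}
```

  The call site for sumWithView has no length expression left to get wrong, which is the simplification the commit message describes.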
* R600/SI: Remove v_sub_f64 pseudo | Matt Arsenault | 2015-02-20 | 1 | -13/+2
  The expansion code does the same thing. Since the operands were not defined with the correct types, this has the side effect of fixing operand folding since the expanded pseudo would never use SGPRs or inline immediates.
  llvm-svn: 230072
* R600: Use new fmad node. | Matt Arsenault | 2015-02-20 | 1 | -29/+20
  This enables a few useful combines that used to only use fma.
  Also since v_mad_f32 apparently does not support denormals, disable the existing cases that are custom handled if they are requested.
  llvm-svn: 230071
* Prefer SmallVector::append/insert over push_back loops. | Benjamin Kramer | 2015-02-17 | 1 | -14/+6
  Same functionality, but hoists the vector growth out of the loop.
  llvm-svn: 229500
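  A minimal sketch of the push_back-loop versus range-insert pattern named above. It uses std::vector as a stand-in for llvm::SmallVector so that it compiles without LLVM; the function names are invented for illustration.

```cpp
#include <cassert>
#include <vector>

// std::vector stands in for llvm::SmallVector in this sketch.
// Loop form: each push_back may trigger its own growth step.
std::vector<int> copyWithLoop(const std::vector<int> &Src) {
  std::vector<int> Dst;
  for (int V : Src)
    Dst.push_back(V);
  return Dst;
}

// Range form: the element count is known up front, so the destination
// grows at most once before the elements are copied in.
std::vector<int> copyWithInsert(const std::vector<int> &Src) {
  std::vector<int> Dst;
  Dst.insert(Dst.end(), Src.begin(), Src.end());
  return Dst;
}

// Usage: the two forms produce the same contents.
void demoRangeInsert() {
  const std::vector<int> Src = {1, 2, 3, 4};
  assert(copyWithLoop(Src) == copyWithInsert(Src));
}
```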
* AArch64: Safely handle the incoming sret call argument. | Andrew Trick | 2015-02-16 | 1 | -3/+3
  This adds a safe interface to the machine independent InputArg struct for accessing the index of the original (IR-level) argument. When a non-native return type is lowered, we generate the hidden machine-level sret argument on-the-fly. Before this fix, we were representing this argument as OrigArgIndex == 0, which is an outright lie. In particular this crashed in the AArch64 backend where we actually try to access the type of the original argument.
  Now we use a sentinel value for machine arguments that have no original argument index. AArch64, ARM, Mips, and PPC now check for this case before accessing the original argument.
  Fixes <rdar://19792160> Null pointer assertion in AArch64TargetLowering
  llvm-svn: 229413
* R600/SI: Implement correct f64 fdiv | Matt Arsenault | 2015-02-14 | 1 | -1/+65
  This version passes the OpenCL conformance test.
  llvm-svn: 229239
* R600/SI: Allow f64 inline immediates in i64 operands | Matt Arsenault | 2015-02-13 | 1 | -4/+2
  This requires considering the size of the operand when checking immediate legality.
  llvm-svn: 229135
* R600/SI: Add soffset operand to mubuf addr64 instruction | Tom Stellard | 2015-02-11 | 1 | -0/+1
  We were previously hard-coding soffset to 0.
  llvm-svn: 228775
* R600/SI: Expand misaligned 16-bit memory accesses | Tom Stellard | 2015-02-04 | 1 | -0/+5
  llvm-svn: 228190
* R600/SI: Make more store operations legal | Tom Stellard | 2015-02-04 | 1 | -9/+0
  v2i32, i32, trunc i32 to i16, and trunc i32 to i8 stores are legal for all address spaces. We had marked them as custom in order to lower them for the private address space, but this is no longer necessary.
  This enables lowering of misaligned stores of these types in the DAGLegalizer.
  llvm-svn: 228189
* R600/SI: 64-bit and larger memory access must be at least 4-byte aligned | Tom Stellard | 2015-02-02 | 1 | -4/+4
  This is true for SI only. CI+ supports unaligned memory accesses, but this requires driver support, so for now we disallow unaligned accesses for all GCN targets.
  llvm-svn: 227822
* Reuse a bunch of cached subtargets and remove getSubtarget calls without a Function argument. | Eric Christopher | 2015-01-30 | 1 | -34/+30
  llvm-svn: 227638
* R600/SI: Implement enableAggressiveFMAFusion | Matt Arsenault | 2015-01-29 | 1 | -1/+30
  Add tests for the various combines. This should always be at least cycle neutral on all subtargets for f64, and faster on some. For f32 we should prefer selecting v_mad_f32 over v_fma_f32.
  llvm-svn: 227484
* R600/SI: Add subtarget feature for if f32 fma is fast | Matt Arsenault | 2015-01-29 | 1 | -1/+1
  llvm-svn: 227483
* R600/SI: Add subtarget feature to enable VGPR spilling for all shader types | Tom Stellard | 2015-01-20 | 1 | -0/+6
  This is disabled by default, but can be enabled with the subtarget feature: 'vgpr-spilling'
  llvm-svn: 226597
* R600/SI: Fix bad code with unaligned byte vector loads | Matt Arsenault | 2015-01-14 | 1 | -3/+16
  Don't do the v4i8 -> v4f32 combine if the load will need to be expanded due to alignment. This stops adding instructions to repack into a single register that the v_cvt_ubyteN_f32 instructions read.
  llvm-svn: 225926
* Implement new way of expanding extloads. | Matt Arsenault | 2015-01-14 | 1 | -6/+9
  Now that the source and destination types can be specified, allow doing an expansion that doesn't use an EXTLOAD of the result type. Try to do a legal extload to an intermediate type and extend that if possible.
  This generalizes the special case custom lowering of extloads R600 has been using to work around this problem.
  This also happens to fix a bug that would incorrectly use more aligned loads than should be used.
  llvm-svn: 225925
* R600/SI: Add pattern for bitcasting fp immediates to integers | Tom Stellard | 2015-01-13 | 1 | -5/+5
  The backend now assumes that all immediates are integers. This allows us to simplify immediate handling code, because we no longer need to handle fp and integer immediates differently.
  llvm-svn: 225844
* R600/SI: Remove redundant setting expand on f64 vectors | Matt Arsenault | 2015-01-12 | 1 | -7/+0
  None of these are legal types already, so they default to Expand.
  llvm-svn: 225728
* R600/SI: Use RegisterOperands to specify which operands can accept immediates | Tom Stellard | 2015-01-12 | 1 | -4/+2
  There are some operands which can take either immediates or registers and we were previously using different register class to distinguish between operands that could take immediates and those that could not.
  This patch switches to using RegisterOperands which should simplify the backend by reducing the number of register classes and also make it easier to implement the assembler.
  llvm-svn: 225662
* R600/SI: Remove SIISelLowering::legalizeOperands() | Tom Stellard | 2015-01-08 | 1 | -173/+1
  Its functionality has been replaced by calling SIInstrInfo::legalizeOperands() from SIISelLowering::AdjustInstrPostInstrSelection() and running the SIFoldOperands and SIShrinkInstructions passes.
  llvm-svn: 225445
* [SelectionDAG] Allow targets to specify legality of extloads' result type (in addition to the memory type). | Ahmed Bougacha | 2015-01-08 | 1 | -17/+24
  The *LoadExt* legalization handling used to only have one type, the memory type. This forced users to assume that as long as the extload for the memory type was declared legal, and the result type was legal, the whole extload was legal.
  However, this isn't always the case. For instance, on X86, with AVX, this is legal:
    v4i32 load, zext from v4i8
  but this isn't:
    v4i64 load, zext from v4i8
  Whereas v4i64 is (arguably) legal, even without AVX2.
  Note that the same thing was done a while ago for truncstores (r46140), but I assume no one needed it yet for extloads, so here we go.
  Calls to getLoadExtAction were changed to add the value type, found manually in the surrounding code.
  Calls to setLoadExtAction were mechanically changed, by wrapping the call in a loop, to match previous behavior. The loop iterates over the MVT subrange corresponding to the memory type (FP vectors, etc...). I also pulled neighboring setTruncStoreActions into some of the loops; those shouldn't make a difference, as the additional types are illegal. (e.g., i128->i1 truncstores on PPC.)
  No functional change intended.
  Differential Revision: http://reviews.llvm.org/D6532
  llvm-svn: 225421
* R600/SI: Remove VReg_32 register class | Tom Stellard | 2015-01-07 | 1 | -6/+6
  Use VGPR_32 register class instead. These two register classes were identical and having separate classes was causing SIInstrInfo::isLegalOperands() to be overly conservative in some cases.
  This change is necessary to prevent future patches from missing a folding opportunity in fneg-fabs.ll.
  llvm-svn: 225382
* R600/SI: Add combine for isinfinite pattern | Matt Arsenault | 2015-01-06 | 1 | -0/+56
  llvm-svn: 225310
* R600/SI: Pattern match isinf to v_cmp_class instructions | Matt Arsenault | 2015-01-06 | 1 | -0/+33
  llvm-svn: 225307
* R600/SI: Add basic DAG combines for fp_class | Matt Arsenault | 2015-01-06 | 1 | -1/+48
  llvm-svn: 225306
* Enable (sext x) == C --> x == (trunc C) combine | Matt Arsenault | 2014-12-21 | 1 | -21/+2
  Extend the existing code which handles this for zext. This makes this more useful for targets with ZeroOrNegativeOne BooleanContent and obsoletes a custom combine SI uses for i1 setcc (sext(i1), 0, setne) since the constant will now be shrunk to i1.
  llvm-svn: 224691
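  The rewrite named in the subject above can be checked with plain integers. This is a sketch under assumptions, not the DAG combiner code: it uses int8_t and int32_t as the narrow and wide types, all function names are invented, and it assumes the rewrite is applied only when the constant is representable in the narrow type (the log does not state the exact conditions LLVM checks).

```cpp
#include <cassert>
#include <cstdint>

// Left-hand side: sign-extend the narrow value, then compare in the wide type.
bool compareAfterSext(int8_t X, int32_t C) {
  return static_cast<int32_t>(X) == C;
}

// True when C survives a round trip through int8_t, i.e. truncation is lossless.
bool fitsInInt8(int32_t C) { return C >= INT8_MIN && C <= INT8_MAX; }

// Right-hand side: compare in the narrow type against the truncated constant.
// If C does not fit, no sign-extended int8_t can equal it, so the result is false.
bool compareAfterTrunc(int8_t X, int32_t C) {
  return fitsInInt8(C) && X == static_cast<int8_t>(C);
}

// Exhaustive check over every int8_t value and a range of constants
// that includes values outside the narrow type.
void checkSextCompareRewrite() {
  for (int X = INT8_MIN; X <= INT8_MAX; ++X)
    for (int C = -300; C <= 300; ++C)
      assert(compareAfterSext(static_cast<int8_t>(X), C) ==
             compareAfterTrunc(static_cast<int8_t>(X), C));
}
```

  For a constant such as 255, which would be reachable through zero extension but not sign extension, both sides evaluate to false for every input.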
* R600/SI: Fix f64 inline immediates | Matt Arsenault | 2014-12-17 | 1 | -22/+21
  llvm-svn: 224458
* R600/SI: Don't promote f32 select to i32 | Matt Arsenault | 2014-12-12 | 1 | -2/+0
  This is nice for the instruction patterns, but it complicates min / max matching. The select doesn't have the correct type and would require looking through the bitcasts for the real float operands.
  llvm-svn: 224092
* R600/SI: Use unordered equal instructions | Matt Arsenault | 2014-12-11 | 1 | -4/+0
  llvm-svn: 224067