summaryrefslogtreecommitdiffstats
path: root/llvm/test/CodeGen/AMDGPU/GlobalISel
Commit message (Collapse)AuthorAgeFilesLines
* Revert "[AMDGPU] Invert the handling of skip insertion."Nicolai Hähnle2020-02-031-4/+7
| | | | | | | | | This reverts commit 0dc6c249bffac9f23a605ce4e42a84341da3ddbd. The commit is reported to cause a regression in piglit/bin/glsl-vs-loop for Mesa. (cherry picked from commit a80291ce10ba9667352adcc895f9668144f5f616)
* [AMDGPU] Invert the handling of skip insertion.cdevadas2020-01-151-7/+4
| | | | | | | | | | | | | | | The current implementation of skip insertion (SIInsertSkip) makes it a mandatory pass required for correctness. Initially, the idea was to have an optional pass. This patch inserts the s_cbranch_execz upfront during SILowerControlFlow to skip over the sections of code when no lanes are active. Later, SIRemoveShortExecBranches removes the skips for short branches, unless there is a sideeffect and the skip branch is really necessary. This new pass will replace the handling of skip insertion in the existing SIInsertSkip Pass. Differential revision: https://reviews.llvm.org/D68092
* AMDGPU/GlobalISel: Select llvm.amdgcn.ds.ordered.{add|swap}Matt Arsenault2020-01-133-0/+11
|
* AMDGPU/GlobalISel: Add some baseline tests for vector extractMatt Arsenault2020-01-131-0/+468
| | | | | A future change will try to fold constant offsets into the loop which these will stress.
* AMDGPU/GlobalISel: Fix branch targets when emitting SI_IFMatt Arsenault2020-01-131-0/+61
| | | | | | | | The branch target needs to be changed depending on whether there is an unconditional branch or not. Loops also need to be similarly fixed, but compiling a simple testcase end to end requires another set of patches that aren't upstream yet.
* GlobalISel: Fix assertion on wide G_ZEXT sourcesMatt Arsenault2020-01-131-0/+25
| | | | It's possible to have a type that needs a mask greater than 64-bits.
* AMDGPU/GlobalISel: Don't use XEXEC class for SGPRsMatt Arsenault2020-01-1238-196/+193
| | | | | We don't use the xexec register classes for arbitrary values anymore. Avoids a test variance beween GlobalISel and SelectionDAG>
* AMDGPU/GlobalISel: Copy type when inserting readfirstlaneMatt Arsenault2020-01-128-29/+29
| | | | | getDefIgnoringCopies will fail to find any def if no type is set if we try to use it on the use's operand, so propagate the type.
* AMDGPU/GlobalISel: Clamp G_ZEXT source sizesMatt Arsenault2020-01-107-238/+967
| | | | | Also clamps G_SEXT/G_ANYEXT, but the implementation is more limited so fewer cases actually work.
* AMDGPU/GlobalISel: Select G_EXTRACT_VECTOR_ELTMatt Arsenault2020-01-092-0/+2099
| | | | | Doesn't try to do the fold into the base register of an add of a constant in the index like the DAG path does.
* AMDGPU/GlobalISel: Fix G_EXTRACT_VECTOR_ELT mapping for s-v caseMatt Arsenault2020-01-091-10/+16
| | | | | | If an SGPR vector is indexed with a VGPR, the actual indexing will be done on the SGPR and produce an SGPR. A copy needs to be inserted inside the waterwall loop to the VGPR result.
* TableGen/GlobalISel: Add way for SDNodeXForm to work on timmMatt Arsenault2020-01-091-0/+46
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The current implementation assumes there is an instruction associated with the transform, but this is not the case for timm/TargetConstant/immarg values. These transforms should directly operate on a specific MachineOperand in the source instruction. TableGen would assert if you attempted to define an equivalent GISDNodeXFormEquiv using timm when it failed to find the instruction matcher. Specially recognize SDNodeXForms on timm, and pass the operand index to the render function. Ideally this would be a separate render function type that looks like void renderFoo(MachineInstrBuilder, const MachineOperand&), but this proved to be somewhat mechanically painful. Add an optional operand index which will only be passed if the transform should only look at the one source operand. Theoretically it would also be possible to only ever pass the MachineOperand, and the existing renderers would check the parent. I think that would be somewhat ugly for the standard usage which may want to inspect other operands, and I also think MachineOperand should eventually not carry a pointer to the parent instruction. Use it in one sample pattern. This isn't a great example, since the transform exists to satisfy DAG type constraints. This could also be avoided by just changing the MachineInstr's arbitrary choice of operand type from i16 to i32. Other patterns have nontrivial uses, but this serves as the simplest example. One flaw this still has is if you try to use an SDNodeXForm defined for imm, but the source pattern uses timm, you still see the "Failed to lookup instruction" assert. However, there is now a way to avoid it.
* GlobalISel: Handle llvm.read_registerMatt Arsenault2020-01-091-0/+2
| | | | | Compared to the attempt in bdcc6d3d2638b3a2c99ab3b9bfaa9c02e584993a, this uses intermediate generic instructions.
* AMDGPU/GlobalISel: Fix argument lowering for vectors of pointersMatt Arsenault2020-01-091-14/+116
| | | | | | | When these arguments are broken down by the EVT based callbacks, the pointer information is lost. Hack around this by coercing the register types to be the expected pointer element type when building the remerge operations.
* AMDGPU/GlobalISel: Widen 16-bit shift amount sourcesMatt Arsenault2020-01-093-80/+94
| | | | | | This should be legal, but will require future selection work. 16-bit shift amounts were already removed from being legal, but this didn't adjust the transformation rules.
* AMDGPU/GlobalISel: Fix import of integer med3Matt Arsenault2020-01-094-0/+616
| | | | | This isn't too useful now, since nothing is currently trying to form min/max from cmp+select.
* AMDGPU/GlobalISel: Fix import of zext of s16 op patternsMatt Arsenault2020-01-094-12/+138
|
* AMDGPU/GlobalISel: Fix add of neg inline constant patternMatt Arsenault2020-01-091-0/+113
|
* Revert "Revert "[MIR] Target specific MIR formating and parsing""Daniel Sanders2020-01-088-91/+91
| | | | | | | There was an unguarded dereference of MF in a function that permitted nullptr. Fixed This reverts commit 71d64f72f934631aa2f12b9542c23f74f256f494.
* Revert "[MIR] Target specific MIR formating and parsing"Nico Weber2020-01-088-91/+91
| | | | | This reverts commit 3ef05d85be8c3666ebfa3ad986eb334da5195a47. It broke check-llvm on many bots, see comments on D69836.
* [MIR] Target specific MIR formating and parsingPeng Guo2020-01-088-91/+91
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Added MIRFormatter for target specific MIR formating and parsing with immediate and custom pseudo source values. Target machine can subclass MIRFormatter and implement custom logic for printing and parsing immediate and custom pseudo source values for better readability. * Target specific immediate mnemonic need to start with "." follows by identifier string. When MIR parser sees immediate it will call target specific parsing function. * Custom pseudo source value need to start with custom follows by double-quoted string. MIR parser will pass the quoted string to target specific PSV parsing function. * MIRFormatter have 2 helper functions to facilitate LLVM value printing and parsing for custom PSV if they refers LLVM values. Patch by Peng Guo Reviewers: dsanders, arsenm Reviewed By: dsanders Subscribers: wdng, jvesely, nhaehnle, hiraditya, jfb, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69836
* Revert "[MIR] Target specific MIR formating and parsing"Daniel Sanders2020-01-088-91/+91
| | | | | | Forgot to credit Peng in the commit message. This reverts commit be841f89d0014b1e0246a4feae941b2f74abd908.
* [MIR] Target specific MIR formating and parsingPeng Guo2020-01-088-91/+91
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Added MIRFormatter for target specific MIR formating and parsing with immediate and custom pseudo source values. Target machine can subclass MIRFormatter and implement custom logic for printing and parsing immediate and custom pseudo source values for better readability. * Target specific immediate mnemonic need to start with "." follows by identifier string. When MIR parser sees immediate it will call target specific parsing function. * Custom pseudo source value need to start with custom follows by double-quoted string. MIR parser will pass the quoted string to target specific PSV parsing function. * MIRFormatter have 2 helper functions to facilitate LLVM value printing and parsing for custom PSV if they refers LLVM values. Reviewers: dsanders, arsenm Reviewed By: dsanders Subscribers: wdng, jvesely, nhaehnle, hiraditya, jfb, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69836
* AMDGPU/GlobalISel: Fix scalar G_SELECT for arbitrary pointersMatt Arsenault2020-01-072-1/+97
| | | | | 4e85ca9562a588eba491e44bcbf73cb2f419780f missed updating the legal condition type set for pointers with any unrecognized address space.
* AMDGPU/GlobalISel: Add some missing G_SELECT testcasesMatt Arsenault2020-01-071-0/+142
|
* AMDGPU/GlobalISel: Fix missing test for s16 icmpMatt Arsenault2020-01-071-0/+236
|
* AMDGPU/GlobalISel: Fix readfirstlane pattern importMatt Arsenault2020-01-071-0/+63
| | | | | The imm folding optimization pattern failed to import. The instruction pattern was already working, but failing to fail on SGPR inputs.
* AMDGPU/GlobalISel: Fix import of s_abs_i32 patternMatt Arsenault2020-01-071-0/+105
|
* AMDGPU/GlobalISel: Select llvm.amdgcn.wqm.voteMatt Arsenault2020-01-071-0/+3
|
* AMDGPU/GlobalISel: Legalize G_READCYCLECOUNTERMatt Arsenault2020-01-061-0/+3
|
* AMDGPU/GlobalISel: Select G_UADDE/G_USUBEMatt Arsenault2020-01-064-0/+318
|
* AMDGPU/GlobalISel: Replace handling of boolean valuesMatt Arsenault2020-01-0651-2036/+2185
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This solves selection failures with generated selection patterns, which would fail due to inferring the SGPR reg bank for virtual registers with a set register class instead of VCC bank. Use instruction selection would constrain the virtual register to a specific class, so when the def was selected later the bank no longer was set to VCC. Remove the SCC reg bank. SCC isn't directly addressable, so it requires copying from SCC to an allocatable 32-bit register during selection, so these might as well be treated as 32-bit SGPR values. Now any scalar boolean value that will produce an outupt in SCC should be widened during RegBankSelect to s32. Any s1 value should be a vector boolean during selection. This makes the vcc register bank unambiguous with a normal SGPR during selection. Summary of how this should now work: - G_TRUNC is always a no-op, and never should use a vcc bank result. - SALU boolean operations should be promoted to s32 in RegBankSelect apply mapping - An s1 value means vcc bank at selection. The exception is for legalization artifacts that use s1, which are never VCC. All other contexts should infer the VCC register classes for s1 typed registers. The LLT for the register is now needed to infer the correct register class. Extensions with vcc sources should be legalized to a select of constants during RegBankSelect. - Copy from non-vcc to vcc ensures high bits of the input value are cleared during selection. - SALU boolean inputs should ensure the inputs are 0/1. This includes select, conditional branches, and carry-ins. There are a few somewhat dirty details. One is that G_TRUNC/G_*EXT selection ignores the usual register-bank from register class functions, and can't handle truncates with VCC result banks. I think this is OK, since the artifacts are specially treated anyway. This does require some care to avoid producing cases with vcc. There will also be no 100% reliable way to verify this rule is followed in selection in case of register classes, and violations manifests themselves as invalid copy instructions much later. Standard phi handling also only considers the bank of the result register, and doesn't insert copies to make the source banks match. This doesn't work for vcc, so we have to manually correct phi inputs in this case. We should add a verifier check to make sure there are no phis with mixed vcc and non-vcc register bank inputs. There's also some duplication with the LegalizerHelper, and some code which should live in the helper. I don't see a good way to share special knowledge about what types to use for intermediate operations depending on the bank for example. Using the helper to replace extensions with selects also seems somewhat awkward to me. Another issue is there are some contexts calling getRegBankFromRegClass that apparently don't have the LLT type for the register, but I haven't yet run into a real issue from this. This also introduces new unnecessary instructions in most cases, since we don't yet try to optimize out the zext when the source is known to come from a compare.
* GlobalISel: Implement lower for G_INTRINSIC_ROUNDMatt Arsenault2020-01-062-54/+779
| | | | | Mostly copied from AMDGPU lowering implementation, except used G_SITOFP instead of directly creating a select on -1.0, 0.0.
* GlobalISel: Fix unsupported legalize actionMatt Arsenault2020-01-061-0/+78
| | | | | | | | This would complain about invalid legalizer rules otherwise. Mark some operations as unsupported for AMDGPU. This currently seems to produce the same legalize error as when no rules are defined, but eventually this should produce a proper user facing error.
* AMDGPU/GlobalISel: Select scalar v2s16 G_BUILD_VECTORMatt Arsenault2020-01-061-0/+239
|
* AMDGPU/GlobalISel: Select more G_EXTRACTs correctlyMatt Arsenault2020-01-061-0/+20
| | | | | | | | This assumed a 32-bit extract size, which would produce invalid copies with 64-bit extracts. Handle the easy case. Ideally we would have a way to get the proper subreg index for any 32-bit offset, but there should probably be a tablegenerated way of getting the subreg index for any size and offset.
* GlobalISel: Scalarize all division operationsMatt Arsenault2020-01-044-0/+1732
| | | | | | This only handled G_SDIV, but they all are trivially scalarizable. Also define placeholder AMDGPU division legalizer rules.
* AMDGPU/GlobalISel: Refine SMRD selection rulesMatt Arsenault2020-01-042-32/+150
| | | | | Fix selecting these for volatile global loads, and ensure the loads are constant enough.
* AMDGPU/GlobalISel: Legalize more odd sized loadsMatt Arsenault2020-01-044-302/+34
| | | | | The attempts to widen sufficently aligned, odd sized loads wasn't consistently applied.
* AMDGPU/GlobalISel: Assume vcc phis for any vcc inputMatt Arsenault2020-01-042-84/+66
| | | | | | | This produces more intelligible looking results, more comparabble to the DAG output in the simplest cases. This is probably wrong in complex control flow, but RegBankSelect doesn't attempt analyzing if this is on a masked path for selecting the bank yet.
* AMDGPU/GlobalISel: Fix off by one in operand indexMatt Arsenault2020-01-033-116/+78
| | | | This should be looking at the RHS of the add for a constant.
* AMDGPU/GlobalISel: Correct MMO sizes in some testsMatt Arsenault2020-01-024-932/+98
| | | | | There intended to test non-extloads, but the memory size did not match the result size.
* AMDGPU/GlobalISel: Regenerate check linesMatt Arsenault2020-01-027-13076/+13076
| | | | | This avoids diff noise in a future commit from the check name change from the G_GEP->G_PTR_ADD rename.
* AMDGPU/GlobalISel: Select mul24 intrinsicsMatt Arsenault2019-12-301-0/+65
|
* [MIPS GlobalISel] Select bitreverse. RecommitPetar Avramovic2019-12-301-6/+15
| | | | | | | | | | | | | | | G_BITREVERSE is generated from llvm.bitreverse.<type> intrinsics, clang genrates these intrinsics from __builtin_bitreverse32 and __builtin_bitreverse64. Add lower and narrowscalar for G_BITREVERSE. Lower G_BITREVERSE on MIPS32. Recommit notes: Introduce temporary variables in order to make sure instructions get inserted into MachineFunction in same order regardless of compiler used to build llvm. Differential Revision: https://reviews.llvm.org/D71363
* AMDGPU/GlobalISel: Select llvm.amdgcn.fmad.ftzMatt Arsenault2019-12-301-0/+233
|
* AMDGPU/GlobalISel: Add select test for fexp2Matt Arsenault2019-12-301-0/+42
|
* GlobalISel: moreElementsVector for FP min/maxMatt Arsenault2019-12-302-56/+28
|
* AMDGPU/GlobalISel: Account for G_PHI result bankMatt Arsenault2019-12-302-12/+98
| | | | | | | | | Sometimes the result bank of the phi is already assigned to something, and should not be ignored. This is in preparation for additional boolean phi handling changes. Also refine the logic to fix some cases that were incorrectly deciding to use SGPRs.
* Revert "[MIPS GlobalISel] Select bitreverse"Dmitri Gribenko2019-12-301-15/+6
| | | | | | This reverts commit dbc136e0fe7e14c64dcb78e72321bb41af60afa4. It broke buildbots: http://lab.llvm.org:8011/builders/clang-x86_64-debian-fast/builds/21066
OpenPOWER on IntegriCloud