summaryrefslogtreecommitdiffstats
path: root/llvm/lib/CodeGen/GlobalISel/LegalizerHelper.cpp
Commit message (Collapse)AuthorAgeFilesLines
* GlobalISel: Implement widenScalar for G_INSERT_VECTOR_ELTMatt Arsenault2019-10-251-0/+25
|
* GlobalISel: Implement lower for G_SADDO/G_SSUBOMatt Arsenault2019-10-161-0/+39
| | | | | | | Port directly from SelectionDAG, minus the path using ISD::SADDSAT/ISD::SSUBSAT. llvm-svn: 375042
* GlobalISel: Implement fewerElementsVector for G_BUILD_VECTORMatt Arsenault2019-10-091-0/+61
| | | | | | Turn it into a G_CONCAT_VECTORS of G_BUILD_VECTOR. llvm-svn: 374252
* GlobalISel: Partially implement lower for G_INSERTMatt Arsenault2019-10-071-0/+41
| | | | llvm-svn: 373946
* GlobalISel: Partially implement lower for G_EXTRACTMatt Arsenault2019-10-061-0/+35
| | | | | | Turn into shift and truncate. Doesn't yet handle pointers. llvm-svn: 373838
* GlobalISel: Implement widenScalar for G_SITOFP/G_UITOFP sourcesMatt Arsenault2019-10-011-4/+6
| | | | | | Legalize 16-bit G_SITOFP/G_UITOFP for AMDGPU. llvm-svn: 373287
* Add an operand to memory intrinsics to denote the "tail" marker.Amara Emerson2019-09-281-2/+4
| | | | | | | | | | | | | | We need to propagate this information from the IR in order to be able to safely do tail call optimizations on the intrinsics during legalization. Assuming it's safe to do tail call opt without checking for the marker isn't safe because the mem libcall may use allocas from the caller. This adds an extra immediate operand to the end of the intrinsics and fixes the legalizer to handle it. Differential Revision: https://reviews.llvm.org/D68151 llvm-svn: 373140
* [GlobalISel] Defer setting HasCalls on MachineFrameInfo to selection time.Amara Emerson2019-09-201-3/+0
| | | | | | | | | | | | | | | | | | | We currently always set the HasCalls on MFI during translation and legalization if we're handling a call or legalizing to a libcall. However, if that call is later optimized to a tail call then we don't need the flag. The flag being set to true causes frame lowering to always save and restore FP/LR, which adds unnecessary code. This change does the same thing as SelectionDAG and ports over some code that scans instructions after selection, using TargetInstrInfo to determine if target opcodes are known calls. Code size geomean improvements on CTMark: -O0 : 0.1% -Os : 0.3% Differential Revision: https://reviews.llvm.org/D67868 llvm-svn: 372443
* [GlobalISel] Partially revert r371901.Amara Emerson2019-09-161-2/+2
| | | | | | | | | | | r371901 was overeager and widenScalarDst() and the like in the legalizer attempt to increment the insert point given in order to add new instructions after the currently legalizing inst. In cases where the insertion point is not exactly the current instruction, then callers need to de-compensate for the behaviour by decrementing the insertion iterator before calling them. It's not a nice state of affairs, for now just undo the problematic parts of the change. llvm-svn: 372050
* [GlobalISel] Fix insertion point of new instructions to be after PHIs.Amara Emerson2019-09-131-3/+3
| | | | | | | | | | For some reason we sometimes insert new instructions one instruction before the first non-PHI when legalizing. This can result in having non-PHI instructions before PHIs, which mean that PHI elimination doesn't catch them. Differential Revision: https://reviews.llvm.org/D67570 llvm-svn: 371901
* [AArch64][GlobalISel] Tail call memory intrinsicsJessica Paquette2019-09-131-0/+43
| | | | | | | | | | | | | | | | | | | | | | Because memory intrinsics are handled differently than other calls, we need to check them for tail call eligiblity in the legalizer. This allows us to still inline them when it's beneficial to do so, but also tail call when possible. This adds simple tail calling support for when the intrinsic is followed by a return. It ports the attribute checks from `TargetLowering::isInTailCallPosition` into a similarly-named function in LegalizerHelper.cpp. The target-specific `isUsedByReturnOnly` hook is not ported here. Update tailcall-mem-intrinsics.ll to show that GlobalISel can now tail call memory intrinsics. Update legalize-memcpy-et-al.mir to have a case where we don't tail call. Differential Revision: https://reviews.llvm.org/D67566 llvm-svn: 371893
* AMDGPU/GlobalISel: Legalize G_FMADMatt Arsenault2019-09-131-0/+15
| | | | | | | | | | | | | | | Unlike SelectionDAG, treat this as a normally legalizable operation. In SelectionDAG this is supposed to only ever formed if it's legal, but I've found that to be restricting. For AMDGPU this is contextually legal depending on whether denormal flushing is allowed in the use function. Technically we currently treat the denormal mode as a subtarget feature, so custom lowering could be avoided. However I consider this to be a defect, and this should be contextually dependent on the controllable rounding mode of the parent function. llvm-svn: 371800
* GlobalISel: Add G_FMAD instructionMatt Arsenault2019-09-061-0/+2
| | | | llvm-svn: 371254
* GlobalISel: Add basic legalization for G_BITREVERSEMatt Arsenault2019-09-041-0/+19
| | | | llvm-svn: 370979
* [GlobalISel] Fix G_SEXT narrowScalar to bail out of unsupported type ↵Amara Emerson2019-09-041-3/+7
| | | | | | | | | | | combination. Similar to the issue with G_ZEXT that was fixed earlier, this is a quick to fall back if the source type is not exactly half of the dest type. Fixes the clang-cmake-aarch64-lld bot build. llvm-svn: 370847
* [AArch64][GlobalISel] Legalize 128 bit divisions to libcalls.Amara Emerson2019-09-031-4/+22
| | | | | | | | | Now that we have the infrastructure to support s128 types as parameters we can expand these to libcalls. Differential Revision: https://reviews.llvm.org/D66185 llvm-svn: 370823
* [AArch64][GlobalISel] Fix zext narrowScalar to use the right type when creatingAmara Emerson2019-09-021-3/+5
| | | | | | | | the merges. Fixes PR43171. llvm-svn: 370627
* [MIPS GlobalISel] Lower fptouiPetar Avramovic2019-08-301-0/+44
| | | | | | | | | | Add lower for G_FPTOUI. Algorithm is similar to the SDAG version in TargetLowering::expandFP_TO_UINT. Lower G_FPTOUI for MIPS32. Differential Revision: https://reviews.llvm.org/D66929 llvm-svn: 370431
* [GlobalISel] Replace hard coded dynamic alloca handling with G_DYN_STACKALLOC.Amara Emerson2019-08-271-0/+38
| | | | | | | | | | | This change moves the actual stack pointer manipulation into the legalizer, available to targets via lower(). The codegen is slightly different because we're using explicit masks instead of G_PTRMASK, and using G_SUB rather than adding a negative amount via G_GEP. Differential Revision: https://reviews.llvm.org/D66678 llvm-svn: 370104
* [GlobalISel] Factor narrowScalar for G_ASHR and G_LSHR. NFCPetar Avramovic2019-08-271-27/+11
| | | | | | | | | Main difference is in the way Hi for Long shift (HiL) is made. G_LSHR fills HiL with zeros, while G_ASHR fills HiL with sign bit value. Differential Revision: https://reviews.llvm.org/D66589 llvm-svn: 370064
* [GlobalISel] Fix narrowScalar for shifts to match algorithm from SDAGPetar Avramovic2019-08-271-10/+10
| | | | | | | | | Fix typos. Use Hi and Lo prefixes for Or instead of LHS and RHS to match names of surrounding variables. Differential Revision: https://reviews.llvm.org/D66587 llvm-svn: 370062
* GlobalISel: Don't create G_UADDE with constant false carry inMatt Arsenault2019-08-221-5/+7
| | | | | | | | The x86 tests are now broken (in paticular add-scalar.ll now hits the DAG fallback) due to not handling G_UADDO. The DAG x86 backend has a custom lowering for this, so that will need to be implemented. llvm-svn: 369673
* GlobalISel: Implement moreElementsVector for G_UNMERGE_VALUES sourcesMatt Arsenault2019-08-211-0/+20
| | | | | | | | This is necessary for handling <3 x s16> on AMDGPU, assuming this should be handled as 2 separate legalization actions. The alternative would be for fewerElementsVector to handle 3->2. llvm-svn: 369547
* [MIPS GlobalISel] NarrowScalar G_TRUNCPetar Avramovic2019-08-211-0/+15
| | | | | | | | | Add NarrowScalar for G_TRUNC when NarrowTy is half the size of source. NarrowScalar G_TRUNC to s32 for MIPS32. Differential Revision: https://reviews.llvm.org/D66202 llvm-svn: 369509
* [AArch64][GlobalISel] Add support for narrowScalar of G_ZEXTAmara Emerson2019-08-211-0/+18
| | | | | | | | We do this by merging the source with the high bits set to 0. Differential Revision: https://reviews.llvm.org/D66181 llvm-svn: 369480
* [AArch64][GlobalISel] Lower G_SHUFFLE_VECTOR with 1 elt src and 1 elt mask.Amara Emerson2019-08-161-1/+17
| | | | | | | | Again, it's weird that these are allowed. Since lowering support was added in r368709 we started crashing on compiling the neon intrinsics test in the test suite. This fixes the lowering to fold the 1 elt src/mask case into copies. llvm-svn: 369135
* [GlobalISel]: Fix lowering of G_Shuffle_vector where we pick up the wrong ↵Aditya Nandakumar2019-08-141-1/+1
| | | | | | | | source index https://reviews.llvm.org/D66182 llvm-svn: 368781
* [GlobalISel]: Fix lowering of G_SHUFFLE_VECTOR with scalar sourcesAditya Nandakumar2019-08-131-5/+10
| | | | | | https://reviews.llvm.org/D66171 llvm-svn: 368753
* GlobalISel: Partially implement fewerElementsVector G_UNMERGE_VALUESMatt Arsenault2019-08-131-0/+62
| | | | | | Odd sized vectors aren't handled yet. llvm-svn: 368713
* GlobalISel: Implement lower for G_SHUFFLE_VECTORMatt Arsenault2019-08-131-0/+40
| | | | llvm-svn: 368709
* [globalisel] Add G_SEXT_INREGDaniel Sanders2019-08-091-0/+116
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Targets often have instructions that can sign-extend certain cases faster than the equivalent shift-left/arithmetic-shift-right. Such cases can be identified by matching a shift-left/shift-right pair but there are some issues with this in the context of combines. For example, suppose you can sign-extend 8-bit up to 32-bit with a target extend instruction. %1:_(s32) = G_SHL %0:_(s32), i32 24 # (I've inlined the G_CONSTANT for brevity) %2:_(s32) = G_ASHR %1:_(s32), i32 24 %3:_(s32) = G_ASHR %2:_(s32), i32 1 would reasonably combine to: %1:_(s32) = G_SHL %0:_(s32), i32 24 %2:_(s32) = G_ASHR %1:_(s32), i32 25 which no longer matches the special case. If your shifts and extend are equal cost, this would break even as a pair of shifts but if your shift is more expensive than the extend then it's cheaper as: %2:_(s32) = G_SEXT_INREG %0:_(s32), i32 8 %3:_(s32) = G_ASHR %2:_(s32), i32 1 It's possible to match the shift-pair in ISel and emit an extend and ashr. However, this is far from the only way to break this shift pair and make it hard to match the extends. Another example is that with the right known-zeros, this: %1:_(s32) = G_SHL %0:_(s32), i32 24 %2:_(s32) = G_ASHR %1:_(s32), i32 24 %3:_(s32) = G_MUL %2:_(s32), i32 2 can become: %1:_(s32) = G_SHL %0:_(s32), i32 24 %2:_(s32) = G_ASHR %1:_(s32), i32 23 All upstream targets have been configured to lower it to the current G_SHL,G_ASHR pair but will likely want to make it legal in some cases to handle their faster cases. To follow-up: Provide a way to legalize based on the constant. At the moment, I'm thinking that the best way to achieve this is to provide the MI in LegalityQuery but that opens the door to breaking core principles of the legalizer (legality is not context sensitive). That said, it's worth noting that looking at other instructions and acting on that information doesn't violate this principle in itself. It's only a violation if, at the end of legalization, a pass that checks legality without being able to see the context would say an instruction might not be legal. That's a fairly subtle distinction so to give a concrete example, saying %2 in: %1 = G_CONSTANT 16 %2 = G_SEXT_INREG %0, %1 is legal is in violation of that principle if the legality of %2 depends on %1 being constant and/or being 16. However, legalizing to either: %2 = G_SEXT_INREG %0, 16 or: %1 = G_CONSTANT 16 %2:_(s32) = G_SHL %0, %1 %3:_(s32) = G_ASHR %2, %1 depending on whether %1 is constant and 16 does not violate that principle since both outputs are genuinely legal. Reviewers: bogner, aditya_nandakumar, volkan, aemerson, paquette, arsenm Subscribers: sdardis, jvesely, wdng, nhaehnle, rovka, kristof.beyls, javed.absar, hiraditya, jrtc27, atanasyan, Petar.Avramovic, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61289 llvm-svn: 368487
* GlobalISel: pack various parameters for lowerCall into a struct.Tim Northover2019-08-091-5/+14
| | | | | | | | | I've now needed to add an extra parameter to this call twice recently. Not only is the signature getting extremely unwieldy, but just updating all of the callsites and implementations is a pain. Putting the parameters in a struct sidesteps both issues. llvm-svn: 368408
* Re-commit "[GlobalISel] Add legalization support for non-power-2 loads and ↵Amara Emerson2019-08-021-4/+95
| | | | | | | | | | stores"" This is an old commit that exposed a bug in the GISel importer, which caused non-truncating stores to be selected for truncating store patterns. Now that's been fixed in r367737 this can go back in. llvm-svn: 367739
* GlobalISel: Lower scalarizing unmerge of a vector to shiftsMatt Arsenault2019-08-011-0/+35
| | | | | | | | | | | | | | AMDGPU sometimes has legal s16 and <2 x s16> operations, but all registers are really 32-bit. An unmerge destination really should ben widened to a 32-bit register. If widening a scalarizing vector with a target size that matches the vector size, bitcast to integer and extract the relevant bits with shifts. I'm not sure if this is the right place for this. This could arguably be part of widenScalar for the result. I also have a growing feeling that we're missing a bitcast legalize action. llvm-svn: 367604
* GlobalISel: Fix widenScalar for G_MERGE_VALUES to pointerMatt Arsenault2019-08-011-1/+3
| | | | | | | AMDGPU testcase isn't broken now, but will be in a future patch without this. llvm-svn: 367591
* GlobalISel: moreElementsVector for G_LOAD/G_STOREMatt Arsenault2019-08-011-1/+11
| | | | | | | AMDGPU change and test is a placeholder until a future patch with complete handling. llvm-svn: 367503
* [AArch64][GlobalISel] Implement narrowing of G_SEXT.Amara Emerson2019-07-261-0/+20
| | | | | | | | We need this to narrow a sext to s128. Differential Revision: https://reviews.llvm.org/D65357 llvm-svn: 367164
* [AArch64][GlobalISel] Fix a crash during s128 G_ICMP legalization due to ↵Amara Emerson2019-07-241-4/+4
| | | | | | | | | | | r366317. r366317 added a legalization for s128 G_ICMP narrow scalar which tried to hard code the result type of the new legalized G_SELECT. Change this to instead use type of the original G_ICMP result and allow the target to legalize it if necessary later. llvm-svn: 366943
* [GlobalISel] Translate calls to memcpy et al to G_INTRINSIC_W_SIDE_EFFECTs ↵Amara Emerson2019-07-191-0/+49
| | | | | | | | | | | | | | and legalize later. I plan on adding memcpy optimizations in the GlobalISel pipeline, but we can't do that unless we delay lowering to actual function calls. This patch changes the translator to generate G_INTRINSIC_W_SIDE_EFFECTS for these functions, and then have each target specify that using the new custom legalizer for intrinsics hook that they want it expanded it a libcall. Differential Revision: https://reviews.llvm.org/D64895 llvm-svn: 366516
* GlobalISel: Handle widenScalar of arbitrary G_MERGE_VALUES sourcesMatt Arsenault2019-07-171-48/+84
| | | | | | | | | | | Extract the sources to the GCD of the original size and target size, padding with implicit_def as necessary. Also fix the case where the requested source type is wider than the original result type. This was ignoring the type, and just using the destination. Do the operation in the requested type and truncate back. llvm-svn: 366367
* GlobalISel: Handle more cases for widenScalar of G_MERGE_VALUESMatt Arsenault2019-07-171-4/+23
| | | | | | | | | | | | Use an anyext to the requested type for the leftover operand to produce a slightly wider type, and then truncate the final merge. I have another implementation almost ready which handles arbitrary widens, but I think it produces worse code in this example (which I think is 90% due to not folding redundant copies or folding out implicit_def users), so I wanted to add this as a baseline first. llvm-svn: 366366
* [MIPS GlobalISel] ClampScalar and select pointer G_ICMPPetar Avramovic2019-07-171-0/+36
| | | | | | | | | | | Add narrowScalar to half of original size for G_ICMP. ClampScalar G_ICMP's operands 2 and 3 to to s32. Select G_ICMP for pointers for MIPS32. Pointer compare is same as for integers, it is enough to declare them as legal type. Differential Revision: https://reviews.llvm.org/D64856 llvm-svn: 366317
* GlobalISel: Implement narrowScalar for vector extract/insert indexesMatt Arsenault2019-07-151-0/+11
| | | | llvm-svn: 366113
* Delete dead storesFangrui Song2019-07-121-3/+1
| | | | llvm-svn: 365903
* GlobalISel: Legalization for G_FMINNUM/G_FMAXNUMMatt Arsenault2019-07-101-0/+46
| | | | llvm-svn: 365658
* GlobalISel: Implement lower for G_FCOPYSIGNMatt Arsenault2019-07-091-0/+50
| | | | | | | | | In SelectionDAG AMDGPU treated these as legal, but this was mostly because the bitcasts required for FP types were painful. Theoretically the bitpattern should eventually match to bfi, so don't bother trying to get the patterns to import. llvm-svn: 365583
* [MIPS GlobalISel] Register bank select for G_PHI. Select i64 phiPetar Avramovic2019-07-091-0/+28
| | | | | | | | | | | | | | | Select gprb or fprb when def/use register operand of G_PHI is used/defined by either: copy to/from physical register or instruction with only one mapping available for that use/def operand. Integer s64 phi is handled with narrowScalar when mapping is applied, produced artifacts are combined away. Manually set gprb to all register operands of instructions created during narrowScalar. Differential Revision: https://reviews.llvm.org/D64351 llvm-svn: 365494
* GlobalISel: widenScalar for G_BUILD_VECTORMatt Arsenault2019-07-081-0/+19
| | | | llvm-svn: 365320
* GlobalISel: Fix widenScalar for pointer typed G_MERGE_VALUESMatt Arsenault2019-07-031-1/+1
| | | | llvm-svn: 365093
* GlobalISel: Try to widen merges with other mergesMatt Arsenault2019-07-011-2/+28
| | | | | | | | If the requested source type an be used as a merge source type, create a merge of merges. This avoids creating large, illegal extensions and bit-ops directly to the result type. llvm-svn: 364841
OpenPOWER on IntegriCloud