summaryrefslogtreecommitdiffstats
path: root/llvm/lib/CodeGen/SelectionDAG
Commit message (Collapse)AuthorAgeFilesLines
...
* Revert "Allow output constraints on "asm goto""Bill Wendling2020-01-071-2/+0
| | | | | | This reverts commit 52366088a8e42c2f1e96e8430b84b8b65ec3f7bc. I accidentally pushed this before supporting changes.
* Allow output constraints on "asm goto"Bill Wendling2020-01-071-0/+2
| | | | | | | | | | | | | | | | | Summary: Remove the restrictions that preventing "asm goto" from returning non-void values. The values returned by "asm goto" are only valid on the "fallthrough" path. Reviewers: jyknight, nickdesaulniers, hfinkel Reviewed By: jyknight, nickdesaulniers Subscribers: rsmith, hiraditya, llvm-commits, cfe-commits, craig.topper, rnk Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D69876
* [DAGCombiner] reduce shuffle of concat of same vectorSanjay Patel2020-01-071-0/+24
| | | | | | | | | | | | | | | | | This is possibly a small part towards solving PR42024: https://bugs.llvm.org/show_bug.cgi?id=42024 The vectorizer is creating shuffles of concat like this: %63 = shufflevector <4 x i64> %x, <4 x i64> undef, <8 x i32> <i32 0, i32 1, i32 2, i32 3, i32 0, i32 1, i32 2, i32 3> %64 = shufflevector <8 x i64> %63, <8 x i64> undef, <8 x i32> <i32 0, i32 4, i32 1, i32 5, i32 2, i32 6, i32 3, i32 7> That might be fixable in the vectorizers, but we're not allowed to fold that into a single shuffle in instcombine, so we should have a backend backstop to convert that into the likely simpler form: %64 = shufflevector <4 x i64> %x, <4 x i64> undef, <8 x i32> <i32 0, i32 0, i32 1, i32 1, i32 2, i32 2, i32 3, i32 3> Differential Revision: https://reviews.llvm.org/D72300
* [LegalizeTypes] Add widening support for STRICT_FSETCC/FSETCCSCraig Topper2020-01-062-0/+86
| | | | | | This patch adds widening which really just scalarizes because we don't have a strategy for the extra elements we would need to pad with. Differential Revision: https://reviews.llvm.org/D72193
* [DAG] DAGCombiner::XformToShuffleWithZero - use APInt::extractBits helper. NFCI.Simon Pilgrim2020-01-061-8/+4
|
* [NFC] Fix trivial typos in commentsJames Henderson2020-01-062-2/+2
| | | | | | | | Reviewed By: jhenderson Differential Revision: https://reviews.llvm.org/D72143 Patch by Kazuaki Ishizaki.
* [TargetLowering] Use SETCC input type to call getBooleanContents instead of ↵Craig Topper2020-01-051-1/+1
| | | | | | | | | the setcc result type. This isn't a functonal change since we also check the bit width is the same and the input type is integer. This guarantees the input and output type are the same. But passing the input type makes the code more readable.
* [DAGCombine] Don't check the legality of type when combine the SIGN_EXTEND_INREGQingShan Zhang2020-01-061-2/+3
| | | | | | | | | | | | | | | | | | | | | | This is the DAG node for SIGN_EXTEND_INREG : t21: v4i32 = sign_extend_inreg t18, ValueType:ch:v4i16 It has two operands. The first one is the value it want to extend, and the second one is the type to specify how to extend the value. For this example, it means that, it is signed extend the t18(v4i32) from v4i16 to v4i32. That is the semantics of c code: vector int foo(vector int m) { return m << 16 >> 16; } And it could be any vector type that hardware support the operation, though the type 'v4i16' is NOT legal for the target. When we are trying to combine the srl + sra, what we did now is calling the TLI.isOperationLegal(), which will also check the legality of the type. That doesn't make sense. Differential Revision: https://reviews.llvm.org/D70230
* [LegalizeVectorOps][X86] Enable expansion of vector fp_to_uint in ↵Craig Topper2020-01-041-1/+5
| | | | | | | | | | | | | LegalizeVectorOps to avoid scalarization. The code here isn't great in all caess. Particularly v4f64->v4i32 on 64-bit AVX targets. But there is some improvement in some configurations. There's definitely some issues with computeNumSignBits with X86ISD::STRICT_FCMP. As well as not being able to propagate sign bits through merge_values nodes that get created during custom legalization.
* [TargetLowering] In expandFP_TO_UINT, add proper extend or truncate for the ↵Craig Topper2020-01-041-0/+4
| | | | | | | | | | | | condition to feed the DstVT select. Previously, for vectors we created a vselect with a condition that didn't match what the target wanted according to getSetCCResultType. To make up for this, X86 had a special DAG combine to detect if the condition was all sign bits and then insert its own truncate or extend. By adding the extend/truncate here explicitly we can avoid that.
* [LegalizeVectorOps] Split most of ExpandStrictFPOp into a separate ↵Craig Topper2020-01-041-6/+13
| | | | | | | | | | | UnrollStrictFPOp method. Call that method from ExpandUINT_TO_FLOAT. ExpandStrictFPOp calls ExpandUINT_TO_FLOAT. Previously, ExpandUINT_TO_FLOAT returned SDValue() if it wasn't able to handle and needed to unroll. Then ExpandStrictFPOp would detect his SDValue() and do the unroll. After this change, ExpandUINT_TO_FLOAT will directly call UnrollStrictFPOp and return the unrolled result.
* [TargetLowering] SimplifyDemandedBits - call SimplifyMultipleUseDemandedBits ↵Simon Pilgrim2020-01-041-0/+11
| | | | | | | | | | | | | | for ISD::EXTRACT_VECTOR_ELT (REAPPLIED) This patch attempts to peek through vectors based on the demanded bits/elt of a particular ISD::EXTRACT_VECTOR_ELT node, allowing us to avoid dependencies on ops that have no impact on the extract. In particular this helps remove some unnecessary scalar->vector->scalar patterns. The wasm shift patterns are annoying - @tlively has indicated that the wasm vector shift codegen are to be refactored in the near-term and isn't considered a major issue. Reapplied after reversion at rL368660 due to PR42982 which was fixed at rGca7fdd41bda0. Differential Revision: https://reviews.llvm.org/D65887
* [DAGCombiner] fix miscompile in translating (X & undef) to shuffleSanjay Patel2020-01-031-1/+3
| | | | | See PR42982 for more context: https://bugs.llvm.org/show_bug.cgi?id=42982
* [LegalizeVectorOps] Pass the post-UpdateNodeOperands version of Op to ↵Craig Topper2020-01-031-11/+14
| | | | | | | | | | ExpandLoad/ExpandStore UpdateNodeOperands might CSE to another existing node. So we should make sure we're legalizing that node otherwise we might fail to hook up the operands properly. I've moved the result registration up to the caller to avoid having to pass both Result and Op into the functions where it might be confusing which is which. This address 2 other issues pointed out in D71861. Differential Revision: https://reviews.llvm.org/D72021
* Move tail call disabling code to target independent codeReid Kleckner2020-01-033-7/+23
| | | | | | | | | | | | | | | | | When the "disable-tail-calls" attribute was added, checks were added for it in various backends. Now this code has proliferated, and it is something the target is responsible for checking. Move that responsibility back to the ISels (fast, global, and SD). There's no major functionality change, except for targets that never implemented this check. This LLVM attribute was originally added in d9699bc7bdf0362173fcd256690f61a4d47429c2 (2015). Reviewers: echristo, MaskRay Differential Revision: https://reviews.llvm.org/D72118
* [DAGCombiner][X86][AArch64] Generalize `A-(A&B)`->`A&(~B)` fold (PR44448)Roman Lebedev2020-01-031-20/+9
| | | | | | | | | | | | | | | | | | | | | | | The fold 'A - (A & (B - 1))' -> 'A & (0 - B)' added in 8dab0a4a7d691f2704f1079538e0ef29548db159 is too specific. It should/can just be 'A - (A & B)' -> 'A & (~B)' Even if we don't manage to fold `~` into B, we have likely formed `ANDN` node. Also, this way there's less similar-but-duplicate folds. Name: X - (X & Y) -> X & (~Y) %o = and i32 %X, %Y %r = sub i32 %X, %o => %n = xor i32 %Y, -1 %r = and i32 %X, %n https://rise4fun.com/Alive/kOUl See https://bugs.llvm.org/show_bug.cgi?id=44448 https://reviews.llvm.org/D71499
* [DAGCombiner] `~(add X, -1)` -> `neg X` foldRoman Lebedev2020-01-031-0/+7
| | | | | | | | | | | | | | | The fold 'A - (A & (B - 1))' -> 'A & (0 - B)' added in 8dab0a4a7d691f2704f1079538e0ef29548db159 is too specific. It should just be 'A - (A & B)' -> 'A & (~B)', but we currently fail to sink that '~' into `(B - 1)`. Name: ~(X - 1) -> (0 - X) %o = add i32 %X, -1 %r = xor i32 %o, -1 => %r = sub i32 0, %X https://rise4fun.com/Alive/rjU
* [DAGCombine][X86][Thumb2/LowOverheadLoops] `A - (A & C)` -> `A & (~C)` fold ↵Roman Lebedev2020-01-031-0/+10
| | | | | | | | | | | | | | | | | | | | | (PR44448) While we do manage to fold integer-typed IR in middle-end, we can't do that for the main motivational case of pointers. There is @llvm.ptrmask() intrinsic which may or may not be helpful, but i'm not sure it is fully considered canonical yet, not everything is fully aware of it likely. Name: PR44448 ptr - (ptr & C) -> ptr & (~C) %bias = and i32 %ptr, C %r = sub i32 %ptr, %bias => %r = and i32 %ptr, ~C See https://bugs.llvm.org/show_bug.cgi?id=44448 https://reviews.llvm.org/D71499
* [NFC][DAGCombine] Clarify comment for 'A - (A & (B - 1))' foldRoman Lebedev2020-01-031-1/+1
|
* [DAGCombine][X86][AArch64] 'A - (A & (B - 1))' -> 'A & (0 - B)' fold (PR44448)Roman Lebedev2020-01-031-0/+15
| | | | | | | | | | | | | | | | | | | | | | | While we do manage to fold integer-typed IR in middle-end, we can't do that for the main motivational case of pointers. There is @llvm.ptrmask() intrinsic which may or may not be helpful, but i'm not sure it is fully considered canonical yet, not everything is fully aware of it likely. https://rise4fun.com/Alive/ZVdp Name: ptr - (ptr & (alignment-1)) -> ptr & (0 - alignment) %mask = add i64 %alignment, -1 %bias = and i64 %ptr, %mask %r = sub i64 %ptr, %bias => %highbitmask = sub i64 0, %alignment %r = and i64 %ptr, %highbitmask See https://bugs.llvm.org/show_bug.cgi?id=44448 https://reviews.llvm.org/D71499
* DAG: Use TargetConstant for FENCE operandsMatt Arsenault2020-01-021-4/+4
|
* [SelectionDAG] Simplify SelectionDAGBuilder::visitInlineAsmFangrui Song2020-01-021-3/+1
|
* [FPEnv] Default NoFPExcept SDNodeFlag to falseUlrich Weigand2020-01-025-10/+49
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The NoFPExcept bit in SDNodeFlags currently defaults to true, unlike all other such flags. This is a problem, because it implies that all code that transforms SDNodes without copying flags can introduce a correctness bug, not just a missed optimization. This patch changes the default to false. This makes it necessary to move setting the (No)FPExcept flag for constrained intrinsics from the visitConstrainedIntrinsic routine to the generic visit routine at the place where the other flags are set, or else the intersectFlagsWith call would erase the NoFPExcept flag again. In order to avoid making non-strict FP code worse, whenever SelectionDAGISel::SelectCodeCommon matches on a set of orignal nodes none of which can raise FP exceptions, it will preserve this property on all results nodes generated, by setting the NoFPExcept flag on those result nodes that would otherwise be considered as raising an FP exception. To check whether or not an SD node should be considered as raising an FP exception, the following logic applies: - For machine nodes, check the mayRaiseFPException property of the underlying MI instruction - For regular nodes, check isStrictFPOpcode - For target nodes, check a newly introduced isTargetStrictFPOpcode The latter is implemented by reserving a range of target opcodes, similarly to how memory opcodes are identified. (Note that there a bit of a quirk in identifying target nodes that are both memory nodes and strict FP nodes. To simplify the logic, right now all target memory nodes are automatically also considered strict FP nodes -- this could be fixed by adding one more range.) Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D71841
* [NFC] Fixes -Wrange-loop-analysis warningsMark de Wever2020-01-011-1/+1
| | | | | | This avoids new warnings due to D68912 adds -Wrange-loop-analysis to -Wall. Differential Revision: https://reviews.llvm.org/D71857
* DAG: Stop trying to fold FP -(x-y) -> y-x in getNode with nszMatt Arsenault2019-12-312-5/+10
| | | | | | | | | | | | | | This was increasing the number of instructions when fsub was legalized on AMDGPU with no signed zeros enabled. This fold should be guarded by hasOneUse, and I don't think getNode should be doing that. The same fold is already done as a regular combine through isNegatibleForFree. This does require duplicating, even though isNegatibleForFree does this combine already (and properly checks hasOneUse) to avoid one PPC regression. In the regression, the outer fneg has nsz but the fsub operand does not. isNegatibleForFree only sees the operand, and doesn't see it's used from a nsz context. A nsz parameter needs to be added and threaded through isNegatibleForFree to avoid this.
* [LegalizeVectorOps][AArch64] Stop asking for v4f16 fp_round and fp_extend to ↵Craig Topper2019-12-311-0/+5
| | | | | | | | | | | | | | | | | | | | | | | | be promoted. These operations are needed as building blocks for promoting so they can't be promoted themselves. This appeared to work because the fp_extend query type for operation actions is the result type, not the input type so it never triggered in the legalizer. For fp_round, the vector op legalizer just ended up creating a nop fp_extend that was elided by getNode, followed by a nop fp_round that was also elided by getNode. This was followed by a final fp_round from v4f32 back to vf416 which was CSEd to the original node. Then legalize vector ops just believed that node legalized to itself. LegalizeDAG took another crack at promoting it, but didn't have a handler so just skipped it with a debug message saying it wasn't promoted. This patch just removes the operation actions to avoid this non-sense. Found while trying to refactor LegalizeVectorOps to handle multiple result nodes better.
* [TargetLowering][AMDGPU] Make scalarizeVectorLoad return a pair of SDValues ↵Craig Topper2019-12-302-16/+6
| | | | | | | | | | | instead of creating a MERGE_VALUES node. NFCI This allows us to clean up some places that were peeking through the MERGE_VALUES node after the call. By returning the SDValues directly, we can clean that up. Unfortunately, there are several call sites in AMDGPU that wanted the MERGE_VALUES and now need to create their own.
* [SelectionDAT] Simplify SelectionDAGBuilder::visitInlineAsmFangrui Song2019-12-291-11/+3
| | | | | | Indirect C_Immediate or C_Other constraints have been excluded. Also simplify an unneeded change to indirect 'X' by D60942.
* [SelectionDAG] Disallow indirect "i" constraintFangrui Song2019-12-291-0/+6
| | | | | | | | | This allows us to delete InlineAsm::Constraint_i workarounds in SelectionDAGISel::SelectInlineAsmMemoryOperand overrides and TargetLowering::getInlineAsmMemConstraint overrides. They were introduced to X86 in r237517 to prevent crashes for constraints like "=*imr". They were later copied to other targets.
* SimplifyDemandedBits - Remove duplicate getOperand() call. NFC.Simon Pilgrim2019-12-281-9/+7
| | | | Pulled out from D56387 - cleanup variable names, move shift amount legalization inside if() of its only user and remove duplicate getOperand() call.
* [TargetLowering] Update comment to reference the correct compiler-rt ↵Craig Topper2019-12-271-1/+1
| | | | function the code is based on. NFC
* Delete setjmp_undefined_for_msvc workaround after llvm.setjmp was removedFangrui Song2019-12-271-8/+0
|
* Delete llvm.{sig,}{setjmp,longjmp} remnant after r136821Fangrui Song2019-12-271-6/+0
| | | | | | | Intrinsic has incorrect argument type! i32 (i32*)* @llvm.setjmp *wipes tear*
* [X86][FPEnv] Promote some float strictfp operations to double on ↵Craig Topper2019-12-261-0/+31
| | | | | | | | i686-pc-windows-msvc to match what we do for non-strict. The float libcalls are inlined in MSVC's math header where they just cast to double and use the double libcall. Do the same when we emit libcalls.
* [DebugInfo][SelectionDAG] Change order while transferring SDDbgValue to ↵Kristina Bessonova2019-12-261-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | another node SelectionDAG::transferDbgValues() can 'reattach' SDDbgValue from one to another node, but doesn't change its source order. If the destination node has the order greater than the SDDbgValue, there are two possible issues revealed later: * If debug info is attached to an instruction that is the first definition of a register, this ends up with a def-after-use and the debug info gets 'undef' later. * If MIR has another definition of a register above the debug info, the debug info may represent a source variable incorrectly because it appears (significantly) before an instruction corresponded to this debug info. So, the patch changes the order of an SDDbgValue when it is moved to a node with greater order. Reviewers: dblaikie, jmorse, aprantl Reviewed By: aprantl Subscribers: aprantl, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71175
* [X86] Enable STRICT_SINT_TO_FP/STRICT_UINT_TO_FP on X86 backendWang, Pengfei2019-12-261-3/+12
| | | | | | | | | | | | Summary: Enable STRICT_SINT_TO_FP/STRICT_UINT_TO_FP on X86 backend Reviewers: craig.topper, RKSimon, LiuChen3, uweigand, andrew.w.kaylor Subscribers: hiraditya, llvm-commits, LuoYuanke Tags: #llvm Differential Revision: https://reviews.llvm.org/D71871
* [SelectionDAG] Change SelectionDAGISel::{funcInfo,SDB} to use unique_ptrFangrui Song2019-12-232-24/+24
| | | | | CurDAG is referenced more than 2000 times and used in many gerated .cpp files. Don't touch it for now.
* [SelectionDAG] Don't repeatedly add a node to the worklist in ↵Fangrui Song2019-12-231-6/+3
| | | | | | ComputeLiveOutVRegInfo. NFC For sqlite3 amalgram, this decreases the number of Worklist.push_back calls (603084) by 10%.
* [FPEnv][X86] More strict int <-> FP conversion fixesUlrich Weigand2019-12-232-25/+55
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix several several additional problems with the int <-> FP conversion logic both in common code and in the X86 target. In particular: - The STRICT_FP_TO_UINT expansion emits a floating-point compare. This compare can raise exceptions and therefore needs to be a strict compare. I've made it signaling (even though quiet would also be correct) as signaling is the more usual default for an LT. This code exists both in common code and in the X86 target. - The STRICT_UINT_TO_FP expansion algorithm was incorrect for strict mode: it emitted two STRICT_SINT_TO_FP nodes and then used a select to choose one of the results. This can cause spurious exceptions by the STRICT_SINT_TO_FP that ends up not chosen. I've fixed the algorithm to use only a single STRICT_SINT_TO_FP instead. - The !isStrictFPEnabled logic in DoInstructionSelection would sometimes do the wrong thing because it calls getOperationAction using the result VT. But for some opcodes, incuding [SU]INT_TO_FP, getOperationAction needs to be called using the operand VT. - Remove some (obsolete) code in X86DAGToDAGISel::Select that would mutate STRICT_FP_TO_[SU]INT to non-strict versions unnecessarily. Reviewed by: craig.topper Differential Revision: https://reviews.llvm.org/D71840
* [DAGCombine] visitEXTRACT_SUBVECTOR - 'little to big' ↵Sanjay Patel2019-12-231-1/+17
| | | | | | | | | | | | | | | | | | | extract_subvector(bitcast()) support This moves the X86 specific transform from rL364407 into DAGCombiner to generically handle 'little to big' cases (for example: extract_subvector(v2i64 bitcast(v16i8))). This allows us to remove both the x86 implementation and the aarch64 bitcast(extract_subvector(bitcast())) combine. Earlier patches that dealt with regressions initially exposed by this patch: rG5e5e99c041e4 rG0b38af89e2c0 Patch by: @RKSimon (Simon Pilgrim) Differential Revision: https://reviews.llvm.org/D63815
* [DAGCombiner] Check term use before applying aggressive FSUB optimisationsCarl Ritson2019-12-231-3/+6
| | | | | | | | | | | | | | | | Summary: Without this check unnecessary FMA instructions are generated when the FSUB terms are reused. This also has the side-effect that the same value is computed to different levels of precision, which can create undesirable effects if the results are used together in subsequent computation. Reviewers: arsenm, nhaehnle, foad, tpr, dstuttard, spatel Reviewed By: arsenm Subscribers: jvesely, wdng, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71656
* [SelectionDAG] Copy FP flags when visiting a binary instruction.Valentin Churavy2019-12-221-0/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: We noticed in Julia that the sequence below no longer turned into a sequence of FMA instructions in LLVM 7+, but it did in LLVM 6. ``` %29 = fmul contract <4 x double> %wide.load, %wide.load16 %30 = fmul contract <4 x double> %wide.load13, %wide.load17 %31 = fmul contract <4 x double> %wide.load14, %wide.load18 %32 = fmul contract <4 x double> %wide.load15, %wide.load19 %33 = fadd fast <4 x double> %vec.phi, %29 %34 = fadd fast <4 x double> %vec.phi10, %30 %35 = fadd fast <4 x double> %vec.phi11, %31 %36 = fadd fast <4 x double> %vec.phi12, %32 ``` Unlike Clang, Julia doesn't set the `unsafe-fp-math=true` function attribute, but rather emits more local instruction flags. This partially undoes https://reviews.llvm.org/D46854 and if required I can try to minimize the test further. Reviewers: spatel, mcberg2017 Reviewed By: spatel Subscribers: chriselrod, merge_guards_bot, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71495
* [LegalizeDAG] Add return to the strict node handling in ↵Craig Topper2019-12-191-0/+1
| | | | PromoteLegalINT_TO_FP to prevent an invalid strict fp node from being created by falling into non-strict code path.
* Enable STRICT_FP_TO_SINT/UINT on X86 backendLiu, Chen32019-12-191-5/+26
| | | | | | This patch is mainly for custom lowering the vector operation. Differential Revision: https://reviews.llvm.org/D71592
* [FPEnv] Strict versions of llvm.minimum/llvm.maximumUlrich Weigand2019-12-181-0/+2
| | | | | | | | | | | | | Add new intrinsics llvm.experimental.constrained.minimum llvm.experimental.constrained.maximum as strict versions of llvm.minimum and llvm.maximum. Includes SystemZ back-end support. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D71624
* [SelectionDAGBuilder] Use getConstant instead of getTargetConstant to build ↵Craig Topper2019-12-181-2/+2
| | | | | | | | | | | | the offset for struct types in getUniformBase. getTargetConstant prevents any optimizations from operating on the value and basically says its already been iseled. But since we want the index to be in a register, this isn't true. Prior to this we were generating a vbroadcast with an immediate argument which is illegal and was flagged by the expensive checks bot.
* Reapply: [DebugInfo] Correctly handle salvaged casts and split fragments at ISelstozer2019-12-181-2/+19
| | | | | | | | This reverts commit 1f3dd83cc1f2b8f72b9d59e2b4221b12fb7f9a95, reapplying commit bb1b0bc4e57428ce364d3d6c075ff03cb8973462. The original commit failed on some builds seemingly due to the use of a bracketed constructor with an std::array, i.e. `std::array<> arr({...})`.
* Revert "[DebugInfo] Correctly handle salvaged casts and split fragments at ISel"stozer2019-12-181-19/+2
| | | | | | Reverted due to build failure on windows bots. This reverts commit bb1b0bc4e57428ce364d3d6c075ff03cb8973462.
* [DebugInfo] Correctly handle salvaged casts and split fragments at ISelstozer2019-12-181-2/+19
| | | | | | | | | | | | | | | Previously, LLVM had no functional way of performing casts inside of a DIExpression(), which made salvaging cast instructions other than Noop casts impossible. This patch enables the salvaging of casts by using the DW_OP_LLVM_convert operator for SExt and Trunc instructions. There is another issue which is exposed by this fix, in which fragment DIExpressions (which are preserved more readily by this patch) for values that must be split across registers in ISel trigger an assertion, as the 'split' fragments extend beyond the bounds of the fragment DIExpression causing an error. This patch also fixes this issue by checking the fragment status of DIExpressions which are to be split, and dropping fragments that are invalid.
* [X86] Add calculation for elements in structures in getting uniform base for ↵Wang, Pengfei2019-12-181-6/+28
| | | | | | | | | | | | | | | the Gather/Scatter intrinsic. Summary: Add calculation for elements in structures in getting uniform base for the Gather/Scatter intrinsic. Reviewers: craig.topper, c-rhodes, RKSimon Subscribers: hiraditya, llvm-commits, annita.zhang, LuoYuanke Tags: #llvm Differential Revision: https://reviews.llvm.org/D71442
OpenPOWER on IntegriCloud