summaryrefslogtreecommitdiffstats
path: root/llvm/lib
Commit message (Collapse)AuthorAgeFilesLines
...
* [X86] Print %st(0) as %st to match what gcc inline asm uses as the clobber ↵Craig Topper2019-02-034-25/+24
| | | | | | | | | | | | | | | | | | | name to make MS inline asm work correctly Summary: When calculating clobbers for MS style inline assembly we fail if the asm clobbers stack top because we print st(0) and try to pass it through the gcc register name check. This was found with when I attempted to make a emms/femms clobber all ST registers. If you use emms/femms in MS inline asm we would try to use st(0) as the clobber name but clang would think that wasn't a valid clobber name. This also matches what objdump disassembly prints. It's also what is printed by gcc -S. Reviewers: RKSimon, rnk, efriedma, spatel, andreadb, lebedev.ri Reviewed By: rnk Subscribers: eraman, gbedwell, lebedev.ri, llvm-commits Differential Revision: https://reviews.llvm.org/D57621 llvm-svn: 352985
* [X86] Lower ISD::UADDO to use the Z flag instead of C flag when the RHS is a ↵Craig Topper2019-02-031-1/+7
| | | | | | | | | | | | | | | | | | | | | constant 1 to encourage INC formation. Summary: Add an additional combine to combineCarryThroughADD to reverse it back to the C flag to avoid regressions. I believe this catches the cases that D57547 got. Reviewers: RKSimon, spatel Reviewed By: spatel Subscribers: javed.absar, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D57637 llvm-svn: 352984
* [AMDGPU] Fix -Wunused-variable after rL352978Fangrui Song2019-02-031-1/+0
| | | | llvm-svn: 352982
* [InstSimplify] Missed optimization in math expression: log10(pow(10.0,x)) == ↵Dmitry Venikov2019-02-031-1/+9
| | | | | | | | | | | | | | | | x, log2(pow(2.0,x)) == x Summary: This patch enables folding following instructions under -ffast-math flag: log10(pow(10.0,x)) -> x, log2(pow(2.0,x)) -> x Reviewers: hfinkel, spatel, efriedma, craig.topper, zvi, majnemer, lebedev.ri Reviewed By: spatel, lebedev.ri Subscribers: lebedev.ri, llvm-commits Differential Revision: https://reviews.llvm.org/D41940 llvm-svn: 352981
* GlobalISel: Implement widenScalar for G_UNMERGE_VALUESMatt Arsenault2019-02-032-40/+85
| | | | | | | | | For the scalar case only. Also move the similar G_MERGE_VALUES handling to a separate function and cleanup to make them look more similar. llvm-svn: 352979
* GlobalISel: Implement widenScalar for G_EXTRACT vector sourcesMatt Arsenault2019-02-022-0/+44
| | | | | | Handle the basic element extract case. llvm-svn: 352978
* AMDGPU/GlobalISel: Avoid reporting illegal extloads as legalMatt Arsenault2019-02-021-1/+1
| | | | | | This avoids breaking a test in a future commit. llvm-svn: 352977
* AMDGPU/GlobalISel: Legalize icmp for pointer typesMatt Arsenault2019-02-021-1/+10
| | | | llvm-svn: 352976
* AMDGPU/GlobalISel: Legalize constant for pointer typesMatt Arsenault2019-02-021-3/+4
| | | | llvm-svn: 352975
* AMDGPU/GlobalISel: Legalize select for pointer typesMatt Arsenault2019-02-021-4/+12
| | | | llvm-svn: 352974
* GlobalISel: Legalization for inttoptr/ptrtointMatt Arsenault2019-02-023-10/+90
| | | | llvm-svn: 352973
* [X86][AVX] Enable INSERT_SUBVECTOR(SRC0, SHUFFLE(SRC1)) shuffle combiningSimon Pilgrim2019-02-021-15/+27
| | | | | | | | | | Push the insert_subvector up through the shuffle operands to help find more cross-lane shuffles. The is exposes a couple of minor issues that will be fixed shortly: Missed broadcast folds - we have a mixture of vzext_load lengths that need cleaning up combine-sdiv.ll - AVX1 SimplifyDemandedVectorElts failure (hits max depth due to a couple of extra bitcasts). llvm-svn: 352963
* [SDAG] Add SDNode/SDValue getConstantOperandAPInt helper. NFCI.Simon Pilgrim2019-02-022-16/+12
| | | | | | | | We already have the getConstantOperandVal helper which returns a uint64_t, but along comes the fuzzer and inserts a i128 -1 constant or something and the whole thing asserts....... I've updated a few obvious cases, and tried to make use of the const reference where possible, but there's more to do. A number of existing oss-fuzz tickets should be fixed if we start using APInt and perform value clamping where necessary. llvm-svn: 352961
* [LCSSA] Handle case with single new PHI faster.Florian Hahn2019-02-021-3/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If there is only a single available value, all uses must be dominated by the single value and there is no need to search for a reaching definition. This drastically speeds up LCSSA in some cases. For the test case from PR37202, it speeds up LCSSA construction by 4 times. Time-passes without this patch for test case from PR37202: Total Execution Time: 29.9285 seconds (29.9276 wall clock) ---User Time--- --System Time-- --User+System-- ---Wall Time--- --- Name --- 5.2786 ( 17.7%) 0.0021 ( 1.2%) 5.2806 ( 17.6%) 5.2808 ( 17.6%) Unswitch loops 4.3739 ( 14.7%) 0.0303 ( 18.1%) 4.4042 ( 14.7%) 4.4042 ( 14.7%) Loop-Closed SSA Form Pass 4.2658 ( 14.3%) 0.0192 ( 11.5%) 4.2850 ( 14.3%) 4.2851 ( 14.3%) Loop-Closed SSA Form Pass #2 2.2307 ( 7.5%) 0.0013 ( 0.8%) 2.2320 ( 7.5%) 2.2318 ( 7.5%) Loop Invariant Code Motion 2.0888 ( 7.0%) 0.0012 ( 0.7%) 2.0900 ( 7.0%) 2.0897 ( 7.0%) Unroll loops 1.6761 ( 5.6%) 0.0013 ( 0.8%) 1.6774 ( 5.6%) 1.6774 ( 5.6%) Value Propagation 1.3686 ( 4.6%) 0.0029 ( 1.8%) 1.3716 ( 4.6%) 1.3714 ( 4.6%) Induction Variable Simplification 1.1457 ( 3.8%) 0.0010 ( 0.6%) 1.1468 ( 3.8%) 1.1468 ( 3.8%) Loop-Closed SSA Form Pass #4 1.1384 ( 3.8%) 0.0005 ( 0.3%) 1.1389 ( 3.8%) 1.1389 ( 3.8%) Loop-Closed SSA Form Pass #6 1.1360 ( 3.8%) 0.0027 ( 1.6%) 1.1387 ( 3.8%) 1.1387 ( 3.8%) Loop-Closed SSA Form Pass #5 1.1331 ( 3.8%) 0.0010 ( 0.6%) 1.1341 ( 3.8%) 1.1340 ( 3.8%) Loop-Closed SSA Form Pass #3 Time passes with this patch Total Execution Time: 19.2802 seconds (19.2813 wall clock) ---User Time--- --System Time-- --User+System-- ---Wall Time--- --- Name --- 4.4234 ( 23.2%) 0.0038 ( 2.0%) 4.4272 ( 23.0%) 4.4273 ( 23.0%) Unswitch loops 2.3828 ( 12.5%) 0.0020 ( 1.1%) 2.3848 ( 12.4%) 2.3847 ( 12.4%) Unroll loops 1.8714 ( 9.8%) 0.0020 ( 1.1%) 1.8734 ( 9.7%) 1.8735 ( 9.7%) Loop Invariant Code Motion 1.7973 ( 9.4%) 0.0022 ( 1.2%) 1.7995 ( 9.3%) 1.8003 ( 9.3%) Value Propagation 1.4010 ( 7.3%) 0.0033 ( 1.8%) 1.4043 ( 7.3%) 1.4044 ( 7.3%) Induction Variable Simplification 0.9978 ( 5.2%) 0.0244 ( 13.1%) 1.0222 ( 5.3%) 1.0224 ( 5.3%) Loop-Closed SSA Form Pass #2 0.9611 ( 5.0%) 0.0257 ( 13.8%) 0.9868 ( 5.1%) 0.9868 ( 5.1%) Loop-Closed SSA Form Pass 0.5856 ( 3.1%) 0.0015 ( 0.8%) 0.5871 ( 3.0%) 0.5869 ( 3.0%) Unroll loops #2 0.4132 ( 2.2%) 0.0012 ( 0.7%) 0.4145 ( 2.1%) 0.4143 ( 2.1%) Loop Invariant Code Motion #3 Reviewers: efriedma, davide, mzolotukhin Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D57033 llvm-svn: 352960
* [LCSSA] Add expensive verification of LCSSA form for sub-loops.Florian Hahn2019-02-021-0/+6
| | | | | | | | | | | | | | This assertion makes sure all sub-loops are in LCSSA form before bringing their parent in LCSSA form. This precondition was added to formLCSSA in D56848. Reviewers: davide, efriedma, mzolotukhin Reviewed By: davide Differential Revision: https://reviews.llvm.org/D56921 llvm-svn: 352958
* [BPF] [BTF] Process FileName with absolute path correctlyYonghong Song2019-02-021-1/+1
| | | | | | | | | | | | | | | | | In IR, sometimes the following attributes for DIFile may be generated: filename: /home/yhs/test.c directory: /tmp The /tmp may represent the working directory of the compilation process. In such cases, since filename is with absolute path, the directory should be ignored by BTF. The filename alone is enough to get the source. Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Yonghong Song <yhs@fb.com> llvm-svn: 352952
* [ASan] Do not instrument other runtime functions with `__asan_handle_no_return`Julian Lettner2019-02-021-2/+3
| | | | | | | | | | | | | | Summary: Currently, ASan inserts a call to `__asan_handle_no_return` before every `noreturn` function call/invoke. This is unnecessary for calls to other runtime funtions. This patch changes ASan to skip instrumentation for functions calls marked with `!nosanitize` metadata. Reviewers: TODO Differential Revision: https://reviews.llvm.org/D57489 llvm-svn: 352948
* [AutoUpgrade] Fix AutoUpgrade for x86.seh.recoverfpMandeep Singh Grang2019-02-021-4/+5
| | | | | | | | | | | | | | | | Summary: This fixes the bug in https://reviews.llvm.org/D56747#inline-502711. Reviewers: efriedma Reviewed By: efriedma Subscribers: javed.absar, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D57614 llvm-svn: 352945
* Revert "[BPF] [BTF] Process FileName with absolute path correctly"Yonghong Song2019-02-011-1/+1
| | | | | | | | This reverts commit r352939. Some tests failed. Revert to unblock others. llvm-svn: 352941
* [AArch64] Fix unused variable [NFC]Mandeep Singh Grang2019-02-011-0/+1
| | | | llvm-svn: 352940
* [BPF] [BTF] Process FileName with absolute path correctlyYonghong Song2019-02-011-1/+1
| | | | | | | | | | | | | | | | | In IR, sometimes the following attributes for DIFile may be generated: filename: /home/yhs/test.c directory: /tmp The /tmp may represent the working directory of the compilation process. In such cases, since filename is with absolute path, the directory should be ignored by BTF. The filename alone is enough to get the source. Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Yonghong Song <yhs@fb.com> llvm-svn: 352939
* [CodeGen] Be as conservative about atomic accesses as for volatilePhilip Reames2019-02-013-2/+7
| | | | | | | | | | | | | | Background: At the moment, we record the AtomicOrdering of an access in the MMO, but also mark any atomic access as volatile in SelectionDAG. I'm working towards separating that. See https://reviews.llvm.org/D57601 for context. Update all usages of isVolatile in lib/CodeGen to preserve behaviour once atomic MMOs stop being also volatile. This is NFC in it's current form, but is essential for correctness once we make that final change. It useful to keep in mind that AtomicSDNode is not a parent of LoadSDNode, StoreSDNode, or LSBaseSDNode. As a result, any call to isVolatile on one of those static types doesn't need a companion isAtomic check. We should probably adjust that class hierarchy long term, but for now, that seperation is useful. I'm deliberately being conservative about handling. I want the change to stop adding volatile to be NFC itself, and then will work through places where we can be less conservative for atomics one by one in separate changes w/tests. Differential Revision: https://reviews.llvm.org/D57596 llvm-svn: 352937
* [WebAssembly] Add codegen support for the import_field attributeDan Gohman2019-02-015-16/+38
| | | | | | | | | This adds the LLVM side of https://reviews.llvm.org/D57602 -- the import_field attribute. See that patch for details. Differential Revision: https://reviews.llvm.org/D57603 llvm-svn: 352931
* [COFF, ARM64] Fix localaddress to handle stack realignment and variable size ↵Mandeep Singh Grang2019-02-017-32/+57
| | | | | | | | | | | | | | | | objects Summary: This fixes using the correct stack registers for SEH when stack realignment is needed or when variable size objects are present. Reviewers: rnk, efriedma, ssijaric, TomTan Reviewed By: rnk, efriedma Subscribers: javed.absar, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D57183 llvm-svn: 352923
* [X86][AVX] Add VMOVDDUP-VPBROADCASTQ execution domain mappingSimon Pilgrim2019-02-011-0/+4
| | | | | | | | Noticed in D57514. Differential Revision: https://reviews.llvm.org/D57519 llvm-svn: 352922
* [DebugInfo] Don't use realpath when looking up debug binary locations.Jordan Rupprecht2019-02-011-10/+1
| | | | | | | | | | | | | | | | | | | Summary: Using realpath makes assumptions about build systems that do not always hold true. The debug binary referred to from the .gnu_debuglink should exist in the same directory (or in a .debug directory, etc.), but the files may only exist as symlinks to a differently named files elsewhere, and using realpath causes that lookup to fail. This was added in r189250, and this is basically a revert + regression test case. Reviewers: dblaikie, samsonov, jhenderson Reviewed By: dblaikie Subscribers: llvm-commits, hiraditya Tags: #llvm Differential Revision: https://reviews.llvm.org/D57609 llvm-svn: 352916
* [opaque pointer types] Pass function type for CallBase::setCalledFunction.James Y Knight2019-02-014-14/+14
| | | | | | Differential Revision: https://reviews.llvm.org/D57174 llvm-svn: 352914
* [opaque pointer types] Pass value type to GetElementPtr creation.James Y Knight2019-02-0131-165/+206
| | | | | | | | | This cleans up all GetElementPtr creation in LLVM to explicitly pass a value type rather than deriving it from the pointer's element-type. Differential Revision: https://reviews.llvm.org/D57173 llvm-svn: 352913
* [opaque pointer types] Pass value type to LoadInst creation.James Y Knight2019-02-0163-289/+383
| | | | | | | | | This cleans up all LoadInst creation in LLVM to explicitly pass the value type rather than deriving it from the pointer's element-type. Differential Revision: https://reviews.llvm.org/D57172 llvm-svn: 352911
* [opaque pointer types] Pass function types to InvokeInst creation.James Y Knight2019-02-016-10/+12
| | | | | | | | | This cleans up all InvokeInst creation in LLVM to explicitly pass a function type rather than deriving it from the pointer's element-type. Differential Revision: https://reviews.llvm.org/D57171 llvm-svn: 352910
* [opaque pointer types] Pass function types to CallInst creation.James Y Knight2019-02-0141-134/+151
| | | | | | | | | This cleans up all CallInst creation in LLVM to explicitly pass a function type rather than deriving it from the pointer's element-type. Differential Revision: https://reviews.llvm.org/D57170 llvm-svn: 352909
* [InstCombine] Extra null-checking on TFE/LWE supportMichael Liao2019-02-011-4/+3
| | | | | | | | - If that operand is not ConstantInt, skip enabling TFE/LWE. Differential Revision: https://reviews.llvm.org/D57539 llvm-svn: 352904
* test commit (add blank line) NFCRoland Froese2019-02-011-0/+1
| | | | llvm-svn: 352897
* [DWARF v5] Fix DWARF emitter and consumer to produce/expect a uleb for a ↵Wolfgang Pieb2019-02-012-3/+6
| | | | | | | | | | location description's length. Reviewer: davide, JDevliegere Differential Revision: https://reviews.llvm.org/D57550 llvm-svn: 352889
* [AMDGPU] Fix for vector element insertionTim Corringham2019-02-011-5/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Incorrect code was generated when lowering insertelement operations for vectors with 8 or 16 bit elements. The value being inserted was not adjusted for the position of the element within the 32 bit word and so only the low element within each 32 bit word could receive the intended value. Fixed by simply replicating the value to each element of a congruent vector before the mask and or operation used to update the intended element. A number of affected LIT tests have been updated appropriately. before the mask & or into the intended Reviewers: arsenm, nhaehnle Reviewed By: arsenm Subscribers: llvm-commits, arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye Tags: #llvm Differential Revision: https://reviews.llvm.org/D57588 llvm-svn: 352885
* [SDAG] improve variable names; NFCSanjay Patel2019-02-011-23/+22
| | | | | | | | The version of FoldConstantArithmetic() that takes arbitrary nodes was confusingly naming those nodes as constants when they might not be; also "Cst" reads like "Cast". llvm-svn: 352884
* [X86][SSE] Use PSLLDQ/PSRLDQ to mask out zeroable ends of a shuffleSimon Pilgrim2019-02-011-0/+73
| | | | | | | | | | As suggested on PR40318, this patch uses PSLLDQ/PSRLDQ to lower shuffles to zero out the ends of a vector, leaving a sequential inner section. For pre-SSSE3 we do this for shuffles with zeros at either end (requiring up to 3 shifts), but once PSHUFB is available I've limited this to shuffles with a single zeroable end (2 shifts). Differential Revision: https://reviews.llvm.org/D56784 llvm-svn: 352883
* [TargetLowering] try harder to determine undef elements of vector binopsSanjay Patel2019-02-011-7/+61
| | | | | | | | | | | | | | | | | | | | | | | | | | | | This might be the start of tracking all vector element constants generally if we take it to its logical conclusion, but let's stop here and make sure this is correct/beneficial so far. The affected tests require a convoluted path before they get simplified currently because we don't call SimplifyDemandedVectorElts() from binops directly and don't modify the binop operands directly in SimplifyDemandedVectorElts(). That's why the tests all have a trailing shuffle to induce a chain reaction of transforms. So something like this is happening: 1. Improve the knowledge of undefs in the binop via a SimplifyDemandedVectorElts() call that originates from a shuffle. 2. Transfer that undef knowledge back to the shuffle mask user as more undef lanes. 3. Combine the modified shuffle by calling SimplifyDemandedVectorElts() again. 4. Translate the improved shuffle mask as undemanded lanes of build vector constants causing those to become full undef constants. 5. Simplify the binop now that it has a full undef operand. As we can see from the unchanged 'and' and 'or' tests, tracking undefs alone isn't a full solution. We would need to track zero and all-ones constants to improve those opcodes. We'd probably need to track NaN for FP ops too (assuming we don't have fast-math-flags set). Differential Revision: https://reviews.llvm.org/D57066 llvm-svn: 352880
* [X86][AVX] Combine INSERT_SUBVECTOR(SRC0, ↵Simon Pilgrim2019-02-011-3/+4
| | | | | | | | | | BITCAST(SHUFFLE(EXTRACT_SUBVECTOR(SRC1))) Enable peeking through one use bitcasts to the subvector shuffle. This still depends on the subvector being the same scalar-size but D57514 has already helped with the more tricky patterns llvm-svn: 352879
* [InstCombine] reduce duplicate code; NFCSanjay Patel2019-02-011-2/+1
| | | | | | | | | An unused variable problem was introduced with rL352870 and stubbed out with rL352871, but we can make a better fix by actually using the local variable in code rather than just the assert. llvm-svn: 352873
* [InstCombine] Fix -Wunused-variable when -DLLVM_ENABLE_ASSERTIONS=offFangrui Song2019-02-011-0/+1
| | | | llvm-svn: 352871
* [InstCombine] try to reduce x86 addcarry to generic uaddo intrinsicSanjay Patel2019-02-011-0/+33
| | | | | | | | | | | | | | | | | | | If we can reduce the x86-specific intrinsic to the generic op, it allows existing simplifications and value tracking folds. AFAICT, this always results in identical x86 codegen in the non-reduced case...which should be true because we semi-generically (too aggressively IMO) convert to llvm.uadd.with.overflow in CGP, so the DAG/isel must already combine/lower this intrinsic as expected. This isn't quite what was requested in: https://bugs.llvm.org/show_bug.cgi?id=40486 ...but we want to have these kinds of folds early for efficiency and to enable greater simplifications. For the case in the bug report where we have: _addcarry_u64(0, ahi, 0, &ahi) ...this gets completely simplified away in IR. Differential Revision: https://reviews.llvm.org/D57453 llvm-svn: 352870
* [AArch64] Optimize floating point materializationAdhemerval Zanella2019-02-012-28/+23
| | | | | | | | | | | | | | | This patch changes isFPImmLegal to return if the value can be enconded as the immediate operand of a logical instruction besides checking if for immediate field for fmov. This optimizes some floating point materization, inclusive values used on isinf lowering. Reviewed By: rengolin, efriedma, evandro Differential Revision: https://reviews.llvm.org/D57044 llvm-svn: 352866
* [X86][BdVer2] Transfer delays from the integer to the floating point unit.Roman Lebedev2019-02-011-1/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: I'm unable to find this number in the "AMD SOG for family 15h". llvm-exegesis measures the latencies of these instructions as `2`, which matches the latencies specified in "AMD SOG for family 15h". However if we look at Agner, Microarchitecture, "AMD Bulldozer, Piledriver, Steamroller and Excavator pipeline", "Data delay between different execution domains", the int->ivec transfer is listed as `8`..`10`cy of additional latency. Also, Agner's "Instruction tables", for Piledriver, lists their latencies as `12`, which is consistent with `2cy` from exegesis / AMD SOG + `10cy` transfer delay. Additional data point comes from the fact that Agner's "Instruction tables", for Jaguar, lists their latencies as `8`; and "AMD SOG for family 16h" does state the `+6cy` int->ivec delay, which is consistent with instr latency of `1` or `2`. Reviewers: andreadb, RKSimon, craig.topper Reviewed By: andreadb Subscribers: gbedwell, courbet, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D57300 llvm-svn: 352861
* Provide reason messages for unviable inliningYevgeny Rouban2019-02-013-17/+32
| | | | | | | | | | | | | InlineCost's isInlineViable() is changed to return InlineResult instead of bool. This provides messages for failure reasons and allows to get more specific messages for cases where callsites are not viable for inlining. Reviewed By: xbolva00, anemet Differential Revision: https://reviews.llvm.org/D57089 llvm-svn: 352849
* Revert r352750.James Henderson2019-02-011-27/+5
| | | | | | | This was causing a build bot failure: http://green.lab.llvm.org/green/job/clang-stage2-Rthinlto/15346/ llvm-svn: 352848
* [CodeGen] Don't scavenge non-saved regs in exception throwing functionsOliver Stannard2019-02-011-7/+9
| | | | | | | | | | | | | | | | Previously, LiveRegUnits was assuming that if a block has no successors and does not return, then no registers are live at the end of it (because the end of the block is unreachable). This was causing the register scavenger to use callee-saved registers to materialise stack frame addresses without saving them in the prologue. This would normally be fine, because the end of the block is unreachable, but this is not legal if the block ends by throwing a C++ exception. If this happens, the scratch register will be modified, but its previous value won't be preserved, so it doesn't get restored by the exception unwinder. Differential revision: https://reviews.llvm.org/D57381 llvm-svn: 352844
* [SLPVectorizer] Get rid of IndexQueue array from vectorizeStores. NFCI.Yevgeny Rouban2019-02-011-27/+18
| | | | | | | | Indices are checked as they are generated. No need to fill the whole array of indices. Differential Revision: https://reviews.llvm.org/D57144 llvm-svn: 352839
* [RISCV] Implement RV64D codegenAlex Bradbury2019-02-012-4/+30
| | | | | | | | | | | | This patch: * Adds necessary RV64D codegen patterns * Modifies CC_RISCV so it will properly handle f64 types (with soft float ABI) Note that in general there is no reason to try to select fcvt.w[u].d rather than fcvt.l[u].d for i32 conversions because fptosi/fptoui produce poison if the input won't fit into the target type. Differential Revision: https://reviews.llvm.org/D53237 llvm-svn: 352833
* [SelectionDAG] Support promotion of the FPOWI integer operandAlex Bradbury2019-02-012-0/+8
| | | | | | | | | | | | | | For targets where i32 is not a legal type (e.g. 64-bit RISC-V), LegalizeIntegerTypes must promote the integer operand of ISD::FPOWI. As this is a signed value, this should be sign-extended. This patch enables all tests in test/CodeGen/RISCVfloat-intrinsics.ll for RV64, as prior to this patch that file couldn't be compiled for RV64 due to an assertion when performing codegen for fpowi. Differential Revision: https://reviews.llvm.org/D54574 llvm-svn: 352832
OpenPOWER on IntegriCloud