summaryrefslogtreecommitdiffstats
path: root/llvm/lib
Commit message (Collapse)AuthorAgeFilesLines
* [MIPS GlobalISel] Select bitreversePetar Avramovic2019-12-302-1/+49
| | | | | | | | | | G_BITREVERSE is generated from llvm.bitreverse.<type> intrinsics, clang genrates these intrinsics from __builtin_bitreverse32 and __builtin_bitreverse64. Add lower and narrowscalar for G_BITREVERSE. Lower G_BITREVERSE on MIPS32. Differential Revision: https://reviews.llvm.org/D71363
* [MIPS GlobalISel] Select bswapPetar Avramovic2019-12-303-0/+72
| | | | | | | | | G_BSWAP is generated from llvm.bswap.<type> intrinsics, clang genrates these intrinsics from __builtin_bswap32 and __builtin_bswap64. Add lower and narrowscalar for G_BSWAP. Lower G_BSWAP on MIPS32, select G_BSWAP on MIPS32 revision 2 and later. Differential Revision: https://reviews.llvm.org/D71362
* [MCP] Add stats for backward copy propagation. NFC.Kai Luo2019-12-301-1/+5
|
* [Attributor] Use `changeUseAfterManifest` in AAValueSimplify manifestHideto Ueno2019-12-301-4/+2
| | | | | | | | | | | | | | Summary: This patch makes `AAValueSimplify` use `changeUsesAfterManifest` in `manifest`. This will invoke simple folding after the manifest. Reviewers: jdoerfert, sstefan1 Reviewed By: jdoerfert Subscribers: hiraditya, arphaman, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71972
* [SelectionDAT] Simplify SelectionDAGBuilder::visitInlineAsmFangrui Song2019-12-291-11/+3
| | | | | | Indirect C_Immediate or C_Other constraints have been excluded. Also simplify an unneeded change to indirect 'X' by D60942.
* [PowerPC] Exploit the rlwinm instructions for "and" with constantQingShan Zhang2019-12-302-0/+44
| | | | | | | | | | | | | | | | | | | | For now, PowerPC will using several instructions to get the constant and "and" it with the following case: define i32 @test1(i32 %a) { %and = and i32 %a, -2 ret i32 %and } However, we could exploit it with the rotate mask instructions. MB ME +----------------------+ |xxxxxxxxxxx00011111000| +----------------------+ 0 32 64 Notice that, we can only do it if the MB is larger than 32 and MB <= ME as RLWINM will replace the content of [0 - 32) with [32 - 64) even we didn't rotate it. Differential Revision: https://reviews.llvm.org/D71829
* [X86] Use APInt::isOneValue and ConstantSDNode::isOne. NFCCraig Topper2019-12-291-4/+4
| | | | | These are implemented slightly more efficiently than comparing to 1 in the case that the value is more than 64 bits.
* [X86] Use isOneConstant to simplify some code. NFCCraig Topper2019-12-291-2/+1
|
* [X86] Remove dyn_casts to ConstantSDNode for operand 1 of ↵Craig Topper2019-12-291-108/+99
| | | | | | | | X86ISD::VSRLI/VSRAI/VSRLI. Use getConstantOperandVal and APInt operations. These nodes should only ever be formed with an i8 TargetConstant so we don't need to check for it to be a constant. It's also always 8-bits so we don't need to use APInt compare functions.
* [SelectionDAG] Disallow indirect "i" constraintFangrui Song2019-12-2912-22/+7
| | | | | | | | | This allows us to delete InlineAsm::Constraint_i workarounds in SelectionDAGISel::SelectInlineAsmMemoryOperand overrides and TargetLowering::getInlineAsmMemConstraint overrides. They were introduced to X86 in r237517 to prevent crashes for constraints like "=*imr". They were later copied to other targets.
* [Attributor] AAUndefinedBehavior: Check for branches on undef value.Hideto Ueno2019-12-291-53/+139
| | | | | | | | | | | | | | | A branch is considered UB if it depends on an undefined / uninitialized value. At this point this handles simple UB branches in the form: `br i1 undef, ...` We query `AAValueSimplify` to get a value for the branch condition, so the branch can be more complicated than just: `br i1 undef, ...`. Patch By: Stefanos Baziotis (@baziotis) Reviewers: jdoerfert, sstefan1, uenoku Reviewed By: uenoku Differential Revision: https://reviews.llvm.org/D71799
* [X86] Stop accidentally custom type legalizing v4i32->v4f32 on SSE1 only ↵Craig Topper2019-12-281-2/+3
| | | | | | | | | targets. We had a Custom operation action for v4i32 on SSE1. But since v4i32 isn't legal until SSE2 this was not what was intended. The code that get executed was intended for op legalization and creates a bunch of v4i32 nodes that all end up scalarized.
* [LV] Use getMask() when printing recipe [NFCI]Gil Rapaport2019-12-292-3/+4
| | | | | | Use dedicated API for getting the mask instead of duplicating it. Differential Revision: https://reviews.llvm.org/D71964
* [X86] Remove a redundant (scalar_to_vector (extract_vector_elt X))) in ↵Craig Topper2019-12-281-6/+1
| | | | LowerUINT_TO_FP_i32. NFCI
* [X86] Fix -enable-machine-outliner for x86-32 after D48683Fangrui Song2019-12-281-3/+1
| | | | D48683 accidentally disabled -enable-machine-outliner for x86-32.
* Revert "[COFF] Make the autogenerated .weak.<name>.default symbols static"Martin Storsjö2019-12-281-13/+9
| | | | | | | This reverts commit 7ca86ee6494d4307333b300bae80e42df4a5140f. Apparently this change causes MS link.exe to error out with "LNK1235: corrupt or invalid COFF symbol table".
* [COFF] Make the autogenerated .weak.<name>.default symbols staticMartin Storsjö2019-12-281-9/+13
| | | | | | | | | | | | | | If we have references to the same extern_weak in multiple objects, all of them would generate external symbols with the same name. Make them static to avoid duplicate definitions; nothing should need to refer to this symbol outside of the current object. GCC/binutils seems to handle the same by not using a fixed string for the ".default" suffix, but instead using the name of some other defined external symbol from the same object (which is supposed to be unique among objects unless there's other duplicate definitions). Differential Revision: https://reviews.llvm.org/D71711
* Fix bots after a9ad65a2b34fNemanja Ivanovic2019-12-281-0/+1
| | | | | In the last commit, I neglected to initialize the new subtarget feature I added which caused failures on a few bots. This should fix that.
* [PowerPC] Change default for unaligned FP access for older subtargetsNemanja Ivanovic2019-12-283-1/+10
| | | | | | | | | | | This is a fix for https://bugs.llvm.org/show_bug.cgi?id=40554 Some CPU's trap to the kernel on unaligned floating point access and there are kernels that do not handle the interrupt. The program then fails with a SIGBUS according to the PR. This just switches the default for unaligned access to only allow it on recent server CPUs that are known to allow this. Differential revision: https://reviews.llvm.org/D71954
* SimplifyDemandedBits - Remove duplicate getOperand() call. NFC.Simon Pilgrim2019-12-281-9/+7
| | | | Pulled out from D56387 - cleanup variable names, move shift amount legalization inside if() of its only user and remove duplicate getOperand() call.
* [PowerPC] Modify the hasSideEffects of some VSX instructions from 1 to 0Kang Zhang2019-12-281-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | | Summary: If we didn't set the value for hasSideEffects bit in our td file, `llvm-tblgen` will set it as true for those instructions which has no match pattern. Below 6 instructions don't set the hasSideEffects flag and don't have match pattern, so their hasSideEffects flag will be set true by llvm-tblgen. But in fact below instructions don't modify any special register and don't have other SideEffects, they shouldn't have SideEffects. This patch is to modify the hasSideEffects of below instructions from 1 to 0. ``` VEXTUHLX VEXTUHRX VEXTUWLX VEXTUWRX VSPLTBs VSPLTHs ``` Reviewed By: jsji Differential Revision: https://reviews.llvm.org/D71391
* [TargetLowering] Update comment to reference the correct compiler-rt ↵Craig Topper2019-12-271-1/+1
| | | | function the code is based on. NFC
* Delete setjmp_undefined_for_msvc workaround after llvm.setjmp was removedFangrui Song2019-12-272-16/+0
|
* DebugInfo: Fix rangesBaseAddress DICompileUnit bitcode ↵David Blaikie2019-12-273-3/+4
| | | | | | serialization/deserialization Follow-up to r346788 review feedback from Adrian Prantl.
* AMDGPU/GlobalISel: Use SReg_32 for readfirstlane constrainingMatt Arsenault2019-12-271-1/+1
| | | | | This matches the DAG behavior where we don't use SReg_32_XM0 everywhere anymore, and fixes not coalescing the copies into m0.
* Hexagon: Fix missing tablegen mode commentMatt Arsenault2019-12-271-1/+1
|
* TII: Fix using Register for a subregister index argumentMatt Arsenault2019-12-272-2/+2
|
* AMDGPU: Use RegisterMatt Arsenault2019-12-271-9/+9
|
* TailDuplication: Clear NoPHIs propertyMatt Arsenault2019-12-271-0/+5
| | | | | | The early tail duplicator pass introduces new ones, so a MIR test that infers no phis since there were none on the input would fail the verifier after running.
* [ConstantRange] Respect destination bitwidth for cast results.Florian Hahn2019-12-271-2/+2
| | | | | | | | | | | We returning a full set, we should use ResultBitWidth. Otherwise we might it assertions when the resulting constant ranges are used later on. Reviewers: nikic, spatel, reames Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D71937
* [Matrix] Propagate and use shape info for binary operators.Florian Hahn2019-12-271-2/+74
| | | | | | | | | | | | | This patch extends the current shape propagation and shape aware lowering to also support binary operators. Those operators are uniform with respect to their shape (shape of the input operands is the same as the shape of their result). Reviewers: anemet, Gerolf, reames, hfinkel, andrew.w.kaylor Reviewed By: anemet Differential Revision: https://reviews.llvm.org/D70898
* [NFC][DA] Remove duplicate code in checkSrcSubscript and checkDstSubscriptDanilo Carvalho Grael2019-12-271-25/+16
| | | | | | | | | | | | | | | | | | | Summary: [DA] Move common code in checkSrcSubscript and checkDstSubscript to a new function checkSubscript. This avoids duplicate code and possible out of sync in the future. Reviewers: sebpop, jmolloy, reames Reviewed By: sebpop Subscribers: bmahjour, hiraditya, llvm-commits, amehsan Tags: #llvm Differential Revision: https://reviews.llvm.org/D71087 Patch by zhongduo.
* AMDGPU/GlobalISel: Fix extra result register in fdiv64 loweringMatt Arsenault2019-12-271-2/+1
| | | | | | There ended up being two result registers, which would fail on select. It was really defing a new temp register in the correct def position, instead of the correct result register.
* AMDGPU/GlobalISel: Select some 128-bit load/storesMatt Arsenault2019-12-271-4/+10
|
* AMDGPU: Use correct DebugLocMatt Arsenault2019-12-271-1/+1
|
* [X86] Allow v2i32->v2f32 strict and non-strict uint_to_fp to be widened to ↵Craig Topper2019-12-271-1/+1
| | | | | | | | v4i32->v4f32 under avx512. With avx512vl we get v4i32->v4f32 uint_to_fp instructions. With avx512f we get v16i32->v16f32 instructions which we can use to emulate v4i32->v4f32.
* [X86] Custom widen v2i32->v2f32 strict_sint_to_fp to avoid scalarization.Craig Topper2019-12-271-3/+19
|
* Delete llvm.{sig,}{setjmp,longjmp} remnant after r136821Fangrui Song2019-12-277-72/+1
| | | | | | | Intrinsic has incorrect argument type! i32 (i32*)* @llvm.setjmp *wipes tear*
* [X86] Custom widen 128/256-bit vXi32 fp_to_uint on avx512f targets without ↵Craig Topper2019-12-263-64/+92
| | | | | | | | | | | | | | | | | | | | | | avx512vl. Similar for vXi64 on avx512dq without avx512vl. Summary: Previously we did this with isel patterns that used garbage in the widened part of the source. But that's not valid for strictfp. So now we custom widen and use zeroes for the widened elemens for strictfp. This replaces D71864. Reviewers: RKSimon, spatel, andrew.w.kaylor, pengfei, LiuChen3 Reviewed By: pengfei Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71879
* [X86] Custom widen strict v2f32->v2i32 by padding with zeroes.Craig Topper2019-12-261-0/+12
| | | | | For non-strict, generic type legalization will take care of this, but that doesn't happen currently for strict nodes.
* [X86] Fix -Wmisleading-indentation after D71892Fangrui Song2019-12-261-0/+1
|
* [X86][FPEnv] Promote some float strictfp operations to double on ↵Craig Topper2019-12-262-2/+40
| | | | | | | | i686-pc-windows-msvc to match what we do for non-strict. The float libcalls are inlined in MSVC's math header where they just cast to double and use the double libcall. Do the same when we emit libcalls.
* [X86] Add custom legalization for strict_uint_to_fp v2i32->v2f32.Craig Topper2019-12-261-7/+16
| | | | | | | | I believe the algorithm we use for non-strict is exception safe for strict. The fsub won't generate any exceptions. After it we will have an exact version of the i32 integer in a double. Then we just round it to f32. That rounding will generate a precision exception if it can't be represented exactly.
* add custom operation for strict fpextend/fproundLiu, Chen32019-12-275-20/+57
| | | | Differential Revision: https://reviews.llvm.org/D71892
* Remove SrcVT only used in an assert and propagate query.Eric Christopher2019-12-261-2/+2
|
* [X86] Custom widen 128/256-bit vXi32 uint_to_fp on avx512f targets without ↵Craig Topper2019-12-262-62/+106
| | | | | | | | avx512vl. Similar for vXi64 sint_to_fp/uint_to_fp on avx512dq without avx512vl. Previously we widened these through isel patterns, but that didn't work for STRICT_ nodes. Those need to be padded with zeroes in the upper bits which is harder to do in isel patterns.
* [X86] Add custom widening for v2i32->v2f64 strict_uint_to_fp with AVX512F, ↵Craig Topper2019-12-262-10/+23
| | | | | | | | | | | | | | | but not AVX512VL. Previously we were widening with isel patterns, but that wasn't exception safe for strict FP. So now we widen to v4i32->v4f64 during type legalization. And then let op legalization further widen to v8i32->v8f64. The vec_int_to_fp.ll changes are caused by us no longer narrowing extracts of strict_uint_to_fp to the v4i32->v2f64 instruction without AVX512VL only to have isel rewiden it. Now we just keep it wide throughout. So we don't have an opportunity to narrow the load.
* [X86] Add custom widening for v2f64->v2i32 strict_fp_to_uint with avx512f, ↵Craig Topper2019-12-261-6/+15
| | | | | | | | | | | | | but not avx512vl. AVX512F added instruction for vector fp_to_uint conversions. With AVX512VL we can use a specific instruction that does v2f64->v4i32 with zeroes in the 2 extra elements. For non-strict nodes without AVX512VL we relied on type legalization to turn it to v4f64->v4i32 which would later be widened by op legalization to v8f64->v8i32. But type legalization doesn't currently widen strict nodes since it doesn't know how to safely and efficiently pad the extra elements. But for X86 we know padding with zeroes is safe and efficient so do that ourselves.
* [DebugInfo][SelectionDAG] Change order while transferring SDDbgValue to ↵Kristina Bessonova2019-12-261-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | another node SelectionDAG::transferDbgValues() can 'reattach' SDDbgValue from one to another node, but doesn't change its source order. If the destination node has the order greater than the SDDbgValue, there are two possible issues revealed later: * If debug info is attached to an instruction that is the first definition of a register, this ends up with a def-after-use and the debug info gets 'undef' later. * If MIR has another definition of a register above the debug info, the debug info may represent a source variable incorrectly because it appears (significantly) before an instruction corresponded to this debug info. So, the patch changes the order of an SDDbgValue when it is moved to a node with greater order. Reviewers: dblaikie, jmorse, aprantl Reviewed By: aprantl Subscribers: aprantl, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71175
* [Attributor] Add helper to change an instruction to `unreachable` instHideto Ueno2019-12-271-5/+4
| | | | | | | | | | | | | | Summary: Calling `changeToUnreachable` in `manifest` from different places might cause really unpredictable problems. As other deleting functions are doing, we need to change these instructions after all `manifest`. Reviewers: jdoerfert, sstefan1 Reviewed By: jdoerfert Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71910
OpenPOWER on IntegriCloud