path: root/llvm/lib/CodeGen
Commit message  Author  Age  Files  Lines
...
* [LegalizeVectorOps] Split most of ExpandStrictFPOp into a separate ↵Craig Topper2020-01-041-6/+13
| UnrollStrictFPOp method. Call that method from ExpandUINT_TO_FLOAT.
|
| ExpandStrictFPOp calls ExpandUINT_TO_FLOAT. Previously, ExpandUINT_TO_FLOAT returned SDValue() if it wasn't able to handle the node and needed to unroll. Then ExpandStrictFPOp would detect this SDValue() and do the unroll.
|
| After this change, ExpandUINT_TO_FLOAT will directly call UnrollStrictFPOp and return the unrolled result.
* GlobalISel: Scalarize all division operationsMatt Arsenault2020-01-041-0/+3
| | | | | | This only handled G_SDIV, but they all are trivially scalarizable. Also define placeholder AMDGPU division legalizer rules.
* Revert "[SCEV] Move ScalarEvolutionExpander.cpp to Transforms/Utils (NFC)."Florian Hahn2020-01-041-1/+1
| | | | | This reverts commit 51ef53f3bd23559203fe9af82ff2facbfedc1db3, as it breaks some bots.
* [SCEV] Move ScalarEvolutionExpander.cpp to Transforms/Utils (NFC).Florian Hahn2020-01-041-1/+1
| SCEVExpander modifies the underlying function so it is more suitable in Transforms/Utils, rather than Analysis. This allows using other transform utils in SCEVExpander.
|
| Reviewers: sanjoy.google, efriedma, reames
|
| Reviewed By: sanjoy.google
|
| Differential Revision: https://reviews.llvm.org/D71537
* GlobalISel: Define G_READCYCLECOUNTERMatt Arsenault2020-01-041-0/+2
|
* [TargetLowering] SimplifyDemandedBits - call SimplifyMultipleUseDemandedBits ↵Simon Pilgrim2020-01-041-0/+11
| for ISD::EXTRACT_VECTOR_ELT (REAPPLIED)
|
| This patch attempts to peek through vectors based on the demanded bits/elt of a particular ISD::EXTRACT_VECTOR_ELT node, allowing us to avoid dependencies on ops that have no impact on the extract. In particular this helps remove some unnecessary scalar->vector->scalar patterns.
|
| The wasm shift patterns are annoying - @tlively has indicated that the wasm vector shift codegen is to be refactored in the near term, so this isn't considered a major issue.
|
| Reapplied after reversion at rL368660 due to PR42982 which was fixed at rGca7fdd41bda0.
|
| Differential Revision: https://reviews.llvm.org/D65887
* GlobalISel: Add type argument to getRegBankFromRegClassMatt Arsenault2020-01-031-7/+13
| | | | | | AMDGPU can't unambiguously go back from the selected instruction register class to the register bank without knowing if this was used in a boolean context.
* [DAGCombiner] fix miscompile in translating (X & undef) to shuffleSanjay Patel2020-01-031-1/+3
| | | | | See PR42982 for more context: https://bugs.llvm.org/show_bug.cgi?id=42982
* [LegalizeVectorOps] Pass the post-UpdateNodeOperands version of Op to ↵Craig Topper2020-01-031-11/+14
| ExpandLoad/ExpandStore
|
| UpdateNodeOperands might CSE to another existing node. So we should make sure we're legalizing that node, otherwise we might fail to hook up the operands properly. I've moved the result registration up to the caller to avoid having to pass both Result and Op into the functions where it might be confusing which is which.
|
| This addresses 2 other issues pointed out in D71861.
|
| Differential Revision: https://reviews.llvm.org/D72021
* Move tail call disabling code to target independent codeReid Kleckner2020-01-034-8/+28
| When the "disable-tail-calls" attribute was added, checks were added for it in various backends. Now this code has proliferated, and it is something the target is responsible for checking. Move that responsibility back to the ISels (fast, global, and SD).
|
| There's no major functionality change, except for targets that never implemented this check.
|
| This LLVM attribute was originally added in d9699bc7bdf0362173fcd256690f61a4d47429c2 (2015).
|
| Reviewers: echristo, MaskRay
|
| Differential Revision: https://reviews.llvm.org/D72118
* [DAGCombiner][X86][AArch64] Generalize `A-(A&B)`->`A&(~B)` fold (PR44448)Roman Lebedev2020-01-031-20/+9
| The fold 'A - (A & (B - 1))' -> 'A & (0 - B)' added in 8dab0a4a7d691f2704f1079538e0ef29548db159 is too specific. It should/can just be 'A - (A & B)' -> 'A & (~B)'.
|
| Even if we don't manage to fold `~` into B, we have likely formed an `ANDN` node. Also, this way there are fewer similar-but-duplicate folds.
|
| Name: X - (X & Y) -> X & (~Y)
|   %o = and i32 %X, %Y
|   %r = sub i32 %X, %o
| =>
|   %n = xor i32 %Y, -1
|   %r = and i32 %X, %n
|
| https://rise4fun.com/Alive/kOUl
|
| See
|   https://bugs.llvm.org/show_bug.cgi?id=44448
|   https://reviews.llvm.org/D71499
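The generalized fold can be sanity-checked outside the compiler. The following is a minimal sketch, not LLVM code: it models wrapping integer arithmetic at an assumed width of 8 bits and compares both sides of `A - (A & B)` -> `A & (~B)` for every input pair. The helper names are hypothetical.

```python
# Exhaustive check of the fold  A - (A & B)  ->  A & ~B  on 8-bit values.
# BITS is an illustrative choice; the identity does not depend on the width.
BITS = 8
MASK = (1 << BITS) - 1


def sub_of_and(a: int, b: int) -> int:
    """Left-hand side: A - (A & B), wrapped to BITS bits."""
    return (a - (a & b)) & MASK


def and_of_not(a: int, b: int) -> int:
    """Right-hand side: A & ~B, with ~B truncated to BITS bits."""
    return a & (~b & MASK)


def fold_holds() -> bool:
    """Compare both sides for every (A, B) pair at this width."""
    return all(
        sub_of_and(a, b) == and_of_not(a, b)
        for a in range(MASK + 1)
        for b in range(MASK + 1)
    )
```

The identity holds because `A & B` only contains bits that are set in `A`, so subtracting it never borrows and simply clears those bits.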
* [DAGCombiner] `~(add X, -1)` -> `neg X` foldRoman Lebedev2020-01-031-0/+7
| The fold 'A - (A & (B - 1))' -> 'A & (0 - B)' added in 8dab0a4a7d691f2704f1079538e0ef29548db159 is too specific. It should just be 'A - (A & B)' -> 'A & (~B)', but we currently fail to sink that '~' into `(B - 1)`.
|
| Name: ~(X - 1) -> (0 - X)
|   %o = add i32 %X, -1
|   %r = xor i32 %o, -1
| =>
|   %r = sub i32 0, %X
|
| https://rise4fun.com/Alive/rjU
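As a quick illustration of why `~(X - 1)` equals `0 - X` in two's complement, here is a small sketch that checks the identity exhaustively at an assumed width of 8 bits. It is a model of the arithmetic only; the function names are made up for this example.

```python
# Exhaustive check of  ~(X - 1)  ->  0 - X  on 8-bit two's-complement values.
WIDTH = 8
WRAP = (1 << WIDTH) - 1


def not_of_dec(x: int) -> int:
    """Left-hand side: ~(X + -1), wrapped to WIDTH bits."""
    return ~((x - 1) & WRAP) & WRAP


def neg(x: int) -> int:
    """Right-hand side: 0 - X, wrapped to WIDTH bits."""
    return (0 - x) & WRAP


def neg_fold_holds() -> bool:
    """Compare both sides for every X at this width."""
    return all(not_of_dec(x) == neg(x) for x in range(WRAP + 1))
```

This follows from `~Y == -Y - 1`: substituting `Y = X - 1` gives `-(X - 1) - 1 == -X`.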
* [DAGCombine][X86][Thumb2/LowOverheadLoops] `A - (A & C)` -> `A & (~C)` fold ↵Roman Lebedev2020-01-031-0/+10
| (PR44448)
|
| While we do manage to fold integer-typed IR in middle-end, we can't do that for the main motivational case of pointers.
|
| There is @llvm.ptrmask() intrinsic which may or may not be helpful, but I'm not sure it is fully considered canonical yet, not everything is fully aware of it likely.
|
| Name: PR44448 ptr - (ptr & C) -> ptr & (~C)
|   %bias = and i32 %ptr, C
|   %r = sub i32 %ptr, %bias
| =>
|   %r = and i32 %ptr, ~C
|
| See
|   https://bugs.llvm.org/show_bug.cgi?id=44448
|   https://reviews.llvm.org/D71499
* [NFC][DAGCombine] Clarify comment for 'A - (A & (B - 1))' foldRoman Lebedev2020-01-031-1/+1
|
* Fix typo "psuedo" in commentsJay Foad2020-01-031-1/+1
|
* [DAGCombine][X86][AArch64] 'A - (A & (B - 1))' -> 'A & (0 - B)' fold (PR44448)Roman Lebedev2020-01-031-0/+15
| While we do manage to fold integer-typed IR in middle-end, we can't do that for the main motivational case of pointers.
|
| There is @llvm.ptrmask() intrinsic which may or may not be helpful, but I'm not sure it is fully considered canonical yet, not everything is fully aware of it likely.
|
| https://rise4fun.com/Alive/ZVdp
|
| Name: ptr - (ptr & (alignment-1)) -> ptr & (0 - alignment)
|   %mask = add i64 %alignment, -1
|   %bias = and i64 %ptr, %mask
|   %r = sub i64 %ptr, %bias
| =>
|   %highbitmask = sub i64 0, %alignment
|   %r = and i64 %ptr, %highbitmask
|
| See
|   https://bugs.llvm.org/show_bug.cgi?id=44448
|   https://reviews.llvm.org/D71499
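The motivating pattern is the usual "align a pointer down" idiom. The sketch below models it on 64-bit integers rather than real pointers and assumes a power-of-two alignment, which is the case where the result is an aligned address. Both function names are hypothetical; they correspond to the source pattern and the folded pattern from the proof above.

```python
# Model of  ptr - (ptr & (alignment - 1))  ->  ptr & (0 - alignment)
# on 64-bit unsigned values.
MASK64 = (1 << 64) - 1


def align_down_sub(ptr: int, alignment: int) -> int:
    """Source pattern: subtract the low 'bias' bits from the pointer."""
    mask = (alignment - 1) & MASK64
    bias = ptr & mask
    return (ptr - bias) & MASK64


def align_down_and(ptr: int, alignment: int) -> int:
    """Folded pattern: AND with the negated alignment (the high-bit mask)."""
    high_bit_mask = (0 - alignment) & MASK64
    return ptr & high_bit_mask
```

For example, with an alignment of 16 both forms map `0x1237` to `0x1230`; the folded form needs one fewer operation.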
* [DAGCombine] Initialize the default operation action for SIGN_EXTEND_INREG ↵QingShan Zhang2020-01-031-0/+1
| for vector type as 'expand' instead of 'legal'
|
| Until now, we did not set the default operation action for SIGN_EXTEND_INREG for vector types, so it was 0 by default, which means legal. However, most targets do not have native instructions to support this opcode. It should be set to expand by default, as we did for ANY_EXTEND_VECTOR_INREG.
|
| Differential Revision: https://reviews.llvm.org/D70000
* DAG: Use TargetConstant for FENCE operandsMatt Arsenault2020-01-021-4/+4
|
* [SelectionDAG] Simplify SelectionDAGBuilder::visitInlineAsmFangrui Song2020-01-021-3/+1
|
* [FPEnv] Default NoFPExcept SDNodeFlag to falseUlrich Weigand2020-01-025-10/+49
| The NoFPExcept bit in SDNodeFlags currently defaults to true, unlike all other such flags. This is a problem, because it implies that all code that transforms SDNodes without copying flags can introduce a correctness bug, not just a missed optimization.
|
| This patch changes the default to false. This makes it necessary to move setting the (No)FPExcept flag for constrained intrinsics from the visitConstrainedIntrinsic routine to the generic visit routine at the place where the other flags are set, or else the intersectFlagsWith call would erase the NoFPExcept flag again.
|
| In order to avoid making non-strict FP code worse, whenever SelectionDAGISel::SelectCodeCommon matches on a set of original nodes none of which can raise FP exceptions, it will preserve this property on all result nodes generated, by setting the NoFPExcept flag on those result nodes that would otherwise be considered as raising an FP exception.
|
| To check whether or not an SD node should be considered as raising an FP exception, the following logic applies:
|
| - For machine nodes, check the mayRaiseFPException property of the underlying MI instruction
| - For regular nodes, check isStrictFPOpcode
| - For target nodes, check a newly introduced isTargetStrictFPOpcode
|
| The latter is implemented by reserving a range of target opcodes, similarly to how memory opcodes are identified. (Note that there is a bit of a quirk in identifying target nodes that are both memory nodes and strict FP nodes. To simplify the logic, right now all target memory nodes are automatically also considered strict FP nodes -- this could be fixed by adding one more range.)
|
| Reviewed By: craig.topper
|
| Differential Revision: https://reviews.llvm.org/D71841
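The reasoning about the default value can be illustrated with a toy model. This is not the SDNodeFlags API: the `Flags` class and both helpers below are hypothetical stand-ins, assumed only for illustration. The sketch shows that a transform which rebuilds a node without copying flags silently makes an unsafe claim when the default is true, and stays conservative when the default is false.

```python
from dataclasses import dataclass


@dataclass
class Flags:
    """Toy stand-in for node flags; only the one bit discussed above."""
    no_fp_except: bool


def rebuild_without_copying_flags(default: bool) -> Flags:
    """Models a transform that creates a new node and forgets the old flags,
    so the new node only carries the default value of the bit."""
    return Flags(no_fp_except=default)


def intersect(a: Flags, b: Flags) -> Flags:
    """Models flag intersection: the bit survives only if both sides have it."""
    return Flags(no_fp_except=a.no_fp_except and b.no_fp_except)


# A node that may raise an FP exception.
may_raise = Flags(no_fp_except=False)

# Old default (true): the rebuilt node claims "cannot raise", which is wrong.
unsafe = rebuild_without_copying_flags(default=True)

# New default (false): the rebuilt node is merely conservative.
conservative = rebuild_without_copying_flags(default=False)
```

With the new default, forgetting to copy flags can at worst lose an optimization opportunity, which matches how the other flags already behave.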
* [NFC] Add explicit instantiation to releaseNodeQiu Chaofan2020-01-021-0/+5
| | | | | | Resolve a build failure about undefined symbols introduced by f9f78cf. Differential Revision: https://reviews.llvm.org/D72069
* [RegisterClassInfo] Use SmallVector::assign instead of resize to make sure ↵Craig Topper2020-01-011-1/+1
| we erase previous contents from all entries of the vector.
|
| resize only writes to elements that get added. Any elements that already existed maintain their previous value. In this case we're trying to erase cached information so we should use assign which will write to every element.
|
| Found while trying to add new tests to an existing X86 test and noticed register allocation changing in other functions.
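The difference between the two operations can be modelled on a plain list. The helpers below are hypothetical Python stand-ins that mimic the behaviour described in the message (resize only writes newly added elements, assign writes every element); they are not bindings to SmallVector.

```python
def resize(vec: list, n: int, value) -> None:
    """Mimics resize(n, value): existing elements keep their old contents,
    only elements added past the old size are set to value."""
    if n < len(vec):
        del vec[n:]
    else:
        vec.extend([value] * (n - len(vec)))


def assign(vec: list, n: int, value) -> None:
    """Mimics assign(n, value): every element is overwritten with value."""
    vec[:] = [value] * n


# Stale cached entries survive a resize but not an assign.
cache_a = [7, 7]
resize(cache_a, 3, 0)

cache_b = [7, 7]
assign(cache_b, 3, 0)
```

After these calls `cache_a` still holds the two stale entries followed by one fresh zero, while `cache_b` holds only zeros, which is the behaviour the fix relies on.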
* [MachineScheduler] improve reuse of 'releaseNode'methodLorenzo Casalino2020-01-011-17/+21
| The 'SchedBoundary::releaseNode' is merely invoked for releasing the Top/Bottom root nodes. However, 'SchedBoundary::releasePending' uses its same logic to check if the Pending queue has any releasable SUnit. It is possible to slightly modify the body of the two, allowing re-use of the former ('releaseNode') in the latter.
|
| Patch by Lorenzo Casalino <lorenzo.casalino93@gmail.com>
|
| Reviewers: MatzeB, fhahn, atrick
|
| Reviewed By: fhahn
|
| Differential Revision: https://reviews.llvm.org/D65506
* [NFC] Fixes -Wrange-loop-analysis warningsMark de Wever2020-01-015-10/+10
| This avoids new warnings now that D68912 adds -Wrange-loop-analysis to -Wall.
|
| Differential Revision: https://reviews.llvm.org/D71857
* DAG: Stop trying to fold FP -(x-y) -> y-x in getNode with nszMatt Arsenault2019-12-312-5/+10
| | | | | | | | | | | | | | This was increasing the number of instructions when fsub was legalized on AMDGPU with no signed zeros enabled. This fold should be guarded by hasOneUse, and I don't think getNode should be doing that. The same fold is already done as a regular combine through isNegatibleForFree. This does require duplicating, even though isNegatibleForFree does this combine already (and properly checks hasOneUse) to avoid one PPC regression. In the regression, the outer fneg has nsz but the fsub operand does not. isNegatibleForFree only sees the operand, and doesn't see it's used from a nsz context. A nsz parameter needs to be added and threaded through isNegatibleForFree to avoid this.
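The reason `-(x - y)` -> `y - x` needs the no-signed-zeros (nsz) assumption at all can be seen with ordinary IEEE-754 doubles. The sketch below is only an illustration of the floating-point behaviour, not of the DAG code; the function names are made up.

```python
import math


def neg_of_sub(x: float, y: float) -> float:
    """Original expression: -(x - y)."""
    return -(x - y)


def swapped_sub(x: float, y: float) -> float:
    """Folded expression: y - x."""
    return y - x


def sign_of(v: float) -> float:
    """Returns 1.0 or -1.0 depending on the sign bit, including for zeros."""
    return math.copysign(1.0, v)
```

When `x == y` the original expression yields `-0.0` and the folded one yields `+0.0`. The two compare equal but differ in sign bit, so the fold is only valid when signed zeros may be ignored.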
* [LegalizeVectorOps][AArch64] Stop asking for v4f16 fp_round and fp_extend to ↵Craig Topper2019-12-311-0/+5
| be promoted.
|
| These operations are needed as building blocks for promoting so they can't be promoted themselves.
|
| This appeared to work because the fp_extend query type for operation actions is the result type, not the input type so it never triggered in the legalizer.
|
| For fp_round, the vector op legalizer just ended up creating a nop fp_extend that was elided by getNode, followed by a nop fp_round that was also elided by getNode. This was followed by a final fp_round from v4f32 back to v4f16 which was CSEd to the original node. Then legalize vector ops just believed that node legalized to itself. LegalizeDAG took another crack at promoting it, but didn't have a handler so just skipped it with a debug message saying it wasn't promoted.
|
| This patch just removes the operation actions to avoid this nonsense.
|
| Found while trying to refactor LegalizeVectorOps to handle multiple result nodes better.
* [ARM][TypePromotion] Re-enable by defaultSam Parker2019-12-311-1/+1
| | | | Re-enable the pass after it was reverted and the bug fixed.
* [TargetLowering][AMDGPU] Make scalarizeVectorLoad return a pair of SDValues ↵Craig Topper2019-12-302-16/+6
| | | | | | | | | | | instead of creating a MERGE_VALUES node. NFCI This allows us to clean up some places that were peeking through the MERGE_VALUES node after the call. By returning the SDValues directly, we can clean that up. Unfortunately, there are several call sites in AMDGPU that wanted the MERGE_VALUES and now need to create their own.
* Ignore "no-frame-pointer-elim" and "no-frame-pointer-elim-non-leaf" in favor ↵Fangrui Song2019-12-301-13/+1
| of "frame-pointer"
|
| D56351 (included in LLVM 8.0.0) introduced "frame-pointer". All tests which use "no-frame-pointer-elim" or "no-frame-pointer-elim-non-leaf" have been migrated to use "frame-pointer".
|
| Implement UpgradeFramePointerAttributes to upgrade the two obsoleted function attributes for bitcode. Their semantics are ignored.
|
| Differential Revision: https://reviews.llvm.org/D71863
* [MIPS GlobalISel] Select bitreverse. RecommitPetar Avramovic2019-12-301-1/+46
| G_BITREVERSE is generated from llvm.bitreverse.<type> intrinsics; clang generates these intrinsics from __builtin_bitreverse32 and __builtin_bitreverse64.
|
| Add lower and narrowscalar for G_BITREVERSE. Lower G_BITREVERSE on MIPS32.
|
| Recommit notes: Introduce temporary variables in order to make sure instructions get inserted into MachineFunction in same order regardless of compiler used to build llvm.
|
| Differential Revision: https://reviews.llvm.org/D71363
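For readers unfamiliar with what "lowering" a bit reverse involves, one well-known approach is a byte swap followed by three mask-and-shift rounds that swap nibbles, bit pairs, and single bits. The sketch below shows that general technique on 32-bit values. It is an assumption for illustration and not necessarily the exact instruction sequence this patch emits; `bswap32` and `bitreverse32` are hypothetical names.

```python
def bswap32(x: int) -> int:
    """Reverse the byte order of a 32-bit value."""
    return (
        ((x & 0x000000FF) << 24)
        | ((x & 0x0000FF00) << 8)
        | ((x >> 8) & 0x0000FF00)
        | ((x >> 24) & 0x000000FF)
    )


def bitreverse32(x: int) -> int:
    """Reverse all 32 bits: swap bytes, then nibbles, bit pairs, and bits."""
    x = bswap32(x & 0xFFFFFFFF)
    x = ((x & 0xF0F0F0F0) >> 4) | ((x & 0x0F0F0F0F) << 4)  # swap nibbles
    x = ((x & 0xCCCCCCCC) >> 2) | ((x & 0x33333333) << 2)  # swap bit pairs
    x = ((x & 0xAAAAAAAA) >> 1) | ((x & 0x55555555) << 1)  # swap adjacent bits
    return x & 0xFFFFFFFF
```

Each round only uses shifts, ANDs, and ORs, which is why this shape suits targets without a native bit-reverse instruction.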
* GlobalISel: moreElementsVector for FP min/maxMatt Arsenault2019-12-301-1/+7
|
* Revert "[MIPS GlobalISel] Select bitreverse"Dmitri Gribenko2019-12-301-45/+1
| | | | | | This reverts commit dbc136e0fe7e14c64dcb78e72321bb41af60afa4. It broke buildbots: http://lab.llvm.org:8011/builders/clang-x86_64-debian-fast/builds/21066
* [MIPS GlobalISel] Select bitreversePetar Avramovic2019-12-301-1/+45
| G_BITREVERSE is generated from llvm.bitreverse.<type> intrinsics; clang generates these intrinsics from __builtin_bitreverse32 and __builtin_bitreverse64.
|
| Add lower and narrowscalar for G_BITREVERSE. Lower G_BITREVERSE on MIPS32.
|
| Differential Revision: https://reviews.llvm.org/D71363
* [MIPS GlobalISel] Select bswapPetar Avramovic2019-12-301-0/+58
| G_BSWAP is generated from llvm.bswap.<type> intrinsics; clang generates these intrinsics from __builtin_bswap32 and __builtin_bswap64.
|
| Add lower and narrowscalar for G_BSWAP. Lower G_BSWAP on MIPS32, select G_BSWAP on MIPS32 revision 2 and later.
|
| Differential Revision: https://reviews.llvm.org/D71362
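When a target has no byte-swap instruction, a byte swap can be expressed with shifts and masks. The following sketch shows that generic expansion for a 32-bit value; it illustrates the idea of lowering only and is not claimed to be the sequence chosen by this patch. The function name is hypothetical.

```python
def bswap32_lowered(x: int) -> int:
    """Byte swap built only from shifts, ANDs, and ORs on a 32-bit value."""
    x &= 0xFFFFFFFF
    byte0_to_3 = (x & 0x000000FF) << 24   # lowest byte moves to the top
    byte1_to_2 = (x & 0x0000FF00) << 8    # second byte moves up one slot
    byte2_to_1 = (x >> 8) & 0x0000FF00    # third byte moves down one slot
    byte3_to_0 = (x >> 24) & 0x000000FF   # top byte moves to the bottom
    return byte0_to_3 | byte1_to_2 | byte2_to_1 | byte3_to_0
```

Applying the function twice returns the original value, which makes for an easy self-check.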
* [MCP] Add stats for backward copy propagation. NFC.Kai Luo2019-12-301-1/+5
|
* [SelectionDAG] Simplify SelectionDAGBuilder::visitInlineAsmFangrui Song2019-12-291-11/+3
| | | | | | Indirect C_Immediate or C_Other constraints have been excluded. Also simplify an unneeded change to indirect 'X' by D60942.
* [SelectionDAG] Disallow indirect "i" constraintFangrui Song2019-12-291-0/+6
| | | | | | | | | This allows us to delete InlineAsm::Constraint_i workarounds in SelectionDAGISel::SelectInlineAsmMemoryOperand overrides and TargetLowering::getInlineAsmMemConstraint overrides. They were introduced to X86 in r237517 to prevent crashes for constraints like "=*imr". They were later copied to other targets.
* SimplifyDemandedBits - Remove duplicate getOperand() call. NFC.Simon Pilgrim2019-12-281-9/+7
| | | | Pulled out from D56387 - cleanup variable names, move shift amount legalization inside if() of its only user and remove duplicate getOperand() call.
* [TargetLowering] Update comment to reference the correct compiler-rt ↵Craig Topper2019-12-271-1/+1
| | | | function the code is based on. NFC
* Delete setjmp_undefined_for_msvc workaround after llvm.setjmp was removedFangrui Song2019-12-272-16/+0
|
* TailDuplication: Clear NoPHIs propertyMatt Arsenault2019-12-271-0/+5
| | | | | | The early tail duplicator pass introduces new ones, so a MIR test that infers no phis since there were none on the input would fail the verifier after running.
* Delete llvm.{sig,}{setjmp,longjmp} remnant after r136821Fangrui Song2019-12-273-36/+0
| | | | | | | Intrinsic has incorrect argument type! i32 (i32*)* @llvm.setjmp *wipes tear*
* [X86][FPEnv] Promote some float strictfp operations to double on ↵Craig Topper2019-12-261-0/+31
| | | | | | | | i686-pc-windows-msvc to match what we do for non-strict. The float libcalls are inlined in MSVC's math header where they just cast to double and use the double libcall. Do the same when we emit libcalls.
* [DebugInfo][SelectionDAG] Change order while transferring SDDbgValue to ↵Kristina Bessonova2019-12-261-3/+3
| another node
|
| SelectionDAG::transferDbgValues() can 'reattach' SDDbgValue from one to another node, but doesn't change its source order. If the destination node has the order greater than the SDDbgValue, there are two possible issues revealed later:
|
| * If debug info is attached to an instruction that is the first definition of a register, this ends up with a def-after-use and the debug info gets 'undef' later.
| * If MIR has another definition of a register above the debug info, the debug info may represent a source variable incorrectly because it appears (significantly) before the instruction corresponding to this debug info.
|
| So, the patch changes the order of an SDDbgValue when it is moved to a node with greater order.
|
| Reviewers: dblaikie, jmorse, aprantl
|
| Reviewed By: aprantl
|
| Subscribers: aprantl, hiraditya, llvm-commits
|
| Tags: #llvm
|
| Differential Revision: https://reviews.llvm.org/D71175
* [X86] Enable STRICT_SINT_TO_FP/STRICT_UINT_TO_FP on X86 backendWang, Pengfei2019-12-261-3/+12
| Summary: Enable STRICT_SINT_TO_FP/STRICT_UINT_TO_FP on X86 backend
|
| Reviewers: craig.topper, RKSimon, LiuChen3, uweigand, andrew.w.kaylor
|
| Subscribers: hiraditya, llvm-commits, LuoYuanke
|
| Tags: #llvm
|
| Differential Revision: https://reviews.llvm.org/D71871
* GlobalISel: Update syntax in debug printingMatt Arsenault2019-12-241-1/+1
| | | | Physical register names now start with $, not %
* GlobalISel: Fix naming variables "brank" instead of "bank"Matt Arsenault2019-12-241-7/+7
|
* [TypePromotion] Make TypeSize a class memberSam Parker2019-12-241-87/+98
| | | | | | | | | Having TypeSize as a static class variable was causing problems with multi-threading. Several static functions have now been converted into methods of TypePromotion and a few other members of TypePromotion and IRPromoter have been added or removed. Differential Revision: https://reviews.llvm.org/D71832
* DebugInfo: Correct the form of DW_AT_macro_info in .dwo files (sec_offset, ↵David Blaikie2019-12-241-1/+1
| | | | rather than data4)
* DebugInfo: Add {} to address -Wdangling-else warning.David Blaikie2019-12-241-1/+2
|