|  | Commit message (Collapse) | Author | Age | Files | Lines | 
|---|
| | 
| 
| 
| 
| 
| | We print a debug message when most nodes are created, but getVectorShuffle was missing.
llvm-svn: 319085 | 
| | 
| 
| 
| 
| 
| | operands have already been split.
llvm-svn: 319010 | 
| | 
| 
| 
| | llvm-svn: 319009 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | Summary:
Currently ScalarizeVecRes_SETCC checks for the result type being a vector and jumps to ScalarizeVecRes_VSETCC. But if we're scalarizing a vector result, aren't we guaranteed to be looking at a vector type?
This patch deletes the current ScalarizeVecRes_SETCC and renames  ScalarizeVecRes_VSETCC to ScalarizeVecRes_SETCC.
Reviewers: RKSimon, arsenm, eladcohen, zvi
Reviewed By: RKSimon
Subscribers: wdng, llvm-commits
Differential Revision: https://reviews.llvm.org/D40452
llvm-svn: 318982 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | This patch reverts change to X86TargetLowering::getScalarShiftAmountTy in
rL318727 and move the logic to DAGTypeLegalizer::SplitInteger.
The reason is that getScalarShiftAmountTy returns a shift amount type that
is suitable for common use cases in CodeGen. DAGTypeLegalizer::SplitInteger
is a rare situation which requires a shift amount type larger than what
getScalarShiftAmountTy. In this case, it is more reasonable to do special
handling of shift amount type in DAGTypeLegalizer::SplitInteger only. If
similar situations arises the logic may be moved to a separate function.
Differential Revision: https://reviews.llvm.org/D40320
llvm-svn: 318890 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | Since i1 is a legal type, this:
  NumBytes = Op1->getMemoryVT().getSizeInBits() >> 3;
is wrong and should be instead
  NumBytes = Op0->getMemoryVT().getStoreSize();
There seems to be more places where this should be fixed outside DAGCombiner.
Review: Hal Finkel
https://bugs.llvm.org/show_bug.cgi?id=35366
llvm-svn: 318824 | 
| | 
| 
| 
| 
| 
| 
| 
| | than result 0.
I plan to use this to check the type of the mask result of masked gathers in the X86 backend.
llvm-svn: 318820 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | DAGTypeLegalizer::SplitInteger uses default pointer size as shift amount constant type,
which causes less performant ISA in amdgcn---amdgiz target since the default pointer
type is i64 whereas the desired shift amount type is i32.
This patch fixes that by using TLI.getScalarShiftAmountTy in DAGTypeLegalizer::SplitInteger.
The X86 change is necessary since splitting i512 requires shifting amount of 256, which
cannot be held by i8.
Differential Revision: https://reviews.llvm.org/D40148
llvm-svn: 318727 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | the condition to the SetCC type for the final result type not the original type.
Normally this would be cleaned up by promoting the condition operand next. But in the attached case we promoted the result from v2i48 to v2i64 and the condition from v2i1 to v2i48. Then we tried to "promote" the v2i48 condition back to v2i1 because that's what the SetCC result type for v2i64 is on X86 with VLX. But promote is either a NOP or SIGN_EXTEND and this would need a truncation.
With the change here we now get the SetCC type of v2i1 when we're handling the result promotion and the operand no longer needs to be promoted itself.
Fixes PR35272.
llvm-svn: 318706 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| | non-deterministic ordering"
This broke the bots. Reverting this until I can fix the failures.
This reverts commit 5a3db2856d12a3c4b400f487d39f8f05989e79f0.
llvm-svn: 318686 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | ordering
Summary:
This fixes failures in the following tests uncovered by D39245:
        LLVM :: CodeGen/ARM/ifcvt3.ll
        LLVM :: CodeGen/ARM/switch-minsize.ll
        LLVM :: CodeGen/X86/switch.ll
Reviewers: hans, efriedma
Reviewed By: hans
Subscribers: fhahn, aemerson, kristof.beyls, llvm-commits
Differential Revision: https://reviews.llvm.org/D39995
llvm-svn: 318680 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| | handle nodes with chain outputs.
Previously we were assuming all results were vectors and calling SetWidenedVector, but if its a chain result we should just replace uses instead.
This fixes an error found by expensive checks after r318368.
llvm-svn: 318509 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | TransferDbgValues (capital 'T') is wired into ReplaceAllUsesWith, and
transferDbgValues (lowercase 't') is used elsewhere (e.g in Legalize).
Both functions should be doing the exact same thing. This patch
consolidates the logic into one place.
This was reverted in r318455 because some newly introduced asserts,
which I thought were NFC, were firing. I filed PR35338. For now I've
weakened the asserts.
Testing: check-llvm, check-clang, and a stage2 Rel+Deb build of clang
Differential Revision: https://reviews.llvm.org/D40104
llvm-svn: 318498 | 
| | 
| 
| 
| 
| 
| 
| 
| | All these headers already depend on CodeGen headers so moving them into
CodeGen fixes the layering (since CodeGen depends on Target, not the
other way around).
llvm-svn: 318490 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | SelectionDAGBuilder.
The sign extend might be from an i16 or i8 type and was inserted by InstCombine to match the pointer width. X86 gather legalization isn't currently detecting this to reinsert a sign extend to make things legal.
It's a bit weird for the SelectionDAGBuilder to do this kind of optimization in the first place. With this removed we can at least lean on InstCombine somewhat to ensure the index is i32 or i64.
I'll work on trying to recover some of the test cases by removing sign extends in the backend when its safe to do so with an understanding of the current legalizer capabilities.
This should fix PR30690.
llvm-svn: 318466 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| | This reverts commit r318448. It looks like some of the asserts need to
be weakened.
http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-autoconf/builds/16296
llvm-svn: 318455 | 
| | 
| 
| 
| | llvm-svn: 318450 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | TransferDbgValues (capital 'T') is wired into ReplaceAllUsesWith, and
transferDbgValues (lowercase 't') is used elsewhere (e.g in Legalize).
Both functions should be doing the exact same thing. This patch
consolidates the logic into one place.
Differential Revision: https://reviews.llvm.org/D40104
llvm-svn: 318448 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | SelectionDAGBuilder::visitAlloca assumes alloca address space is 0, which is
incorrect for triple amdgcn---amdgiz and causes isel failure.
This patch fixes that.
Differential Revision: https://reviews.llvm.org/D40095
llvm-svn: 318392 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | Change the calculation for the desired ValueType for non-sign
extending loads, as in those cases we don't care about the
higher bits. This creates a smaller ExtVT and allows for such
combinations as:
(srl (zextload i16, [addr]), 8) -> (zextload i8, [addr + 1])
Differential Revision: https://reviews.llvm.org/D40034
llvm-svn: 318390 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| | code that can be reached if targets don't configure things correctly.
For example, this is currently reachable by X86 if you use a masked store intrinsic with a v1iX type.
Using a fatal error seems like a better user experience if someone were to encounter this on a release build. There are several other similar places that have been converted from unreachable to fatal error previously.
llvm-svn: 318379 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | processDbgDeclares assumes pointer size is the same for different addr spaces.
It uses pointer size for addr space 0 for all pointers, which causes assertion
in stripAndAccumulateInBoundsConstantOffsets for amdgcn---amdgiz since
pointer in addr space 5 has different size than in addr space 0.
This patch fixes that.
Differential Revision: https://reviews.llvm.org/D40085
llvm-svn: 318370 | 
| | 
| 
| 
| 
| 
| 
| | Due to integer precision, we might have numerator greater than denominator in
the branch probability scaling. Add a check to prevent this from happening.
llvm-svn: 318353 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| | This patch peels off the top case in switch statement into a branch if the
probability exceeds a threshold. This will help the branch prediction and
avoids the extra compares when lowering into chain of branches.
Differential Revision: http://reviews.llvm.org/D39262
llvm-svn: 318202 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | TargetLowering::LowerCallTo assumes that sret value type corresponds to a
pointer in default address space, which is incorrect, since sret value type
should correspond to a pointer in alloca address space, which may not
be the default address space. This causes assertion for amdgcn target
in amdgiz environment.
This patch fixes that.
Differential Revision: https://reviews.llvm.org/D39996
llvm-svn: 318167 | 
| | 
| 
| 
| 
| 
| 
| 
| | when transferring debug info describing the lower bits of an extended SDNode.
rdar://problem/35504722
llvm-svn: 318086 | 
| | 
| 
| 
| 
| 
| 
| 
| | middle GEP indices are non-constant.
This is a fix for a bug in r317947. We were supposed to check that all the indices are are constant 0, but instead we're only make sure that indices that are constant are 0. Non-constant indices are being ignored.
llvm-svn: 317950 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | handling to accept GEPs with more than 2 operands if the middle operands are all 0s
Currently we can only get a uniform base from a simple GEP with 2 operands. This causes us to miss address folding opportunities for simple global array accesses as the test case shows.
This patch adds support for larger GEPs if the other indices are 0 since those don't require any additional computations to be inserted.
We may also want to handle constant splats of zero here, but I'm leaving that for future work when I have a real world example.
Differential Revision: https://reviews.llvm.org/D39911
llvm-svn: 317947 | 
| | 
| 
| 
| 
| 
| | folding, this trigger needless extra rounds of combine for nothing. NFC
llvm-svn: 317926 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | Summary: Fixes PR35220
Reviewers: vadimcn, alexcrichton
Reviewed By: alexcrichton
Subscribers: pepyakin, alexcrichton, jfb, dschuff, sbc100, jgravelle-google, llvm-commits, aheejin
Differential Revision: https://reviews.llvm.org/D39866
llvm-svn: 317895 | 
| | 
| 
| 
| 
| 
| | rdar://problem/27139077
llvm-svn: 317825 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | This patch implements Chandler's idea [0] for supporting languages that
require support for infinite loops with side effects, such as Rust, providing
part of a solution to bug 965 [1].
Specifically, it adds an `llvm.sideeffect()` intrinsic, which has no actual
effect, but which appears to optimization passes to have obscure side effects,
such that they don't optimize away loops containing it. It also teaches
several optimization passes to ignore this intrinsic, so that it doesn't
significantly impact optimization in most cases.
As discussed on llvm-dev [2], this patch is the first of two major parts.
The second part, to change LLVM's semantics to have defined behavior
on infinite loops by default, with a function attribute for opting into
potential-undefined-behavior, will be implemented and posted for review in
a separate patch.
[0] http://lists.llvm.org/pipermail/llvm-dev/2015-July/088103.html
[1] https://bugs.llvm.org/show_bug.cgi?id=965
[2] http://lists.llvm.org/pipermail/llvm-dev/2017-October/118632.html
Differential Revision: https://reviews.llvm.org/D38336
llvm-svn: 317729 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | In 2010 a commit with no testcase and no further explanation
explicitly disabled the handling of inlined variables in
EmitFuncArgumentDbgValue(). I don't think there is a good reason for
this any more and re-enabling this adds debug locations for variables
associated with an LLVM function argument in functions that are
inlined into the first basic block. The only downside of doing this is
that we may insert a DBG_VALUE before the inlined scope, but (1) this
could be filtered out later, and (2) LiveDebugValues will not
propagate it into subsequent basic blocks if they don't dominate the
variable's lexical scope, so this seems like a small price to pay.
rdar://problem/26228128
llvm-svn: 317702 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | Some of the AMDGPU stack addressing modes require knowing the sign
bit is zero. We used to accomplish this by custom lowering
frame indexes, and then putting an AssertZext around a
TargetFrameIndex. This required specifically looking for
the AssextZext + frame index pattern which was moderately
disgusting. The same could probably be accomplished
with a target specific node, but would still
require special handling of frame indexes.
llvm-svn: 317671 | 
| | 
| 
| 
| 
| 
| 
| 
| | This header includes CodeGen headers, and is not, itself, included by
any Target headers, so move it into CodeGen to match the layering of its
implementation.
llvm-svn: 317647 | 
| | 
| 
| 
| | llvm-svn: 317588 | 
| | 
| 
| 
| 
| 
| 
| | We can't safely split arithmetic into multiple fragments because we
can't express carry-over between fragments.
llvm-svn: 317534 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | fast-math-flag
As discussed on llvm-dev:
http://lists.llvm.org/pipermail/llvm-dev/2016-November/107104.html
and again more recently:
http://lists.llvm.org/pipermail/llvm-dev/2017-October/118118.html
...this is a step in cleaning up our fast-math-flags implementation in IR to better match
the capabilities of both clang's user-visible flags and the backend's flags for SDNode.
As proposed in the above threads, we're replacing the 'UnsafeAlgebra' bit (which had the 
'umbrella' meaning that all flags are set) with a new bit that only applies to algebraic 
reassociation - 'AllowReassoc'.
We're also adding a bit to allow approximations for library functions called 'ApproxFunc' 
(this was initially proposed as 'libm' or similar).
...and we're out of bits. 7 bits ought to be enough for anyone, right? :) FWIW, I did 
look at getting this out of SubclassOptionalData via SubclassData (spacious 16-bits), 
but that's apparently already used for other purposes. Also, I don't think we can just 
add a field to FPMathOperator because Operator is not intended to be instantiated. 
We'll defer movement of FMF to another day.
We keep the 'fast' keyword. I thought about removing that, but seeing IR like this:
%f.fast = fadd reassoc nnan ninf nsz arcp contract afn float %op1, %op2
...made me think we want to keep the shortcut synonym.
Finally, this change is binary incompatible with existing IR as seen in the 
compatibility tests. This statement:
"Newer releases can ignore features from older releases, but they cannot miscompile 
them. For example, if nsw is ever replaced with something else, dropping it would be 
a valid way to upgrade the IR." 
( http://llvm.org/docs/DeveloperPolicy.html#ir-backwards-compatibility )
...provides the flexibility we want to make this change without requiring a new IR 
version. Ie, we're not loosening the FP strictness of existing IR. At worst, we will 
fail to optimize some previously 'fast' code because it's no longer recognized as 
'fast'. This should get fixed as we audit/squash all of the uses of 'isFast()'.
Note: an inter-dependent clang commit to use the new API name should closely follow 
commit.
Differential Revision: https://reviews.llvm.org/D39304
llvm-svn: 317488 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | This header already includes a CodeGen header and is implemented in
lib/CodeGen, so move the header there to match.
This fixes a link error with modular codegeneration builds - where a
header and its implementation are circularly dependent and so need to be
in the same library, not split between two like this.
llvm-svn: 317379 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| | When splitting a large load to smaller legally-typed loads, the last load should be padded to reach the size of the previous one so a CONCAT_VECTORS node could reunite them again.
The code currently pads the last load to reach the size of the first load (instead of the previous).
Differential Revision: https://reviews.llvm.org/D38495
Change-Id: Ib60b55ed26ce901fabf68108daf52683fbd5013f
llvm-svn: 317206 | 
| | 
| 
| 
| 
| 
| | input. NFCI.
llvm-svn: 317087 | 
| | 
| 
| 
| | llvm-svn: 317072 | 
| | 
| 
| 
| | llvm-svn: 316964 | 
| | 
| 
| 
| 
| 
| | We don't need to extend/truncate the Known structure before calling computeKnownBits - it will reset at the start of the function.
llvm-svn: 316962 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | intrinsic. (NFC)
Summary:
For reference, see: http://lists.llvm.org/pipermail/llvm-dev/2017-August/116589.html
This patch fleshes out the instruction class hierarchy with respect to atomic and
non-atomic memory intrinsics. With this change, the relevant part of the class
hierarchy becomes:
IntrinsicInst
  -> MemIntrinsicBase (methods-only class)
    -> MemIntrinsic (non-atomic intrinsics)
      -> MemSetInst
      -> MemTransferInst
        -> MemCpyInst
        -> MemMoveInst
    -> AtomicMemIntrinsic (atomic intrinsics)
      -> AtomicMemSetInst
      -> AtomicMemTransferInst
        -> AtomicMemCpyInst
        -> AtomicMemMoveInst
    -> AnyMemIntrinsic (both atomicities)
      -> AnyMemSetInst
      -> AnyMemTransferInst
        -> AnyMemCpyInst
        -> AnyMemMoveInst
This involves some class renaming:
    ElementUnorderedAtomicMemCpyInst -> AtomicMemCpyInst
    ElementUnorderedAtomicMemMoveInst -> AtomicMemMoveInst
    ElementUnorderedAtomicMemSetInst -> AtomicMemSetInst
A script for doing this renaming in downstream trees is included below.
An example of where the Any* classes should be used in LLVM is when reasoning
about the effects of an instruction (ex: aliasing).
---
Script for renaming AtomicMem* classes:
PREFIXES="[<,([:space:]]"
CLASSES="MemIntrinsic|MemTransferInst|MemSetInst|MemMoveInst|MemCpyInst"
SUFFIXES="[;)>,[:space:]]"
REGEX="(${PREFIXES})ElementUnorderedAtomic(${CLASSES})(${SUFFIXES})"
REGEX2="visitElementUnorderedAtomic(${CLASSES})"
FILES=$( grep -E "(${REGEX}|${REGEX2})" -r . | tr ':' ' ' | awk '{print $1}' | sort | uniq )
SED_SCRIPT="s~${REGEX}~\1Atomic\2\3~g"
SED_SCRIPT2="s~${REGEX2}~visitAtomic\1~g"
for f in $FILES; do
    echo "Processing: $f"
    sed  -i ".bak" -E "${SED_SCRIPT};${SED_SCRIPT2};${EA_SED_SCRIPT};${EA_SED_SCRIPT2}" $f
done
Reviewers: sanjoy, deadalnix, apilipenko, anna, skatkov, mkazantsev
Reviewed By: sanjoy
Subscribers: hfinkel, jholewinski, arsenm, sdardis, nhaehnle, JDevlieghere, javed.absar, llvm-commits
Differential Revision: https://reviews.llvm.org/D38419
llvm-svn: 316950 | 
| | 
| 
| 
| | llvm-svn: 316947 | 
| | 
| 
| 
| | llvm-svn: 316944 | 
| | 
| 
| 
| | llvm-svn: 316933 | 
| | 
| 
| 
| | llvm-svn: 316875 | 
| | 
| 
| 
| 
| 
| | Introduce a isConstOrDemandedConstSplat helper function that can recognise a constant splat build vector for at least the demanded elts we care about.
llvm-svn: 316866 |