| Commit message | Author | Age | Files | Lines |
------------------------------------------------------------------------
r339536 | ctopper | 2018-08-13 08:53:49 +0200 (Mon, 13 Aug 2018) | 3 lines
[SelectionDAG] In PromoteFloatOp_BITCAST, insert a bitcast after the fp_to_fp16 in case the result type isn't a scalar integer.
This is another variation of PR38533. In this case, the result type of the bitcast is legal and 16 bits wide, but not a scalar integer. So we need to emit the convert to i16 and then bitcast it to the true result type. This new bitcast will be further type legalized if necessary.
------------------------------------------------------------------------
llvm-svn: 339857
------------------------------------------------------------------------
r339535 | ctopper | 2018-08-13 08:53:47 +0200 (Mon, 13 Aug 2018) | 5 lines
[SelectionDAG] In PromoteIntRes_BITCAST, when the input is TypePromoteFloat, make sure the output type is scalar. For vectors, use a store and load of a temporary.
Previously if the result type was a vector, we emitted a FP_TO_FP16 with a vector result type which isn't valid.
This is basically the opposite case of the root cause of PR38533.
------------------------------------------------------------------------
llvm-svn: 339856
------------------------------------------------------------------------
r339533 | ctopper | 2018-08-13 07:26:49 +0200 (Mon, 13 Aug 2018) | 5 lines
[SelectionDAG] In PromoteFloatRes_BITCAST, insert a bitcast before the fp16_to_fp in case the input type isn't an i16.
The bitcast can be further legalized as needed.
Fixes PR38533.
------------------------------------------------------------------------
llvm-svn: 339855
------------------------------------------------------------------------
r339600 | scott.linder | 2018-08-13 20:44:21 +0200 (Mon, 13 Aug 2018) | 8 lines
[CodeGen] Fix assert in SelectionDAG::computeKnownBits
Fix SelectionDAG::computeKnownBits asserting when handling EXTRACT_SUBVECTOR
when zero extending the demanded elements mask if it is already as long as the
source vector.
Differential Revision: https://reviews.llvm.org/D49574
------------------------------------------------------------------------
llvm-svn: 339664
------------------------------------------------------------------------
r339225 | thopre | 2018-08-08 11:35:26 +0200 (Wed, 08 Aug 2018) | 11 lines
Support inline asm with multiple 64bit output in 32bit GPR
Summary: Extend the fix for PR34170 to support inline assembly with multiple output operands that do not naturally go in the register class they are constrained to (e.g. a double in a 32-bit GPR, as in the PR).
Reviewers: bogner, t.p.northover, lattner, javed.absar, efriedma
Reviewed By: efriedma
Subscribers: efriedma, tra, eraman, javed.absar, llvm-commits
Differential Revision: https://reviews.llvm.org/D45437
------------------------------------------------------------------------
llvm-svn: 339539
------------------------------------------------------------------------
r338915 | ctopper | 2018-08-03 22:14:18 +0200 (Fri, 03 Aug 2018) | 5 lines
[SelectionDAG] Teach LegalizeVectorTypes to widen the mask input to a masked store.
The mask operand is visited before the data operand so we need to be able to widen it.
Fixes PR38436.
------------------------------------------------------------------------
llvm-svn: 339106
------------------------------------------------------------------------
r338665 | lliu0 | 2018-08-02 03:54:12 +0200 (Thu, 02 Aug 2018) | 11 lines
Fix FCOPYSIGN expansion
In the expansion of FCOPYSIGN, the shift node is missing when the two
operands of FCOPYSIGN are of the same size. We should always generate
a shift node (if the required shift amount is not zero) to put the sign
bit into the right position, regardless of the size of the underlying
types.
Differential Revision: https://reviews.llvm.org/D49973
------------------------------------------------------------------------
llvm-svn: 339098
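The idea of always moving the sign bit by position rather than by type size can be sketched in plain C++. This is only an illustration under stated assumptions: `copySignBit` and its parameters are hypothetical names, the values are raw bit patterns held in 64-bit integers, and none of this is the SelectionDAG code itself.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper, not LLVM code: copy the sign bit found at position
// sgnSignPos of `sgn` into position magSignPos of `mag`. Both values have
// the same storage size here, yet the sign positions may still differ, so
// the bit has to be moved based on positions, not on type sizes.
std::uint64_t copySignBit(std::uint64_t mag, unsigned magSignPos,
                          std::uint64_t sgn, unsigned sgnSignPos) {
  assert(magSignPos < 64 && sgnSignPos < 64);
  // Isolate the source sign bit as 0 or 1.
  std::uint64_t signBit = (sgn >> sgnSignPos) & 1u;
  // Clear the destination sign bit, then place the source sign there.
  std::uint64_t cleared = mag & ~(std::uint64_t{1} << magSignPos);
  return cleared | (signBit << magSignPos);
}
```

For example, `copySignBit(0x5, 63, 0x80000000, 31)` moves a sign bit from bit 31 up to bit 63 even though both operands are stored in the same width.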
Correct the address space for the inserted argument
stack slot.
AMDGPU seems to not do anything with this information,
so I don't think this was breaking anything.
llvm-svn: 338428
llvm-svn: 338382
llvm-svn: 338352
BuildSDIV/BuildUDIV/etc.
The vector contains the SDNodes that these functions create. The number of nodes is always a small number so we should use SmallVector to avoid a heap allocation.
llvm-svn: 338329
This is exchanging a sub-of-1 with add-of-minus-1:
https://rise4fun.com/Alive/plKAH
This is another step towards improving select-of-constants codegen (see D48970).
x86 is the motivating target, and those diffs all appear to be wins. PPC and AArch64 look neutral.
I've limited this to early combining (!LegalOperations) in case a target wants to reverse it, but
I think canonicalizing to 'add' is more likely to produce further transforms because we have more
folds for 'add'.
Differential Revision: https://reviews.llvm.org/D49924
llvm-svn: 338317
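A small C++ sketch of the sub-of-1 versus add-of-minus-1 identity follows. The commit title is missing above, so framing it in terms of a shifted sign bit is an assumption made for illustration; the function names are hypothetical and the code models 32-bit wrapping arithmetic, not the DAG combine.

```cpp
#include <cassert>
#include <cstdint>

// Assumed "sub" form: a logical shift yields 0 or 1, which is subtracted.
std::uint32_t subLogicalSign(std::uint32_t c, std::uint32_t x) {
  return c - (x >> 31);
}

// Assumed "add" form: an arithmetic shift yields 0 or -1 (all ones), which
// is added. The arithmetic shift is modelled by hand to stay portable.
std::uint32_t addArithSign(std::uint32_t c, std::uint32_t x) {
  std::uint32_t ashr = (x >> 31) ? 0xFFFFFFFFu : 0u;
  return c + ashr;
}
```

Both functions return the same value for every input because subtracting 1 and adding all-ones are the same operation modulo 2^32.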
BuildUDIV was already correct.
llvm-svn: 338304
BuildSDIVPow2.
llvm-svn: 338303
Thinking about it more, it might be possible for the later nodes to be folded in getNode in such a way that the other created nodes are left dead. This can cause use counts to be incorrect on nodes that aren't dead.
So it's probably safer to leave this alone.
llvm-svn: 338298
sed -Ei 's/[[:space:]]+$//' include/**/*.{def,h,td} lib/**/*.{cpp,h}
llvm-svn: 338293
rotate can be formed
Summary:
Attempt to extract a shrl from a udiv or a shl from a mul if this allows a rotate to be formed. This targets cases where the input to a rotate pattern was a mul or udiv by a constant and InstCombine merged one of the shifts with the op.
Patch by: sameconrad (Sam Conrad)
Reviewers: RKSimon, craig.topper, spatel, lebedev.ri, javed.absar
Reviewed By: lebedev.ri
Subscribers: efriedma, kparzysz, llvm-commits
Differential Revision: https://reviews.llvm.org/D47681
llvm-svn: 338270
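The situation described in the summary can be illustrated with a short C++ sketch. The constants 3, 48 and 4 are chosen here purely as an example, and the function names are hypothetical; the sketch only shows why pulling a shl back out of a mul exposes a rotate.

```cpp
#include <cassert>
#include <cstdint>

// Plain 32-bit rotate-left, written to avoid an undefined shift by 32.
std::uint32_t rotl32(std::uint32_t v, unsigned c) {
  c &= 31;
  return c == 0 ? v : (v << c) | (v >> (32 - c));
}

// Form after one shift was merged into the multiply: (x*3) << 4 became x*48.
std::uint32_t mergedForm(std::uint32_t x) {
  return (x * 48u) | ((x * 3u) >> 28);
}

// Form with the shl extracted again: x*48 == (x*3) << 4 modulo 2^32, so the
// whole expression is a rotate of x*3 by 4.
std::uint32_t rotateForm(std::uint32_t x) {
  return rotl32(x * 3u, 4);
}
```

`mergedForm` and `rotateForm` agree for all inputs because unsigned multiplication by 48 wraps the same way as multiplying by 3 and then shifting left by 4.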
This reapplies commit r338206 reverted by r338214 since the bug that
r338206 uncovered has been fixed in r338268.
Add support for inline assembly with matching input operands that do not
naturally go in the register class they are constrained to (e.g. a double
in a 32-bit GPR). Note that regular input is already handled by existing
code.
llvm-svn: 338269
The DAGCombiner has a mechanism for ensuring all nodes have been visited at least once. Every time a node is visited, it makes sure its operands have been in the worklist at least once. This ensures that when multiple nodes are created by a combine, only the last node needs to be returned. The earlier nodes can all be found through this operand check. This means we don't need to explicitly add nodes to the worklist when a combine creates multiple nodes.
I've removed the most obvious cases here. There are probably more that can be removed.
llvm-svn: 338222
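The operand check described above can be modelled with a toy worklist in C++. This is a simplified sketch, not the DAGCombiner implementation: `Node`, `visitAll` and the index-based graph are hypothetical stand-ins, and the real combiner also rewrites nodes while it iterates.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <set>
#include <vector>

// Toy node: only records the indices of its operands.
struct Node {
  std::vector<std::size_t> operands;
};

// Seed the worklist with a single node (think: the last node a combine
// returned). On each visit, enqueue every operand that has never been in
// the worklist, so earlier nodes are reached without explicit additions.
std::vector<std::size_t> visitAll(const std::vector<Node>& graph,
                                  std::size_t root) {
  std::deque<std::size_t> worklist{root};
  std::set<std::size_t> everQueued{root};
  std::vector<std::size_t> visited;
  while (!worklist.empty()) {
    std::size_t n = worklist.front();
    worklist.pop_front();
    visited.push_back(n);
    for (std::size_t op : graph[n].operands)
      if (everQueued.insert(op).second)  // true only on first insertion
        worklist.push_back(op);
  }
  return visited;
}
```

Seeding with only the final node of a three-node chain still visits all three nodes, which is the property the commit relies on.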
Example of bot failure:
http://lab.llvm.org:8011/builders/clang-cmake-armv8-quick/builds/5107/steps/ninja%20check%201/logs/FAIL%3A%20LLVM%3A%3Ainline-asm-operand-implicit-cast.ll
llvm-svn: 338214
Add support for inline assembly with matching input operands that do not
naturally go in the register class they are constrained to (e.g. a double
in a 32-bit GPR). Note that regular input is already handled by existing
code.
llvm-svn: 338206
BuildSDIV/BuildUDIV.
This removes the need for an assert to ensure the pointer isn't null.
Years ago we had ifs that checked the pointer was non-null before every access to the vector. These checks were removed and replaced with a single assert. But a reference seems more suitable here.
llvm-svn: 338205
This seems like a pretty glaring omission, and AMDGPU
wants to treat kernels differently from other calling
conventions.
llvm-svn: 338194
This can be useful since addition is commutable, and subtraction is not.
This matches a transform that is also done by InstCombine.
llvm-svn: 338181
This is a follow-up suggested in D48970.
Alive proofs:
https://rise4fun.com/Alive/sII
We can eliminate an instruction in the usual select-of-constants
to bit hack transform by adjusting the add/sub with constant.
This is always a win.
There are more transforms that are likely wins, but they may need
target hooks in case some targets do not benefit.
This is another step towards making up for canonicalizing to
select-of-constants in rL331486.
llvm-svn: 338132
llvm-svn: 338112
DAG.setRoot.
Masked loads are calling DAG.getRoot rather than calling SelectionDAGBuilder::getRoot, which means the PendingLoads weren't emptied to update the root and create any needed TokenFactor. So it would be incorrect to call setRoot for the masked load.
This patch instead adds the masked load to PendingLoads so that the root doesn't get updated until a store, scatter, or similar operation happens. Alternatively, we could call SelectionDAGBuilder::getRoot before it, but that would create unnecessary serialization.
llvm-svn: 338085
properly calculate their folding set ID to allow them to be CSEd.
llvm-svn: 338080
The DAGCombiner has a system for ensuring all nodes are visited. It doesn't require an AddToWorkList for every node that is created by a combine.
llvm-svn: 338079
LowerDbgDeclare inserts a dbg.value before each use of an address
described by a dbg.declare. When inserting a dbg.value before a CallInst
use, however, it fails to append DW_OP_deref to the DIExpression.
The DW_OP_deref is needed to reflect the fact that a dbg.value describes
a source variable directly (as opposed to a dbg.declare, which relies on
pointer indirection).
This patch adds in the DW_OP_deref where needed. This results in the
correct values being shown during a debug session for a program compiled
with ASan and optimizations (see https://reviews.llvm.org/D49520). Note
that ConvertDebugDeclareToDebugValue is already correct -- no changes
there were needed.
One complication is that SelectionDAG is unable to distinguish between
direct and indirect frame-index (FRAMEIX) SDDbgValues. This patch also
fixes this long-standing issue in order to not regress integration tests
relying on the incorrect assumption that all frame-index SDDbgValues are
indirect. This is a necessary fix: the newly-added DW_OP_derefs cannot
be lowered properly otherwise. Basically the fix prevents a direct
SDDbgValue with DIExpression(DW_OP_deref) from being dereferenced twice
by a debugger. There were a handful of tests relying on this incorrect
"FRAMEIX => indirect" assumption which actually had incorrect
DW_AT_locations: these are all fixed up in this patch.
Testing:
- check-llvm, and an end-to-end test using lldb to debug an optimized
program.
- Existing unit tests for DIExpression::appendToStack fully cover the
new DIExpression::append utility.
- check-debuginfo (the debug info integration tests)
Differential Revision: https://reviews.llvm.org/D49454
llvm-svn: 338069
Summary:
A follow-up for D49266 / rL337166.
At least one of these cases is more canonical,
so we really do have to handle it.
https://godbolt.org/g/pkzP3X
https://rise4fun.com/Alive/pQyhZZ
We won't get to these cases with I1 being -1,
as that will be constant-folded to true or false.
I'm also not sure we actually hit the 'ule' case,
but I think the worst thing that could happen is that it ends up being dead code.
Reviewers: spatel, craig.topper, RKSimon, javed.absar, efriedma
Reviewed By: spatel
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D49497
llvm-svn: 338044
If the DAGCombiner's rotate matching was working as expected,
I don't think we'd see any test diffs here.
This sidesteps the issue of custom lowering for rotates raised in PR38243:
https://bugs.llvm.org/show_bug.cgi?id=38243
...by only dealing with legal operations.
llvm-svn: 337966
When VectorLegalizer::LegalizeOp creates a new SDValue after iterating
over its arguments, we need to refer to the same result number of the
new node that the original value used.
Reviewed by: cameron.mcinally
Differential Revision: https://reviews.llvm.org/D49805
llvm-svn: 337939
Add support for inline assembly with output operands that do not
naturally go in the register class they are constrained to (e.g. a double
in a 32-bit GPR, as in the PR).
llvm-svn: 337903
This avoids approx. 2 x 10^5 DenseMap insertions in both non-debug and
debug -O2 builds of the sqlite3 amalgamation.
llvm-svn: 337751
scalarizeVectorLoad creates MERGE_VALUES nodes which are immediately
decomposed in expandLoad. Elide the node in these cases.
llvm-svn: 337708
Differential Revision: https://reviews.llvm.org/D48809
llvm-svn: 337698
APInt::getZExtValue to 0 in a place where we can't be sure contents of the APInt fit in a uint64_t.
This is used on an extract vector element index, which in most cases is going to be an i32 or i64, and the element will be a valid element number. But it is possible to construct IR with a larger type and a large out-of-range value.
llvm-svn: 337652
of 2 number of elements.
The check for the shuffles usages probably isn't correct for non power of 2 vectors.
llvm-svn: 337651
Check for construction-time folding for incomplete AND nodes in
BackwardsPropagateMask.
Fixes PR38185.
Reviewers: RKSimon, samparker
Reviewed By: samparker
Subscribers: llvm-commits, hiraditya
Differential Revision: https://reviews.llvm.org/D49444
llvm-svn: 337563
When merging through a TokenFactor we need to check that the
load may be ordered such that no other aliasing memory operations may
happen. It is not sufficient to just check that the load is a member
of the chain token factor, as there may be an indirect chain. Require
that the load's chain has only one use.
This fixes PR37826.
Reviewers: spatel, davide, efriedma, craig.topper, RKSimon
Subscribers: hiraditya, llvm-commits
Differential Revision: https://reviews.llvm.org/D49388
llvm-svn: 337560
llvm-svn: 337518
Mirrors the existing exit path for f128, avoiding a crash later on.
Differential Revision: https://reviews.llvm.org/D49524
llvm-svn: 337506
We already knew A+(-B) is A-B in visitAdd. This does the opposite for visitSub.
llvm-svn: 337502
Summary:
If unfolding an SUnit results in a load or an operation using it that
already exists in the DAG, abort the unfold if they are already scheduled.
If not, make sure we don't add duplicate dependencies.
This fixes PR37916.
Reviewers: davide, eli.friedman, fhahn, bogner
Subscribers: MatzeB, hiraditya, llvm-commits
Differential Revision: https://reviews.llvm.org/D48666
llvm-svn: 337409
If we are only extracting vector elements via EXTRACT_VECTOR_ELT(s) we may be able to use SimplifyDemandedVectorElts to avoid unnecessary vector ops.
Differential Revision: https://reviews.llvm.org/D49262
llvm-svn: 337258
As discussed here:
http://lists.llvm.org/pipermail/llvm-dev/2018-May/123292.html
http://lists.llvm.org/pipermail/llvm-dev/2018-July/124400.html
We want to add rotate intrinsics because the IR expansion of that pattern is 4+ instructions,
and we can lose pieces of the pattern before it gets to the backend. Generalizing the operation
by allowing 2 different input values (plus the 3rd shift/rotate amount) gives us a "funnel shift"
operation which may also be a single hardware instruction.
Initially, I thought we needed to define new DAG nodes for these ops, and I spent time working
on that (much larger patch), but then I concluded that we don't need it. At least as a first
step, we have all of the backend support necessary to match these ops...because it was required.
And shepherding these through the IR optimizer is the primary concern, so the IR intrinsics are
likely all that we'll ever need.
There was also a question about converting the intrinsics to the existing ROTL/ROTR DAG nodes
(along with improving the oversized shift documentation). Again, I don't think that's strictly
necessary (as the test results here prove). That can be an efficiency improvement as a small
follow-up patch.
So all we're left with is documentation, definition of the IR intrinsics, and DAG builder support.
Differential Revision: https://reviews.llvm.org/D49242
llvm-svn: 337221
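A minimal C++ model of a 32-bit funnel shift left is sketched below, assuming the usual definition: concatenate the two inputs, shift left by the amount modulo the bit width, and keep the high half. The names `fshl32` and `rotateViaFunnel` are hypothetical; this is not the intrinsic's implementation.

```cpp
#include <cassert>
#include <cstdint>

// Funnel shift left: conceptually shift the 64-bit value hi:lo left by
// (amt mod 32) and return the upper 32 bits. The zero case is handled
// separately to avoid an undefined shift by 32.
std::uint32_t fshl32(std::uint32_t hi, std::uint32_t lo, std::uint32_t amt) {
  unsigned s = amt % 32;
  if (s == 0)
    return hi;
  return (hi << s) | (lo >> (32 - s));
}

// A rotate is the special case where both inputs are the same value.
std::uint32_t rotateViaFunnel(std::uint32_t x, std::uint32_t amt) {
  return fshl32(x, x, amt);
}
```

Passing the same value for both inputs gives a rotate, which is why a single generalized operation can cover both uses.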
llvm-svn: 337200
Summary:
PR38149: https://bugs.llvm.org/show_bug.cgi?id=38149
As discussed in https://reviews.llvm.org/D49179#1158957 and later,
the IR for 'check for [no] signed truncation' pattern can be improved:
https://rise4fun.com/Alive/gBf
^ that pattern will be produced by Implicit Integer Truncation sanitizer,
https://reviews.llvm.org/D48958 https://bugs.llvm.org/show_bug.cgi?id=21530
in signed case, therefore it is probably a good idea to improve it.
But the IR-optimal pattern does not lower efficiently, so we want to undo it.
This handles the simple pattern.
There is a second pattern with predicate and constants inverted.
NOTE: We do not check uses here; we always do the transform.
Reviewers: spatel, craig.topper, RKSimon, javed.absar
Reviewed By: spatel
Subscribers: kristof.beyls, llvm-commits
Differential Revision: https://reviews.llvm.org/D49266
llvm-svn: 337166
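The two ways of writing a "no signed truncation" check can be compared in a short C++ sketch. Which form counts as the IR-optimal one and which as the better-lowering one is an assumption here (add-and-compare versus sign-extend-and-compare, for 8 kept bits of a 32-bit value); the function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Assumed add/compare form: x fits in a signed 8-bit value exactly when
// x + 128 (wrapping) is below 256 as an unsigned comparison.
bool fitsI8AddCompare(std::uint32_t x) {
  return x + 128u < 256u;
}

// Assumed sign-extend form: sign-extend the low 8 bits back to 32 bits and
// compare with the original. The extension is done with the xor/subtract
// trick to avoid implementation-defined narrowing conversions.
bool fitsI8SignExtend(std::uint32_t x) {
  std::uint32_t low = x & 0xFFu;
  std::uint32_t sext = (low ^ 0x80u) - 0x80u;
  return sext == x;
}
```

Both predicates accept exactly the bit patterns of -128 through 127, so either can be substituted for the other.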
Summary:
If the high part of the load is not used the offset to the next element
will not be set correctly.
For example, on Sparc V8, the following code will read val2 from offset 4
instead of 8.
```
int val = __builtin_va_arg(va, long long);
int val2 = __builtin_va_arg(va, int);
```
Reviewers: jyknight
Reviewed By: jyknight
Subscribers: fedor.sergeev, jrtc27, llvm-commits
Differential Revision: https://reviews.llvm.org/D48595
llvm-svn: 337161