| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: This is a common pattern that arise when legalizing large integers operations. Only do it when Y + 1 cannot overflow as this would change the carry behavior of uaddo .
Reviewers: jyknight, nemanjai, mkuper, spatel, RKSimon, zvi, bkramer
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D32687
llvm-svn: 301922
|
|
|
|
|
|
|
|
|
|
| |
Summary: Common pattern when legalizing large integers operations. Similar to D32687, when the carry isn't used.
Reviewers: jyknight, nemanjai, mkuper, spatel, RKSimon, zvi, bkramer
Differential Revision: https://reviews.llvm.org/D32738
llvm-svn: 301919
|
|
|
|
|
|
|
|
|
|
|
|
| |
argument types (PR31088)
PR31088 demonstrated that we were assuming that only integers require promotion from <1 x iX> types, when in fact float types may require it as well - in this case half floats.
This patch adds support for extension/truncation for both integer and float types.
Differential Revision: https://reviews.llvm.org/D32391
llvm-svn: 301910
|
|
|
|
|
|
|
|
|
|
|
|
| |
The existing code only looks at half of the tree when matching bswap + rol patterns ending in an OR tree (as opposed to a cascade).
Patch originally introduced by Jim Lewis.
Submitted on the behalf of Dinar Temirbulatov.
Differential Revision: https://reviews.llvm.org/D32039
llvm-svn: 301907
|
|
|
|
|
|
|
|
|
|
| |
This tracks whether MaxCallFrameSize is computed yet. Ideally we would
assert and fail when the value is queried before it is computed, however
this fails various targets that need to be fixed first.
Differential Revision: https://reviews.llvm.org/D32570
llvm-svn: 301851
|
|
|
|
|
|
| |
This relands r301424.
llvm-svn: 301812
|
|
|
|
|
|
|
|
| |
Patch by: Gergely Angeli!
Differential Revision: https://reviews.llvm.org/D31936
llvm-svn: 301807
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
for CTTZ/CTLZ operations.
This is the SelectionDAG version of D32521. If know where at least one 1 is located in the input to these intrinsics we can place an upper bound on the number of bits needed to represent the count and thus increase the number of known zeros in the output.
I think we can also refine this further for CTTZ_UNDEF/CTLZ_UNDEF by assuming that the answer will never be BitWidth. I've left this out for now because it caused other test failures across multiple targets. Usually because of turning ADD into OR based on this new information.
I'll fix CTPOP in a future patch.
Differential Revision: https://reviews.llvm.org/D32692
llvm-svn: 301806
|
|
|
|
|
|
|
|
| |
This removes BinaryWithFlagsSDNode, and flags are now all passed by value.
Differential Revision: https://reviews.llvm.org/D32527
llvm-svn: 301803
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
(PR14657)
We discussed shrinking/widening of selects in IR in D26556, and I'll try to get back to that
patch eventually. But I'm hoping that this transform is less iffy in the DAG where we can check
legality of the select that we want to produce.
A few things to note:
1. We can't wait until after legalization and do this generically because (at least in the x86
tests from PR14657), we'll have PACKSS and bitcasts in the pattern.
2. This might benefit more of the SSE codegen if we lifted the legal-or-custom requirement, but
that requires a closer look to make sure we don't end up worse.
3. There's a 'vblendv' opportunity that we're missing that results in andn/and/or in some cases.
That should be fixed next.
4. I'm assuming that AVX1 offers the worst of all worlds wrt uneven ISA support with multiple
legal vector sizes, but if there are other targets like that, we should add more tests.
5. There's a codegen miracle in the multi-BB tests from PR14657 (the gcc auto-vectorization tests):
despite IR that is terrible for the target, this patch allows us to generate the optimal loop
code because something post-ISEL is hoisting the splat extends above the vector loops.
Differential Revision: https://reviews.llvm.org/D32620
llvm-svn: 301781
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
uaddo/addcarry
Summary: As per discution on how to get better codegen an large int legalization, it became clear that using a glue for the carry was preventing several desirable optimizations. Passing the carry down as a value allow for more flexibility.
Reviewers: jyknight, nemanjai, mkuper, spatel, RKSimon, zvi, bkramer
Subscribers: igorb, llvm-commits
Differential Revision: https://reviews.llvm.org/D29872
llvm-svn: 301775
|
|
|
|
|
|
| |
setLowBits where possible.
llvm-svn: 301768
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Predicate<> now has a field to indicate how often it must be recomputed.
Currently, there are two frequencies, per-module (RecomputePerFunction==0)
and per-function (RecomputePerFunction==1). Per-function predicates are
currently recomputed more frequently than necessary since the only predicate
in this category is cheap to test. Per-module predicates are now computed in
getSubtargetImpl() while per-function predicates are computed in selectImpl().
Tablegen now manages the PredicateBitset internally. It should only be
necessary to add the required includes.
Also fixed a problem revealed by the test case where
constrainSelectedInstRegOperands() would attempt to tie operands that
BuildMI had already tied.
Reviewers: ab, qcolombet, t.p.northover, rovka, aditya_nandakumar
Reviewed By: rovka
Subscribers: kristof.beyls, igorb, llvm-commits
Differential Revision: https://reviews.llvm.org/D32491
llvm-svn: 301750
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
negative/nonnegative number and add methods for changing the negative/nonnegative state
Summary: This patch adds isNegative, isNonNegative for querying whether the sign bit is known. It also adds makeNegative and makeNonNegative for controlling the sign bit.
Reviewers: RKSimon, spatel, davide
Reviewed By: RKSimon
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D32651
llvm-svn: 301747
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
AttributeList"
This broke the Clang build. (Clang-side patch missing?)
Original commit message:
> [IR] Make add/remove Attributes use AttrBuilder instead of
> AttributeList
>
> This change cleans up call sites and avoids creating temporary
> AttributeList objects.
>
> NFC
llvm-svn: 301712
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fixes the issue highlighted in
http://lists.llvm.org/pipermail/cfe-dev/2014-June/037500.html.
The DW_AT_decl_file and DW_AT_decl_line attributes on namespaces can
prevent LLVM from uniquing types that are in the same namespace. They
also don't carry any meaningful information.
rdar://problem/17484998
Differential Revision: https://reviews.llvm.org/D32648
llvm-svn: 301706
|
|
|
|
|
|
|
|
|
| |
When a PHI operand has a subregister, create a COPY instead of simply
replacing the PHI output with the input it.
Differential Revision: https://reviews.llvm.org/D32650
llvm-svn: 301699
|
|
|
|
|
|
|
|
|
| |
This change cleans up call sites and avoids creating temporary
AttributeList objects.
NFC
llvm-svn: 301697
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The method is called "get *Param* Alignment", and is only used for
return values exactly once, so it should take argument indices, not
attribute indices.
Avoids confusing code like:
IsSwiftError = CS->paramHasAttr(ArgIdx, Attribute::SwiftError);
Alignment = CS->getParamAlignment(ArgIdx + 1);
Add getRetAlignment to handle the one case in Value.cpp that wants the
return value alignment.
This is a potentially breaking change for out-of-tree backends that do
their own call lowering.
llvm-svn: 301682
|
|
|
|
| |
llvm-svn: 301681
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Adds a new method finalizeLowering to TargetLoweringBase. This is in
preparation for an upcoming commit.
This function is meant for target specific adjustments to
MachineFrameInfo or register reservations.
Move the freezeRegisters() and the hasCopyImplyingStackAdjustment()
handling into the new function to prove the concept. As an added bonus
GlobalISel no longer missed the hasCopyImplyingStackAdjustment()
handling with this.
Differential Revision: https://reviews.llvm.org/D32621
llvm-svn: 301679
|
|
|
|
| |
llvm-svn: 301673
|
|
|
|
|
|
|
|
|
|
|
| |
This eliminates many extra 'Idx' induction variables in loops over
arguments in CodeGen/ and Target/. It also reduces the number of places
where we assume that ReturnIndex is 0 and that we should add one to
argument numbers to get the corresponding attribute list index.
NFC
llvm-svn: 301666
|
|
|
|
| |
llvm-svn: 301665
|
|
|
|
| |
llvm-svn: 301656
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
The motivation example is like below which has 13 cases but only 2 distinct targets
```
lor.lhs.false2: ; preds = %if.then
switch i32 %Status, label %if.then27 [
i32 -7012, label %if.end35
i32 -10008, label %if.end35
i32 -10016, label %if.end35
i32 15000, label %if.end35
i32 14013, label %if.end35
i32 10114, label %if.end35
i32 10107, label %if.end35
i32 10105, label %if.end35
i32 10013, label %if.end35
i32 10011, label %if.end35
i32 7008, label %if.end35
i32 7007, label %if.end35
i32 5002, label %if.end35
]
```
which is compiled into a balanced binary tree like this on AArch64 (similar on X86)
```
.LBB853_9: // %lor.lhs.false2
mov w8, #10012
cmp w19, w8
b.gt .LBB853_14
// BB#10: // %lor.lhs.false2
mov w8, #5001
cmp w19, w8
b.gt .LBB853_18
// BB#11: // %lor.lhs.false2
mov w8, #-10016
cmp w19, w8
b.eq .LBB853_23
// BB#12: // %lor.lhs.false2
mov w8, #-10008
cmp w19, w8
b.eq .LBB853_23
// BB#13: // %lor.lhs.false2
mov w8, #-7012
cmp w19, w8
b.eq .LBB853_23
b .LBB853_3
.LBB853_14: // %lor.lhs.false2
mov w8, #14012
cmp w19, w8
b.gt .LBB853_21
// BB#15: // %lor.lhs.false2
mov w8, #-10105
add w8, w19, w8
cmp w8, #9 // =9
b.hi .LBB853_17
// BB#16: // %lor.lhs.false2
orr w9, wzr, #0x1
lsl w8, w9, w8
mov w9, #517
and w8, w8, w9
cbnz w8, .LBB853_23
.LBB853_17: // %lor.lhs.false2
mov w8, #10013
cmp w19, w8
b.eq .LBB853_23
b .LBB853_3
.LBB853_18: // %lor.lhs.false2
mov w8, #-7007
add w8, w19, w8
cmp w8, #2 // =2
b.lo .LBB853_23
// BB#19: // %lor.lhs.false2
mov w8, #5002
cmp w19, w8
b.eq .LBB853_23
// BB#20: // %lor.lhs.false2
mov w8, #10011
cmp w19, w8
b.eq .LBB853_23
b .LBB853_3
.LBB853_21: // %lor.lhs.false2
mov w8, #14013
cmp w19, w8
b.eq .LBB853_23
// BB#22: // %lor.lhs.false2
mov w8, #15000
cmp w19, w8
b.ne .LBB853_3
```
However, the inline cost model estimates the cost to be linear with the number
of distinct targets and the cost of the above switch is just 2 InstrCosts.
The function containing this switch is then inlined about 900 times.
This change use the general way of switch lowering for the inline heuristic. It
etimate the number of case clusters with the suitability check for a jump table
or bit test. Considering the binary search tree built for the clusters, this
change modifies the model to be linear with the size of the balanced binary
tree. The model is off by default for now :
-inline-generic-switch-cost=false
This change was originally proposed by Haicheng in D29870.
Reviewers: hans, bmakam, chandlerc, eraman, haicheng, mcrosier
Reviewed By: hans
Subscribers: joerg, aemerson, llvm-commits, rengolin
Differential Revision: https://reviews.llvm.org/D31085
llvm-svn: 301649
|
|
|
|
|
|
|
|
| |
ASHR and INSERT_VECTOR_ELT (reapplied)
Reapplied r299221 after fix for nondeterminism in ThinLTO builder (rL301599), with extra check for implicit truncation of inserted element.
llvm-svn: 301644
|
|
|
|
|
|
| |
struct.
llvm-svn: 301626
|
|
|
|
|
|
|
|
|
|
|
|
| |
simplifyDemandedBits
This patch replaces the separate APInts for KnownZero/KnownOne with a single KnownBits struct. This is similar to what was done to ValueTracking's version recently.
This is largely a mechanical transformation from KnownZero to Known.Zero.
Differential Revision: https://reviews.llvm.org/D32569
llvm-svn: 301620
|
|
|
|
|
|
| |
This patch uses various APInt methods to reduce the number of temporary APInts. These were all found while working through converting SelectionDAG's computeKnownBits to also use the KnownBits struct recently added to the ValueTracking version.
llvm-svn: 301618
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
In some cases LLVM (especially the SLP vectorizer) will create vectors
that are 256 bytes (or larger). Given that this is intentional[0] is
likely to get more common, this patch updates the StackMap binary
format to deal with the spill locations for said vectors.
This change also bumps the stack map version from 2 to 3.
[0]: https://reviews.llvm.org/D32533#738350
Reviewers: reames, kavon, skatkov, javed.absar
Subscribers: mcrosier, nemanjai, llvm-commits
Differential Revision: https://reviews.llvm.org/D32629
llvm-svn: 301615
|
|
|
|
| |
llvm-svn: 301612
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
The type of the target frame index is intptr, not the type of the value we're
going to store into it. Without this change we crash in the attached test case
when trying to type-legalize a TargetFrameIndex.
Patchpoint lowering types the target frame index as intptr as well.
Reviewers: reames, bogner, arsenm
Subscribers: arsenm, mcrosier, llvm-commits
Differential Revision: https://reviews.llvm.org/D32256
llvm-svn: 301566
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We have a lot of very similarly named classes related to
dealing with module debug info. This patch has NFC, it just
renames some classes to be more descriptive (albeit slightly
more to type). The mapping from old to new class names is as
follows:
Old | New
ModInfo | DbiModuleDescriptor
ModuleSubstream | ModuleDebugFragment
ModStream | ModuleDebugStream
With the corresponding Builder classes renamed accordingly.
Differential Revision: https://reviews.llvm.org/D32506
llvm-svn: 301555
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
DISubprogram currently has 10 pointer operands, several of which are
often nullptr. This patch reduces the amount of memory allocated by
DISubprogram by rearranging the operands such that containing type,
template params, and thrown types come last, and are only allocated
when they are non-null (or followed by non-null operands).
This patch also eliminates the entirely unused DisplayName operand.
This saves up to 4 pointer operands per DISubprogram. (I tried
measuring the effect on peak memory usage on an LTO link of an X86
llc, but the results were very noisy).
This reapplies r301498 with an attempted workaround for g++.
Differential Revision: https://reviews.llvm.org/D32560
llvm-svn: 301501
|
|
|
|
|
|
| |
This reverts commit r301498 while investigating bot breakage.
llvm-svn: 301499
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
DISubprogram currently has 10 pointer operands, several of which are
often nullptr. This patch reduces the amount of memory allocated by
DISubprogram by rearranging the operands such that containing type,
template params, and thrown types come last, and are only allocated
when they are non-null (or followed by non-null operands).
This patch also eliminates the entirely unused DisplayName operand.
This saves up to 4 pointer operands per DISubprogram. (I tried
measuring the effect on peak memory usage on an LTO link of an X86
llc, but the results were very noisy).
llvm-svn: 301498
|
|
|
|
|
|
|
| |
Move implementation of the MachineFrameInfo class into
MachineFrameInfo.cpp
llvm-svn: 301494
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
For Swift we would like to be able to encode the error types that a
function may throw, so the debugger can display them alongside the
function's return value when finish-ing a function.
DWARF defines DW_TAG_thrown_type (intended to be used for C++ throw()
declarations) that is a perfect fit for this purpose. This patch wires
up support for DW_TAG_thrown_type in LLVM by adding a list of thrown
types to DISubprogram.
To offset the cost of the extra pointer, there is a follow-up patch
that turns DISubprogram into a variable-length node.
rdar://problem/29481673
Differential Revision: https://reviews.llvm.org/D32559
llvm-svn: 301489
|
|
|
|
|
|
|
|
|
|
|
|
| |
Besides better codegen, the motivation is to be able to canonicalize this pattern
in IR (currently we don't) knowing that the backend is prepared for that.
This may also allow removing code for special constant cases in
DAGCombiner::foldSelectOfConstants() that was added in D30180.
Differential Revision: https://reviews.llvm.org/D31944
llvm-svn: 301457
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
computeKnownBits
This patch introduces a new KnownBits struct that wraps the two APInt used by computeKnownBits. This allows us to treat them as more of a unit.
Initially I've just altered the signatures of computeKnownBits and InstCombine's simplifyDemandedBits to pass a KnownBits reference instead of two separate APInt references. I'll do similar to the SelectionDAG version of computeKnownBits/simplifyDemandedBits as a separate patch.
I've added a constructor that allows initializing both APInts to the same bit width with a starting value of 0. This reduces the repeated pattern of initializing both APInts. Once place default constructed the APInts so I added a default constructor for those cases.
Going forward I would like to add more methods that will work on the pairs. For example trunc, zext, and sext occur on both APInts together in several places. We should probably add a clear method that can be used to clear both pieces. Maybe a method to check for conflicting information. A method to return (Zero|One) so we don't write it out everywhere. Maybe a method for (Zero|One).isAllOnesValue() to determine if all bits are known. I'm sure there are many other methods we can come up with.
Differential Revision: https://reviews.llvm.org/D32376
llvm-svn: 301432
|
|
|
|
|
|
|
|
|
|
|
|
| |
Commits were:
"Use WeakVH instead of WeakTrackingVH in AliasSetTracker's UnkownInsts"
"Add a new WeakVH value handle; NFC"
"Rename WeakVH to WeakTrackingVH; NFC"
The changes assumed pointers are 8 byte aligned on all architectures.
llvm-svn: 301429
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
I plan to use WeakVH to mean "nulls itself out on deletion, but does
not track RAUW" in a subsequent commit.
Reviewers: dblaikie, davide
Reviewed By: davide
Subscribers: arsenm, mehdi_amini, mcrosier, mzolotukhin, jfb, llvm-commits, nhaehnle
Differential Revision: https://reviews.llvm.org/D32266
llvm-svn: 301424
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Build vectors have magical truncation powers, so we have things like this:
v4i1 = BUILD_VECTOR Constant:i32<1>, Constant:i32<1>, Constant:i32<1>, Constant:i32<1>
v4i16 = BUILD_VECTOR Constant:i32<1>, Constant:i32<1>, Constant:i32<1>, Constant:i32<1>
If we don't truncate the splat node returned by getConstantSplatNode(), then we won't find
truth when ZeroOrNegativeOneBooleanContent is the rule.
Differential Revision: https://reviews.llvm.org/D32505
llvm-svn: 301408
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
For targets that don't have ISD::MULHS or ISD::SMUL_LOHI for the type
and the double width type is illegal, then the two operands are
sign extended to twice their size then multiplied to check for overflow.
The extended upper halves were mismatched causing an incorrect result.
This fixes the mismatch.
A test was added for ARM V6-M where the bug was detected.
Patch by James Duley.
Differential Revision: https://reviews.llvm.org/D31807
llvm-svn: 301404
|
|
|
|
| |
llvm-svn: 301403
|
|
|
|
|
|
| |
0bH is now supported in MS asm.
llvm-svn: 301390
|
|
|
|
| |
llvm-svn: 301383
|
|
|
|
| |
llvm-svn: 301366
|
|
|
|
|
|
|
|
| |
The fix consists of resetting LocationKind when addMachineRegExpression fails.
rdar://problem/31803010
llvm-svn: 301351
|