| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
|
|
|
|
|
|
|
|
| |
CRC and GINV ASE require revision 6, Virtualization requires revision 5.
Print a warning when revision is older than required.
Differential Revision: https://reviews.llvm.org/D48843
llvm-svn: 336296
|
|
|
|
|
|
|
|
|
|
|
| |
We want to run the Machine Scheduler instead of the List Scheduler after RA.
Checked with a performance run on a Power 9 machine with SPEC 2006 and while
some benchmarks improved and others degraded the geomean was slightly improved
with the Machine Scheduler.
Differential Revision: https://reviews.llvm.org/D45265
llvm-svn: 336295
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We have bailout hacks based on min/max in various places in instcombine
that shouldn't be necessary. The affected test was added for:
D48930
...which is a consequence of the improvement in:
D48584 (https://reviews.llvm.org/rL336172)
I'm assuming the visitTrunc bailout in this patch was added specifically
to avoid a change from SimplifyDemandedBits, so I'm just moving that
below the EvaluateInDifferentType optimization. A narrow min/max is still
a min/max.
llvm-svn: 336293
|
|
|
|
| |
llvm-svn: 336291
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Support for negative immediates was implemented in
https://reviews.llvm.org/rL298380, however few instruction options were missing.
This change adds negative immediates support and respective tests
for the following:
ADD
ADDS
ADDS.W
AND.W
ANDS
BIC.W
BICS
BICS.W
SUB
SUBS
SUBS.W
Differential Revision: https://reviews.llvm.org/D48649
llvm-svn: 336286
|
|
|
|
|
|
|
|
| |
getOutlininingCandidateInfo -> getOutliningCandidateInfo
Differential Revision: https://reviews.llvm.org/D48867
llvm-svn: 336285
|
|
|
|
| |
llvm-svn: 336284
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
ThinLTO cache file access times are used for expiration based pruning
and since Vista, file access times are not updated by Windows by
default:
https://blogs.technet.microsoft.com/filecab/2006/11/07/disabling-last-access-time-in-windows-vista-to-improve-ntfs-performance
This means on Windows, cache files are currently being pruned from
creation time. This change manually updates cache files that are
accessed by ThinLTO, when on Windows.
Patch by Owen Reynolds.
Differential Revision: https://reviews.llvm.org/D47266
llvm-svn: 336276
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch adds both a vector and an immediate form, e.g.
- Vector form:
subr z0.h, p0/m, z0.h, z1.h
subtract active elements of z0 from z1, and store the result in z0.
- Immediate form:
subr z0.h, z0.h, #255
subtract elements of z0, and store the result in z0.
llvm-svn: 336274
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Includes instructions to read the First-Faulting Register (FFR):
- RDFFR (unpredicated)
rdffr p0.b
- RDFFR (predicated)
rdffr p0.b, p0/z
- RDFFRS (predicated, sets condition flags)
rdffr p0.b, p0/z
Includes instructions to set/write the FFR:
- SETFFR (no arguments, sets the FFR to all true)
setffr
- WRFFR (unpredicated)
wrffr p0.b
llvm-svn: 336267
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The variants added are:
- fcvt (FP convert precision)
- scvtf (signed int -> FP)
- ucvtf (unsigned int -> FP)
- fcvtzs (FP -> signed int (round to zero))
- fcvtzu (FP -> unsigned int (round to zero))
For example:
fcvt z0.h, p0/m, z0.s (single- to half-precision FP)
scvtf z0.h, p0/m, z0.s (32-bit int to half-precision FP)
ucvtf z0.h, p0/m, z0.s (32-bit unsigned int to half-precision FP)
fcvtzs z0.s, p0/m, z0.h (half-precision FP to 32-bit int)
fcvtzu z0.s, p0/m, z0.h (half-precision FP to 32-bit unsigned int)
llvm-svn: 336265
|
|
|
|
|
|
|
|
|
| |
When creating `phi` instructions to resume at the scalar part of the loop,
copy the DebugLoc from the original phi over to the new one.
Differential Revision: https://reviews.llvm.org/D48769
llvm-svn: 336256
|
|
|
|
|
|
|
|
|
|
|
| |
When zext is EvaluatedInDifferentType, InstCombine
drops the dbg.value intrinsic. This patch tries to
preserve said DI, by inserting the zext's old DI in the
resulting instruction. (Only for integer type for now)
Differential Revision: https://reviews.llvm.org/D48331
llvm-svn: 336254
|
|
|
|
|
|
|
|
| |
We were only doing this for basic blends, despite shuffle lowering now being good enough to handle more complex blends. This means that the two v8i16 splat shifts are performed in parallel instead of serially as the general shift case.
Reapplied with a fixed (extra null tests) version of rL336113 after reversion in rL336189 - extra test case added at rL336247.
llvm-svn: 336250
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
SVE overloads the AArch64 PSTATE condition flags and introduces
a set of condition code aliases for the assembler. The
details are described in section 2.2 of the architecture
reference manual supplement for SVE.
In short:
SVE alias => AArch64 name
--------------------------
NONE => EQ
ANY => NE
NLAST => HS
LAST => LO
FIRST => MI
NFRST => PL
PMORE => HI
PLAST => LS
TCONT => GE
TSTOP => LT
Reviewers: rengolin, fhahn, SjoerdMeijer, samparker, javed.absar
Reviewed By: fhahn
Differential Revision: https://reviews.llvm.org/D48869
llvm-svn: 336245
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The following code pattern:
mov %rax, %rcx
test %rax, %rax
%rax = ....
je throw_npe
mov(%rcx), %r9
mov(%rax), %r10
gets transformed into the following incorrect code after implicit null check pass:
mov %rax, %rcx
%rax = ....
faulting_load_op("movl (%rax), %r10", throw_npe)
mov(%rcx), %r9
For implicit null check pass, if the register that is checked for null value (ie, the register used in the 'test' instruction) is written into before the condition jump, we should avoid doing the optimization.
Patch by Surya Kumari Jangala!
Differential Revision: https://reviews.llvm.org/D48627
Reviewed By: skatkov
llvm-svn: 336241
|
|
|
|
|
|
| |
Loads and stores less than 64-bits are already atomic, this adds support for a special case thereof. This needs to be expanded.
llvm-svn: 336236
|
|
|
|
| |
llvm-svn: 336232
|
|
|
|
|
|
| |
Vectorization can create them.
llvm-svn: 336227
|
|
|
|
|
|
| |
pasted. NFC
llvm-svn: 336226
|
|
|
|
| |
llvm-svn: 336223
|
|
|
|
|
|
| |
definitions
llvm-svn: 336222
|
|
|
|
|
|
|
|
| |
This patch adds a new token type specifically for (%dx). We will now always create this token when we parse (%dx). After all operands have been parsed, if the mnemonic is in/out we'll morph this token to a regular register token. Otherwise we keep it as the special DX token which won't match any instructions.
This removes the need for passing Mnemonic through the parsing functions. It also seems closer to gas where when its used on the wrong instruction it just gets diagnosed as an invalid operand rather than a bad memory address.
llvm-svn: 336218
|
|
|
|
|
|
|
|
| |
This might make the error message added in r335668 unneeded, but I'm not sure yet.
The check for RIP is technically unnecessary since RIP is in GR64, but that fact is kind of surprising so be explicit.
llvm-svn: 336217
|
|
|
|
| |
llvm-svn: 336216
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
As the test diffs show, the current users of getBinOpIdentity()
are InstCombine and Reassociate. SLP vectorizer is a candidate
for using this functionality too (D28907).
The InstCombine shuffle improvements are part of the planned
enhancements noted in D48830.
InstCombine actually has several other uses of getBinOpIdentity()
via SimplifyUsingDistributiveLaws(), but we don't call that for
any FP ops. Fixing that might be another part of removing the
custom reassociation in InstCombine that is only done for fadd+fmul.
llvm-svn: 336215
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The variants added in this patch are:
- Predicated Complex floating point ADD with rotate, e.g.
fcadd z0.h, p0/m, z0.h, z1.h, #90
- Predicated Complex floating point MLA with rotate, e.g.
fcmla z0.h, p0/m, z1.h, z2.h, #180
- Unpredicated Complex floating point MLA with rotate (indexed operand), e.g.
fcmla z0.h, p0/m, z1.h, z2.h[0], #180
Reviewers: rengolin, fhahn, SjoerdMeijer, samparker, javed.absar
Reviewed By: fhahn
Differential Revision: https://reviews.llvm.org/D48824
llvm-svn: 336210
|
|
|
|
|
|
|
|
|
|
| |
unselectable stores.
r336120 resulted in falling back to SelectionDAG more often due to the G_STORE
MMOs not matching the vreg size. This fixes that by explicitly any-extending the
value.
llvm-svn: 336209
|
|
|
|
|
|
| |
Suggested in review for D48698.
llvm-svn: 336207
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Unpredicated FP-multiply of SVE vector with a vector-element given by
vector[index], for example:
fmul z0.s, z1.s, z2.s[0]
which performs an unpredicated FP-multiply of all 32-bit elements in
'z1' with the first element from 'z2'.
This patch adds restricted register classes for SVE vectors:
ZPR_3b (only z0..z7 are allowed) - for indexed vector of 16/32-bit elements.
ZPR_4b (only z0..z15 are allowed) - for indexed vector of 64-bit elements.
Reviewers: rengolin, fhahn, SjoerdMeijer, samparker, javed.absar
Reviewed By: fhahn
Differential Revision: https://reviews.llvm.org/D48823
llvm-svn: 336205
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The patch includes support for the following instructions:
ABS z0.h, p0/m, z0.h
NEG z0.h, p0/m, z0.h
(S|U)XTB z0.h, p0/m, z0.h
(S|U)XTB z0.s, p0/m, z0.s
(S|U)XTB z0.d, p0/m, z0.d
(S|U)XTH z0.s, p0/m, z0.s
(S|U)XTH z0.d, p0/m, z0.d
(S|U)XTW z0.d, p0/m, z0.d
llvm-svn: 336204
|
|
|
|
|
|
| |
Now that D45806 has landed, we can re-enable support for MIN_SIGNED_VALUE in the sdiv by pow2-constant code
llvm-svn: 336198
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is the last significant change suggested in PR37806:
https://bugs.llvm.org/show_bug.cgi?id=37806#c5
...though there are several follow-ups noted in the code comments
in this patch to complete this transform.
It's possible that a binop feeding a select-shuffle has been eliminated
by earlier transforms (or the code was just written like this in the 1st
place), so we'll fail to match the patterns that have 2 binops from:
D48401,
D48678,
D48662,
D48485.
In that case, we can try to materialize identity constants for the remaining
binop to fill in the "ghost" lanes of the vector (where we just want to pass
through the original values of the source operand).
I added comments to ConstantExpr::getBinOpIdentity() to show planned follow-ups.
For now, we only handle the 5 commutative integer binops (add/mul/and/or/xor).
Differential Revision: https://reviews.llvm.org/D48830
llvm-svn: 336196
|
|
|
|
|
|
|
|
|
|
| |
With a view to support parallel operations that have their results
stored to memory, refactor the consecutive access helper out so it
could support stores instructions.
Differential Revision: https://reviews.llvm.org/D48872
llvm-svn: 336195
|
|
|
|
| |
llvm-svn: 336194
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This adds the following system registers:
- RAS registers,
- MPAM registers,
- Activitiy monitor registers,
- Trace Extension registers,
- Timing insensitivity of data processing instructions,
- Enhanced Support for Nested Virtualization.
Differential Revision: https://reviews.llvm.org/D48871
llvm-svn: 336193
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
When salvaging a dbg.declare/dbg.addr we should not add
DW_OP_stack_value to the DIExpression
(see test/Transforms/InstCombine/salvage-dbg-declare.ll).
Consider this example
%vla = alloca i32, i64 2
call void @llvm.dbg.declare(metadata i32* %vla, metadata !1, metadata !DIExpression())
Instcombine will turn it into
%vla1 = alloca [2 x i32]
%vla1.sub = getelementptr inbounds [2 x i32], [2 x i32]* %vla, i64 0, i64 0
call void @llvm.dbg.declare(metadata [2 x i32]* %vla1.sub, metadata !19, metadata !DIExpression())
If the GEP can be eliminated, then the dbg.declare will be salvaged
and we should get
%vla1 = alloca [2 x i32]
call void @llvm.dbg.declare(metadata [2 x i32]* %vla1, metadata !19, metadata !DIExpression())
The problem was that salvageDebugInfo did not recognize dbg.declare
as being indirect (%vla1 points to the value, it does not hold the
value), so we incorrectly got
call void @llvm.dbg.declare(metadata [2 x i32]* %vla1, metadata !19, metadata !DIExpression(DW_OP_stack_value))
I also made sure that llvm::salvageDebugInfo and
DIExpression::prependOpcodes do not add DW_OP_stack_value to
the DIExpression in case no new operands are added to the
DIExpression. That way we avoid to, unneccessarily, turn a
register location expression into an implicit location expression
in some situations (see test11 in test/Transforms/LICM/sinking.ll).
Reviewers: aprantl, vsk
Reviewed By: aprantl, vsk
Subscribers: JDevlieghere, llvm-commits
Differential Revision: https://reviews.llvm.org/D48837
llvm-svn: 336191
|
|
|
|
|
|
| |
This reverts commit r336113. It causes crashes.
llvm-svn: 336189
|
|
|
|
|
|
|
|
|
|
| |
The variants added are:
signed Saturating ADD/SUB (immediate) e.g. sqadd z0.h, z0.h, #42
unsigned Saturating ADD/SUB (immediate) e.g. uqadd z0.h, z0.h, #42
signed Saturating ADD/SUB (vectors) e.g. sqadd z0.h, z0.h, z1.h
unsigned Saturating ADD/SUB (vectors) e.g. uqadd z0.h, z0.h, z1.h
llvm-svn: 336186
|
|
|
|
|
|
|
|
|
|
|
| |
Lower more than 4 arguments using stack. This patch targets MIPS32.
It supports only functions with arguments of type i32.
Patch by Petar Avramovic.
Differential Revision: https://reviews.llvm.org/D47934
llvm-svn: 336185
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
unswitching loops.
Original patch trying to address this was sent in D47624, but that
didn't quite handle things correctly. There are two key principles used
to select whether and how to invalidate SCEV-cached information about
loops:
1) We must invalidate any info SCEV has cached before unswitching as we
may change (or destroy) the loop structure by the act of unswitching,
and make it hard to recover everything we want to invalidate within
SCEV.
2) We need to invalidate all of the loops whose CFGs are mutated by the
unswitching. Notably, this isn't the *entire* loop nest, this is
every loop contained by the outermost loop reached by an exit block
relevant to the unswitch.
And we need to do this even when doing trivial unswitching.
I've added more focused tests that directly check that SCEV starts off
with imprecise information and after unswitching (and simplifying
instructions) re-querying SCEV will produce precise information. These
tests also specifically work to check that an *outer* loop's information
becomes precise.
However, the testing here is still a bit imperfect. Crafting test cases
that reliably fail to be analyzed by SCEV before unswitching and succeed
afterward proved ... very, very hard. It took me several hours and
careful work to build these, and I'm not optimistic about necessarily
coming up with more to cover more elaborate possibilities. Fortunately,
the code pattern we are testing here in the pass is really
straightforward and reliable.
Thanks to Max Kazantsev for the initial work on this as well as the
review, and to Hal Finkel for helping me talk through approaches to test
this stuff even if it didn't come to much.
Differential Revision: https://reviews.llvm.org/D47624
llvm-svn: 336183
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Contains the following variants:
- Compare with (elements from) other vector
instructions: fcmeq, fcmgt, fcmge, fcmne, fcmuo.
aliases: fcmle, fcmlt.
e.g. fcmle p0.h, p0/z, z0.h, z1.h => fcmge p0.h, p0/z, z1.h, z0.h
- Compare absolute values with (absolute values from) other vector.
instructions: facge, facgt.
aliases: facle, faclt.
e.g. facle p0.h, p0/z, z0.h, z1.h => facge p0.h, p0/z, z1.h, z0.h
- Compare vector elements with #0.0
instructions: fcmeq, fcmgt, fcmge, fcmle, fcmlt, fcmne.
e.g. fcmle p0.h, p0/z, z0.h, #0.0
llvm-svn: 336182
|
|
|
|
|
|
|
|
|
|
| |
DbgLabelInst has no address as its operands.
Differential Revision: https://reviews.llvm.org/D46738
Patch by Hsiangkai Wang.
llvm-svn: 336176
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch changes order of transform in InstCombineCompares to avoid
performing transforms based on ranges which produce complex bit arithmetics
before more simple things (like folding with constants) are done. See PR37636
for the motivating example.
Differential Revision: https://reviews.llvm.org/D48584
Reviewed By: spatel, lebedev.ri
llvm-svn: 336172
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This patch is the first in a series of patches related to the [[ http://lists.llvm.org/pipermail/llvm-dev/2018-June/123883.html | RFC - A new dominator tree updater for LLVM ]].
This patch introduces the DomTreeUpdater class, which provides a cleaner API to perform updates on available dominator trees (none, only DomTree, only PostDomTree, both) using different update strategies (eagerly or lazily) to simplify the updating process.
—Prior to the patch—
- Directly calling update functions of DominatorTree updates the data structure eagerly while DeferredDominance does updates lazily.
- DeferredDominance class cannot be used when a PostDominatorTree also needs to be updated.
- Functions receiving DT/DDT need to branch a lot which is currently necessary.
- Functions using both DomTree and PostDomTree need to call the update function separately on both trees.
- People need to construct an additional DeferredDominance class to use functions only receiving DDT.
—After the patch—
Patch by Chijun Sima <simachijun@gmail.com>.
Reviewers: kuhar, brzycki, dmgreen, grosser, davide
Reviewed By: kuhar, brzycki
Author: NutshellySima
Subscribers: vsk, mgorny, llvm-commits
Differential Revision: https://reviews.llvm.org/D48383
llvm-svn: 336163
|
|
|
|
|
|
|
|
|
|
|
| |
other on unqualified max_align_t.
I'll take another stab at this tomorrow. Any ideas for fixing this would be appreciated!
http://lab.llvm.org:8011/builders/lld-x86_64-darwin13/builds/23071/steps/build_Lld/logs/stdio
http://lab.llvm.org:8011/builders/clang-with-thin-lto-ubuntu/builds/11185/steps/build-stage1-compiler/logs/stdio
llvm-svn: 336162
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
When we import an alias (which will import a copy of the aliasee), but
aren't going to import the aliasee directly, the distributed backend
index will not contain the aliasee summary. Handle this in the summary
assembly printer by printing "null" as the aliasee.
Reviewers: davidxl, dexonsmith
Subscribers: mehdi_amini, inglorion, eraman, steven_wu, llvm-commits
Differential Revision: https://reviews.llvm.org/D48699
llvm-svn: 336160
|
|
|
|
| |
llvm-svn: 336159
|
|
|
|
|
|
| |
This should fix llvm.org/PR37944
llvm-svn: 336157
|
|
|
|
|
|
|
|
| |
The verifier identified several modules that were broken due to incorrect
linkage on declarations. To fix this, CompileOnDemandLayer2::extractFunction
has been updated to change decls to external linkage.
llvm-svn: 336150
|