| Commit message (Collapse) | Author | Age | Files | Lines |
| ... | |
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
If we only match build vectors, we can miss some patterns
that use shuffles as seen in the affected tests.
Note that the underlying calls within getSplatSourceVector()
have the potential for compile-time explosion because of
exponential recursion looking through binop opcodes, but
currently the list of supported opcodes is very limited.
Both of those problems should be addressed in follow-up
patches.
llvm-svn: 358984
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
When an LCSSA phi survives through instruction selection, the pass
ends up removing that phi entirely because it is dominated by the
logic that does the lanemask merging.
This then used to trigger an assertion when processing a dependent
phi instruction.
Change-Id: Id4949719f8298062fe476a25718acccc109113b6
Reviewers: llvm-commits
Subscribers: kzhuravl, jvesely, wdng, yaxunl, t-tye, tpr, dstuttard, rtaylor, arsenm
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D60999
llvm-svn: 358983
|
| |
|
|
|
|
|
|
|
|
|
| |
Converting InlineCost interface and its internals into CallBase usage.
Inliners themselves are still not converted.
Reviewed By: reames
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D60636
llvm-svn: 358982
|
| |
|
|
|
|
|
|
|
|
|
| |
The check for creating CBZ in constant island pass recently obtained the
ability to search backwards to find a Cmp instruction. The code in IfCvt should
mirror this to allow more conversions to the smaller form. The common code has
been pulled out into a separate function to be shared between the two places.
Differential Revision: https://reviews.llvm.org/D60090
llvm-svn: 358977
|
| |
|
|
|
|
|
|
|
|
| |
Ifcvt can replicate instructions as it converts them to be predicated. This
stops that from happening on thumb2 targets at minsize where an extra IT
instruction is likely needed.
Differential Revision: https://reviews.llvm.org/D60089
llvm-svn: 358974
|
| |
|
|
| |
llvm-svn: 358970
|
| |
|
|
| |
llvm-svn: 358969
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
The DAGCombiner is rewriting (canonicalizing) an ISD::ADD
with no common bits set in the operands as an ISD::OR node.
This could sometimes result in "missing out" on some
combines that normally are performed for ADD. To be more
specific this could happen if we already have rewritten an
ADD into OR, and later (after legalizations or combines)
we expose patterns that could have been optimized if we
had seen the OR as an ADD (e.g. reassociations based on ADD).
To make the DAG combiner less sensitive to if ADD or OR is
used for these "no common bits set" ADD/OR operations we
now apply most of the ADD combines also to an OR operation,
when value tracking indicates that the operands have no
common bits set.
Reviewers: spatel, RKSimon, craig.topper, kparzysz
Reviewed By: spatel
Subscribers: arsenm, rampitec, lebedev.ri, jvesely, nhaehnle, hiraditya, javed.absar, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D59758
llvm-svn: 358965
|
| |
|
|
|
|
|
|
|
|
|
| |
This patch provides intrinsics support for Memory Tagging Extension (MTE),
which was introduced with the Armv8.5-a architecture.
The intrinsics are described in detail in the latest
ACLE Q1 2019 documentation: https://developer.arm.com/docs/101028/latest
Reviewed by: David Spickett
Differential Revision: https://reviews.llvm.org/D60486
llvm-svn: 358963
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Add missing D and Q lane VLDSTLane lowering
for fp16 elements.
Reviewers: efriedma, kosarev, SjoerdMeijer, ostannard
Reviewed By: efriedma
Subscribers: javed.absar, kristof.beyls, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D60874
llvm-svn: 358962
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
About the compressed sections spec says:
(https://docs.oracle.com/cd/E37838_01/html/E36783/section_compression.html)
sh_addralign fields of the section header for a compressed section
reflect the requirements of the compressed section.
Currently, llvm-mc always puts uncompressed section alignment to sh_addralign.
It is not correct. zlib styled section contains an Elfxx_Chdr header,
so we should either use 4 or 8 values depending on the target
(Uncompressed section alignment is stored in ch_addralign field of the compression header).
GNU assembler version 2.31.1 also has this issue,
but in 2.32.51 it was already fixed. This is how it was found
during debugging of the https://bugs.llvm.org/show_bug.cgi?id=40482
actually.
Differential revision: https://reviews.llvm.org/D60965
llvm-svn: 358960
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
In some circumstances we can end up with setup costs that are very complex to
compute, even though the scevs are not very complex to create. This can also
lead to setupcosts that are calculated to be exactly -1, which LSR treats as an
invalid cost. This patch puts a limit on the recursion depth for setup cost to
prevent them taking too long.
Thanks to @reames for the report and test case.
Differential Revision: https://reviews.llvm.org/D60944
llvm-svn: 358958
|
| |
|
|
|
|
|
|
|
|
|
| |
This change partially reverts https://reviews.llvm.org/D54647 in favor
of bailing out during computeAddress instead.
This catches the condition earlier and handles more cases.
Differential Revision: https://reviews.llvm.org/D60986
llvm-svn: 358948
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This reverts r358910 (git commit 2b744665308fc8d30a3baecb4947f2bd81aa7d30)
While this patch *seems* trivial and safe and correct, it is not. The
copies are actually load bearing copies. You can observe this with MSan
or other ways of checking for use-after-destroy, but otherwise this may
result in ... difficult to debug inexplicable behavior.
I suspect the issue is that the debug location is used after the
original reference to it is removed. The metadata backing it gets
destroyed as its last references goes away, and then we reference it
later through these const references.
llvm-svn: 358940
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently to opt in to debug_names in DWARFv5, the IR must contain
'nameTableKind: Default' which also enables debug_pubnames.
Instead, only allow one of {debug_names, apple_names, debug_pubnames,
debug_gnu_pubnames}.
nameTableKind: Default gives debug_names in DWARFv5 and greater,
debug_pubnames in v4 and earlier - and apple_names when tuning for lldb
on MachO.
nameTableKind: GNU always gives gnu_pubnames
llvm-svn: 358931
|
| |
|
|
|
|
|
|
|
|
| |
This was supposed to be NFC, but the change in SDLoc
definitions causes instruction scheduling changes.
There's nothing x86-specific in this code, and it can
likely be used from DAGCombiner's simplifyVBinOp().
llvm-svn: 358930
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
- Only apply packed literal `op_sel_hi` skipping on operands requiring
packed literals. Even an instruction is `packed`, it may have operand
requiring non-packed literal, such as `v_dot2_f32_f16`.
Reviewers: rampitec, arsenm, kzhuravl
Subscribers: jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D60978
llvm-svn: 358922
|
| |
|
|
|
|
|
|
|
|
|
|
| |
If we have a store to a piece of memory which is known constant, then we know the store must be storing back the same value. As a result, the store (or memset, or memmove) must either be down a dead path, or a noop. In either case, it is valid to simply remove the store.
The motivating case for this involves a memmove to a buffer which is constant down a path which is dynamically dead.
Note that I'm choosing to implement the less aggressive of two possible semantics here. We could simply say that the store *is undefined*, and prune the path. Consensus in the review was that the more aggressive form might be a good follow on change at a later date.
Differential Revision: https://reviews.llvm.org/D60659
llvm-svn: 358919
|
| |
|
|
|
|
|
|
| |
from InstCombine
In the process, use the existing masked.load combine which is slightly stronger, and handles a mix of zero and undef elements in the mask.
llvm-svn: 358913
|
| |
|
|
| |
llvm-svn: 358910
|
| |
|
|
|
|
|
|
|
|
| |
These are inserted after branch relaxation, and for some reason it's
decided to put them in the long branch expansion block. It's probably
not great to rely on the source block address, so this should probably
be switched to being PC relative instead of relying on the block
address
llvm-svn: 358909
|
| |
|
|
|
|
|
|
|
| |
Back in August, r340525 introduced a dependency on the assumption
cache tracker in the ipsccp pass, but that commit missed a call to
INITIALIZE_PASS_DEPENDENCY, which leaves the assumption cache
improperly registered if SCCP is the only thing that pulls it in.
llvm-svn: 358903
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
LoopSimplify)
Currently, we do not expose BPI to loop passes at all. In the old pass manager, we appear to have been ignoring the fact that LCSSA and/or LoopSimplify didn't preserve BPI, and making it available to the following loop passes anyways. In the new one, it's invalidated before running any loop pass if either LCSSA or LoopSimplify actually make changes. If they don't make changes, then BPI is valid and available. So, we go ahead and teach LCSSA and LoopSimplify how to preserve BPI for consistency between old and new pass managers.
This patch avoids an invalidation between the two requires in the following trivial pass pipeline:
opt -passes="requires<branch-prob>,loop(no-op-loop),requires<branch-prob>"
(when the input file is one which requires either LCSSA or LoopSimplify to canonicalize the loops)
Differential Revision: https://reviews.llvm.org/D60790
llvm-svn: 358901
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
to CallInst.
The issue was raised here: https://reviews.llvm.org/D60903#1472783
The function Instruction::updateProfWeight is only used for CallInst in
profile update. From the current interface, it is very easy to think that
the function can also be used for branch instruction. However, Branch
instruction does't need the scaling the function provides for
branch_weights and VP (value profile), in addition, scaling may introduce
inaccuracy for branch probablity.
The patch moves the function updateProfWeight from Instruction class to
CallInst to remove the confusion. The patch also changes the scaling of
branch_weights from a loop to a block because we know that ProfileData
for branch_weights of CallInst will only have two operands at most.
Differential Revision: https://reviews.llvm.org/D60911
llvm-svn: 358900
|
| |
|
|
| |
llvm-svn: 358894
|
| |
|
|
| |
llvm-svn: 358892
|
| |
|
|
| |
llvm-svn: 358891
|
| |
|
|
|
|
|
| |
Effectively reverts r356956. The check for isFullCopy was excessive,
but there still needs to be a check that this is a copy.
llvm-svn: 358890
|
| |
|
|
|
|
|
|
|
|
| |
See bug 41156: https://bugs.llvm.org/show_bug.cgi?id=41156
Reviewers: artem.tamazov, arsenm
Differential Revision: https://reviews.llvm.org/D60624
llvm-svn: 358888
|
| |
|
|
|
|
|
|
|
|
|
|
| |
This patch adds support for BigBitWidth -> SmallBitWidth bitcasts, splitting the DemandedBits/Elts accordingly.
The AMDGPU backend needed an extra (srl (and x, c1 << c2), c2) -> (and (srl(x, c2), c1) combine to encourage BFE creation, I investigated putting this in DAGCombine but it caused a lot of noise on other targets - some improvements, some regressions.
The X86 changes are all definite wins.
Differential Revision: https://reviews.llvm.org/D60462
llvm-svn: 358887
|
| |
|
|
| |
llvm-svn: 358886
|
| |
|
|
| |
llvm-svn: 358884
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: Add a getter and setter pair for floating-point accuracy metadata.
Reviewers: whitequark, deadalnix
Reviewed By: whitequark
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D60527
llvm-svn: 358883
|
| |
|
|
|
|
|
|
|
|
|
| |
This patch enables passing options to SimpleLoopUnswitch via the passes pipeline.
Reviewers: chandlerc, fedor.sergeev, leonardchan, philip.pfaffe
Reviewed By: fedor.sergeev
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D60676
llvm-svn: 358880
|
| |
|
|
|
|
|
|
|
|
| |
This reverts commit 7bf4d7c07f2fac862ef34c82ad0fef6513452445.
After thinking about this more, this isn't right, the range is not exact
in the same sense as makeExactICmpRegion(). This needs a separate
function.
llvm-svn: 358876
|
| |
|
|
|
|
|
|
| |
Following D60632 makeGuaranteedNoWrapRegion() always returns an
exact nowrap region. Rename the function accordingly. This is in
line with the naming of makeExactICmpRegion().
llvm-svn: 358875
|
| |
|
|
|
|
| |
not enabled. Same for 256 bit and AVX.
llvm-svn: 358872
|
| |
|
|
| |
llvm-svn: 358869
|
| |
|
|
|
|
|
|
|
|
| |
Section atoms are not sorted, so we need to scan the whole section to find the
start address.
No test case: Found by inspection, and any reproduction would depend on pointer
ordering.
llvm-svn: 358865
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
llvm-undname used to put '\x' in front of every pair of nibbles, but
u"\xD7\xFF" produces a string with 6 bytes: \xD7 \0 \xFF \0 (and \0\0). Correct
for a single character (plus terminating \0) is u\xD7FF instead.
Now, wchar_t, char16_t, and char32_t strings roundtrip from source to
clang-cl (and cl.exe) and then llvm-undname.
(...at least as long as it's not a string like L"\xD7FF" L"foo" which
gets demangled as L"\xD7FFfoo", where the compiler then considers the
"f" as part of the hex escape. That seems ok.)
Also add a comment saying that the "almost-valid" char32_t string I
added in my last commit is actually produced by compilers.
llvm-svn: 358857
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If a unsigned with all 4 bytes non-0 was passed to outputHex(), there
were two off-by-ones in it:
- Both MaxPos and Pos left space for the final \0, which left the buffer
one byte to small. Set MaxPos to 16 instead of 15 to fix.
- The `assert(Pos >= 0);` was after a `Pos--`, move it up one line.
Since valid Unicode codepoints are <= 0x10ffff, this could never really
happen in practice.
Found by oss-fuzz.
llvm-svn: 358856
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add support for uadd_sat and friends to ConstantRange, so we can
handle uadd.sat and friends in LVI. The implementation is forwarding
to the corresponding APInt methods with appropriate bounds.
One thing worth pointing out here is that the handling of wrapping
ranges is not maximally accurate. A simple example is that adding 0
to a wrapped range will return a full range, rather than the original
wrapped range. The tests also only check that the non-wrapping
envelope is correct and minimal.
Differential Revision: https://reviews.llvm.org/D60946
llvm-svn: 358855
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
ConstantRanges have an annoying special case: If upper and lower are
the same, it can be either an empty or a full set. When constructing
constant ranges nearly always a full set is intended, but this still
requires an explicit check in many places.
This revision adds a getNonEmpty() constructor that disambiguates this
case: If upper and lower are the same, a full set is created.
Differential Revision: https://reviews.llvm.org/D60947
llvm-svn: 358854
|
| |
|
|
| |
llvm-svn: 358852
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This does two main things, firstly adding some at least basic addressing modes
for i64 types, and secondly treats floats and doubles sensibly when there is no
fpu. The floating point change can help codesize in some cases, especially with
D60294.
Most backends seems to not consider the exact VT in isLegalAddressingMode,
instead switching on type size. That is now what this does when the target does
not have an fpu (as the float data will be loaded using LDR's). i64's currently
use the address range of an LDRD (even though they may be legalised and loaded
with an LDR). This is at least better than marking them all as illegal
addressing modes.
I have not attempted to do much with vectors yet. That will need changing once
MVE is added.
Differential Revision: https://reviews.llvm.org/D60677
llvm-svn: 358845
|
| |
|
|
|
|
| |
instructions.
llvm-svn: 358844
|
| |
|
|
| |
llvm-svn: 358843
|
| |
|
|
|
|
| |
fpclass only has a single use.
llvm-svn: 358841
|
| |
|
|
|
|
|
|
| |
The error check required FDEs to refer to the most recent CIE, but the eh-frame
spec allows them to refer to any previously seen CIE. This patch removes the
offending check.
llvm-svn: 358840
|
| |
|
|
| |
llvm-svn: 358838
|