| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
|
|
|
|
|
|
| |
There's a possible missing fold here for extracting from the
same source vector. It's similar to a check that we use to
squash a build vector with all extracted elements from the
same source vector.
llvm-svn: 361778
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
- The current implementation simplifies the case where the source of
`copyto` is defined by `implicit-def`. However, it only works when that
`implicit-def` has a single use, since the detection starts from the
`implicit-def` and cannot determine which destination vreg should be
used if there are multiple uses.
- This patch moves that detection to the point where `copyto` is emitted. If
that `copyto`'s source is defined by `implicit-def`, it is simplified.
Hence, it works even when that `implicit-def` has multiple uses.
- Apart from simplifying the internal IR, this won't improve the quality of
code generation. However, it helps to detect `implicit-def` in a
straightforward manner in some passes, such as `si-i1-copies`. A test
case is added.
Reviewers: sunfish, nhaehnle
Subscribers: jvesely, hiraditya, asbirlea, llvm-commits, yaxunl
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D62342
llvm-svn: 361777
|
| |
|
|
| |
llvm-svn: 361776
|
| |
|
|
|
|
|
|
|
|
| |
AArch64AsmBackend.cpp was not using any APIs from AArch64.h, and was
only including it for transitive dependencies. Doing so is problematic
from an include-what-you-use perspective, but it is also a layering issue
(it creates a dependency cycle between the primary AArch64 target
library and the MCTargetDesc library).
llvm-svn: 361774
|
| |
|
|
|
|
| |
The DemandedElts variable is pretty much inert at the moment - the original GetDemandedBits implementation calls it with an 'all ones' DemandedElts value so the function is active and behaves exactly as it used to.
llvm-svn: 361773
|
| |
|
|
|
|
| |
Fixes a large number of warnings in the scan-build report on llvm builds.
llvm-svn: 361772
|
| |
|
|
|
|
|
|
| |
commit:
1a8b2ea611cf4ca7cb09562e0238cfefa27c05b5 Divergence driven ISel. Assign register class for cross block values according to the divergence.
llvm-svn: 361770
|
| |
|
|
|
|
|
|
|
|
| |
See bug 40820: https://bugs.llvm.org/show_bug.cgi?id=40820
Reviewers: artem.tamazov, arsenm
Differential Revision: https://reviews.llvm.org/D61017
llvm-svn: 361763
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
for.outer:
br for.inner
for.inner:
LI <loop invariant load instruction>
for.inner.latch:
br for.inner, for.outer.latch
for.outer.latch:
br for.outer, for.outer.exit
LI is a loop-invariant load instruction that post-dominates for.outer, so LI should be movable out of the loop nest. However, there is a bug in allLoopPathsLeadToBlock().
Current algorithm of allLoopPathsLeadToBlock():
1. Get all the transitive predecessors of the basic block LI belongs to (for.inner) ==> for.outer, for.inner.latch
2. If any successor of any of the predecessors is neither for.inner nor one of for.inner's predecessors, return false
3. Return true
for.inner.latch is one of for.inner's predecessors, but for.inner dominates for.inner.latch, which means that if for.inner.latch is ever executed, for.inner must have been executed as well. The algorithm should not return false for cases like this.
Author: Whitney (committed by xingxue)
Reviewers: kbarton, jdoerfert, Meinersbur, hfinkel, fhahn
Reviewed By: jdoerfert
Subscribers: hiraditya, jsji, llvm-commits, etiotto, bmahjour
Tags: #LLVM
Differential Revision: https://reviews.llvm.org/D62418
llvm-svn: 361762
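The three steps in the summary above can be sketched on a toy control-flow graph. This is a minimal model, not LLVM's implementation: blocks are plain strings, dominance is computed by brute-force reachability, and the names `allPathsLeadTo`, `transitivePreds`, `dominates`, `reachableWithout` and `exampleNest` are hypothetical. The `skipDominated` flag models one plausible reading of the fix, namely ignoring predecessors that the block dominates.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

using Block = std::string;
using CFG = std::map<Block, std::vector<Block>>; // block -> successors

// Blocks reachable from `from` when `removed` is deleted from the graph.
std::set<Block> reachableWithout(const CFG &cfg, const Block &from,
                                 const Block &removed) {
  std::set<Block> seen;
  std::vector<Block> work;
  if (from != removed) {
    seen.insert(from);
    work.push_back(from);
  }
  while (!work.empty()) {
    Block cur = work.back();
    work.pop_back();
    auto it = cfg.find(cur);
    if (it == cfg.end())
      continue;
    for (const Block &succ : it->second)
      if (succ != removed && seen.insert(succ).second)
        work.push_back(succ);
  }
  return seen;
}

// `a` dominates `b` if `b` cannot be reached from `entry` once `a` is removed.
bool dominates(const CFG &cfg, const Block &entry, const Block &a,
               const Block &b) {
  return a == b || reachableWithout(cfg, entry, a).count(b) == 0;
}

// Step 1: transitive predecessors of `bb`, without walking past the header.
std::set<Block> transitivePreds(const CFG &cfg, const Block &header,
                                const Block &bb) {
  std::set<Block> preds;
  std::vector<Block> work{bb};
  while (!work.empty()) {
    Block cur = work.back();
    work.pop_back();
    for (const auto &entry : cfg) {
      const Block &pred = entry.first;
      bool isPred = false;
      for (const Block &succ : entry.second)
        isPred = isPred || succ == cur;
      if (!isPred || pred == bb || !preds.insert(pred).second)
        continue;
      if (pred != header)
        work.push_back(pred);
    }
  }
  return preds;
}

// Steps 2 and 3. With skipDominated == false this follows the algorithm as
// the summary describes it; with true, predecessors dominated by `bb` are
// ignored (an assumption about the shape of the fix).
bool allPathsLeadTo(const CFG &cfg, const Block &header, const Block &bb,
                    bool skipDominated) {
  if (bb == header)
    return true;
  const std::set<Block> preds = transitivePreds(cfg, header, bb);
  for (const Block &pred : preds) {
    if (skipDominated && dominates(cfg, header, bb, pred))
      continue;
    for (const Block &succ : cfg.at(pred))
      if (succ != bb && preds.count(succ) == 0)
        return false;
  }
  return true;
}

// The loop nest from the summary.
CFG exampleNest() {
  return {{"for.outer", {"for.inner"}},
          {"for.inner", {"for.inner.latch"}},
          {"for.inner.latch", {"for.inner", "for.outer.latch"}},
          {"for.outer.latch", {"for.outer", "for.outer.exit"}}};
}
```

On `exampleNest()`, calling `allPathsLeadTo(cfg, "for.outer", "for.inner", false)` rejects the block because for.inner.latch has the successor for.outer.latch, while passing `true` accepts it.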
|
| |
|
|
|
|
| |
Add blank line.
llvm-svn: 361761
|
| |
|
|
|
|
|
| |
We never actually use the Offsets produced by ComputeValueVTs, so remove
them until we need them.
llvm-svn: 361755
|
| |
|
|
|
|
|
|
| |
This is problematic on buildbots, as discussed here: https://reviews.llvm.org/rL361356
It seems like the plan already was to revert, but that hasn't happened yet.
llvm-svn: 361746
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Demangler::parse() for MD5 names would:
1. Put all remaining text into the MD5 name sight unseen
2. Not modify MangledName
This meant that if the demangler recursively called parse() (e.g. in
demangleLocallyScopedNamePiece()), every recursive call that started on
an MD5 name would add all remaining bytes to the output buffer but
only advance the input by a byte. For valid inputs, MD5 types are
never (well, see comments for 2 exceptions) nested, but for invalid
input this could cause memory use quadratic in the input size.
llvm-svn: 361744
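The quadratic growth described above can be seen in a toy cost model. This is not the demangler itself: `bytesEmitted` is a hypothetical function that treats every input position as the start of an MD5 name, and `consumeName` models the assumption that a corrected parser advances the input past whatever it emitted.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Counts how many bytes end up in the output buffer for a given input.
std::size_t bytesEmitted(const std::string &input, bool consumeName) {
  std::size_t emitted = 0;
  std::size_t pos = 0;
  while (pos < input.size()) {
    std::size_t remaining = input.size() - pos;
    // The whole rest of the input is copied out as the "name".
    emitted += remaining;
    // Old behaviour: advance by one byte. Assumed fix: skip what was emitted.
    pos += consumeName ? remaining : 1;
  }
  return emitted;
}
```

For a 1000-byte input the one-byte advance emits 1000 + 999 + ... + 1 = 500500 bytes, while consuming the name emits 1000.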
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The code to preserve LCSSA PHIs currently only properly supports
reduction PHIs and PHIs for values defined outside the latches.
This patch improves the LCSSA PHI handling to cover PHIs for values
defined in the latches.
Fixes PR41725.
Reviewers: efriedma, mcrosier, davide, jdoerfert
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D61576
llvm-svn: 361743
|
| |
|
|
|
|
|
|
|
|
|
| |
The variables in BTF DataSec type encode in-section offset.
R_BPF_NONE should be generated instead of R_BPF_64_32.
Signed-off-by: Yonghong Song <yhs@fb.com>
Differential Revision: https://reviews.llvm.org/D62460
llvm-svn: 361742
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
values according to the divergence.
Details: To make instruction selection truly divergence driven, it is necessary to assign
the correct register classes to the cross-block values beforehand. For divergent targets, the
same value type requires different register classes depending on the value's divergence.
Reviewers: rampitec, nhaehnle
Differential Revision: https://reviews.llvm.org/D59990
This commit was reverted because of a build failure.
The cause was a malformed patch.
The build failure has been fixed.
llvm-svn: 361741
|
| |
|
|
|
|
|
|
| |
This fixes a problem where back-pressure increases caused by register
dependencies were not correctly notified if execution was also delayed by memory
dependencies.
llvm-svn: 361740
|
| |
|
|
|
|
|
|
| |
SimplifyDemandedBits. NFCI.
Prep work before adding demanded elts support.
llvm-svn: 361739
|
| |
|
|
|
|
| |
Will be used in an upcoming patch but I've updated the original implementation to call this to ensure test coverage.
llvm-svn: 361738
|
| |
|
|
|
|
|
|
| |
CriticalRegDep has been renamed CriticalDependency, and it is now used by class
Instruction to store information about the critical register dependency and the
critical memory dependency. No functional change intended.
llvm-svn: 361737
|
| |
|
|
|
|
|
|
| |
They caused the sanitizer builds to fail.
My suspicion is the change to countLeadingZeros().
llvm-svn: 361736
|
| |
|
|
|
|
| |
Reuses what we already have in place for ISD::ZERO_EXTEND_VECTOR_INREG just with a different sentinel
llvm-svn: 361734
|
| |
|
|
|
|
|
|
| |
These 3 variables cause quite a few warnings in the scan-build report on llvm.
........
Revert accidental commit.
llvm-svn: 361732
|
| |
|
|
|
|
| |
These 3 variables cause quite a few warnings in the scan-build report on llvm.
llvm-svn: 361731
|
| |
|
|
|
|
|
| |
This was found/reduced from a fuzzer report:
https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=14956
llvm-svn: 361729
|
| |
|
|
|
|
| |
doesn't trigger
llvm-svn: 361728
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Rather than gating on "isSwitchDense" (resulting in necessarily
sparse lookup tables even when they were generated), always run
this quite cheap transform.
This transform is useful not just for generating tables.
LowerSwitch also wants this: read LowerSwitch.cpp:257.
Be careful to not generate worse code, by introducing a
SubThreshold heuristic.
Instead of just sorting by signed, generalize the finding of the
best base.
And now that it is run unconditionally, do not replicate its
functionality in SwitchToLookupTable (which could use a Sub
when having a hole is smaller, hence the SubThreshold
heuristic located in a single place).
This simplifies SwitchToLookupTable, and fixes
some ugly corner cases due to the use of signed numbers,
such as a table containing i16 32767 and 32768, of which
32768 would be interpreted as -32768, and now the code thinks
the table is size 65536.
(We still use unconditional subtraction when building a single-register mask,
but I think this whole block should go when the more general sparse
map is added, which doesn't leave empty holes in the table.)
And the reason test4 and test5 did not trigger was documented wrong:
it was because they were not considered sufficiently "dense".
Also, fix generation of invalid LLVM-IR: shl by bit-width.
llvm-svn: 361727
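The signed-number corner case mentioned above can be illustrated with the i16 values 32767 and 32768, which sit on either side of the signed wrap-around point. The helpers below (`asSigned16`, `signedSpan`, `unsignedSpan`) are hypothetical and only compute how many table slots a base-plus-offset table would need; they are not taken from SimplifyCFG.

```cpp
#include <cassert>
#include <cstdint>

// Value of a 16-bit pattern when read as a signed (two's complement) number.
std::int32_t asSigned16(std::uint16_t bits) {
  return bits >= 32768 ? static_cast<std::int32_t>(bits) - 65536
                       : static_cast<std::int32_t>(bits);
}

// Number of table slots needed to cover both case values when they are
// ordered as signed numbers.
std::uint32_t signedSpan(std::uint16_t a, std::uint16_t b) {
  std::int32_t sa = asSigned16(a), sb = asSigned16(b);
  std::int32_t lo = sa < sb ? sa : sb;
  std::int32_t hi = sa < sb ? sb : sa;
  return static_cast<std::uint32_t>(hi - lo) + 1;
}

// The same, with the case values ordered as unsigned numbers.
std::uint32_t unsignedSpan(std::uint16_t a, std::uint16_t b) {
  std::uint32_t lo = a < b ? a : b;
  std::uint32_t hi = a < b ? b : a;
  return hi - lo + 1;
}
```

Here `signedSpan(32767, 32768)` is 65536 while `unsignedSpan(32767, 32768)` is 2, which is why choosing the base independently of signedness matters for adjacent values like these.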
|
| |
|
|
|
|
|
|
|
|
| |
and replace with an equivalent countTrailingZeros.
GCD is much more expensive than this, with repeated division.
This depends on D60823
llvm-svn: 361726
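A rough sketch of why counting trailing zeros can stand in for a GCD, assuming only the power-of-two part of the common divisor is needed (the commit text does not spell this out). The largest power of two dividing every value is given by the trailing zero count of the bitwise OR of the values, which needs no division. `trailingZeros` and `commonShift` are hypothetical helpers, not LLVM functions.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Portable trailing-zero count; returns 64 for zero.
unsigned trailingZeros(std::uint64_t v) {
  if (v == 0)
    return 64;
  unsigned n = 0;
  while ((v & 1) == 0) {
    v >>= 1;
    ++n;
  }
  return n;
}

// Largest shift s such that 2^s divides every value in `values`.
unsigned commonShift(const std::vector<std::uint64_t> &values) {
  std::uint64_t bits = 0;
  for (std::uint64_t v : values)
    bits |= v; // the lowest set bit of the OR is the lowest set bit of any value
  return trailingZeros(bits);
}
```

For the values 8, 24 and 40 the GCD is 8, and `commonShift` returns 3, the matching shift amount.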
|
| |
|
|
|
|
|
|
|
| |
This matches countLeadingOnes() and countTrailingOnes(), and
APInt's countLeadingZeros() and countTrailingZeros().
(as well as __builtin_clzll())
llvm-svn: 361724
|
| |
|
|
|
|
|
|
| |
The implementations in ValueTracking and ConstantRange are equally
powerful; reuse the one in ConstantRange, which will make this easier
to extend.
llvm-svn: 361723
|
| |
|
|
|
|
|
|
|
| |
Extract method to compute overflow based on binop and signedness,
and then make the result handling code generic. This extends the
always-overflow handling to signed muls, but currently has no effect,
as we don't compute always overflow for them (thus NFC).
llvm-svn: 361721
|
| |
|
|
|
|
|
| |
Instead pass binary op and signedness. The extra enum only makes
things more complicated in this case.
llvm-svn: 361720
|
| |
|
|
|
|
|
|
| |
This adds a pattern for fma, similar to the float and double patterns.
Differential Revision: https://reviews.llvm.org/D62330
llvm-svn: 361719
|
| |
|
|
|
|
|
|
|
| |
This adds patterns for fp16 round, ceil, etc., the same as the float and double
patterns.
Differential Revision: https://reviews.llvm.org/D62326
llvm-svn: 361718
|
| |
|
|
|
|
|
|
|
| |
Promote a number of fp16 math intrinsics to float, so that the relevant float
math routines can be used. Copysign is expanded so as to be handled in-place.
Differential Revision: https://reviews.llvm.org/D62325
llvm-svn: 361717
|
| |
|
|
|
|
|
|
|
|
| |
original vector
We were only testing for direct SETCC results - this allows us to peek through AND/OR/XOR combinations of the comparison results as well.
There's a missing SEXT(PACKSS) fold that I need to investigate for v8i1 cases before I can enable it there as well.
llvm-svn: 361716
|
| |
|
|
|
|
|
|
| |
This adds a pattern for the fabs intrinsic, the same as float and double.
Differential Revision: https://reviews.llvm.org/D62324
llvm-svn: 361715
|
| |
|
|
|
|
|
|
| |
This adds a pattern for the sqrt intrinsic, the same as float and double.
Differential Revision: https://reviews.llvm.org/D62322
llvm-svn: 361714
|
| |
|
|
|
|
|
|
| |
Promote fp16 frem operations on ARM to floats so they call fmodf.
Differential Revision: https://reviews.llvm.org/D62321
llvm-svn: 361713
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: PR41688
Reviewers: spatel, efriedma, craig.topper, hfinkel, reames
Reviewed By: hfinkel
Subscribers: javed.absar, dmgreen, fhahn, hfinkel, reames, nikic, lebedev.ri, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D61409
llvm-svn: 361707
|
| |
|
|
|
|
|
|
| |
shift(build_vector(),C)
Commonly occurs in sign-extension cases
llvm-svn: 361706
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: Allow for retrieving an object file corresponding to an architecture-specific slice in a Mach-O universal binary file.
Reviewers: whitequark, deadalnix
Reviewed By: whitequark
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D60378
llvm-svn: 361705
|
| |
|
|
|
|
|
|
|
| |
If we have a known non-nan operand, place it in the second operand
of fmin/fmax that is returned if either operand is nan.
Differential Revision: https://reviews.llvm.org/D62448
llvm-svn: 361704
|
| |
|
|
|
|
|
|
|
| |
Adds support for the uadd.sat family of intrinsics in LVI, based on
ConstantRange methods from D60946.
Differential Revision: https://reviews.llvm.org/D62447
llvm-svn: 361703
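As an illustration of what a range computation for unsigned saturating add looks like, here is a simplified sketch on 8-bit values. It assumes inclusive, non-wrapping ranges, which is a deliberate simplification of LLVM's ConstantRange, and the names `Range`, `uaddSat` and `uaddSatRange` are hypothetical. Because saturating addition never decreases when either operand increases, the result range is bounded by the saturated sums of the lower and upper bounds.

```cpp
#include <cassert>
#include <cstdint>

// Unsigned saturating add on 8-bit values: clamps at 255 instead of wrapping.
std::uint8_t uaddSat(std::uint8_t a, std::uint8_t b) {
  unsigned sum = static_cast<unsigned>(a) + static_cast<unsigned>(b);
  return static_cast<std::uint8_t>(sum > 255 ? 255 : sum);
}

// Inclusive range [lo, hi] with lo <= hi (no wrap-around, by assumption).
struct Range {
  std::uint8_t lo;
  std::uint8_t hi;
};

// Range of uaddSat(x, y) for x in `x` and y in `y`.
Range uaddSatRange(Range x, Range y) {
  return {uaddSat(x.lo, y.lo), uaddSat(x.hi, y.hi)};
}
```

For example, adding [200, 250] and [10, 100] gives [210, 255]: the lower bound is exact and the upper bound saturates.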
|
| |
|
|
|
|
|
|
|
|
| |
The guaranteed no-wrap region is never empty, it always contains at
least zero, so these optimizations don't ever apply.
To make this more obviously true, replace the conservative return
in makeGNWR with an assertion.
llvm-svn: 361698
|
| |
|
|
|
|
|
|
|
|
| |
The test based on PR42010:
https://bugs.llvm.org/show_bug.cgi?id=42010
...may show an inaccuracy for PPC's target defs, but we should not
be so aggressive with an assert here. There's no telling what out-of-tree
targets look like.
llvm-svn: 361696
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
In LVI, calculate the range of extractvalue(op.with.overflow(%x, %y), 0)
as the range of op(%x, %y). This is mainly useful in conjunction with
D60650: If the result of the operation is extracted in a branch guarded
against overflow, then the value of %x will be appropriately constrained
and the result range of the operation will be calculated taking that
into account.
Differential Revision: https://reviews.llvm.org/D60656
llvm-svn: 361693
|
| |
|
|
| |
llvm-svn: 361692
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
Support LEA64_32r properly.
INC/DEC is really a special case of a more generic issue. We should also turn leas into add reg/reg or add reg/imm regardless of the slow lea flags.
This also supports LEA64_32 which has 64 bit input registers and 32 bit output registers. So we need to convert the 64 bit inputs to their 32 bit equivalents to check if they are equal to base reg.
One thing to note, the original code preserved the kill flags by adding operands to the new instruction instead of using addReg. But I think tied operands aren't supposed to have the kill flag set. I dropped the kill flags, but I could probably try to preserve it in the add reg/reg case if we think it's important. Not sure which operand it's supposed to go on for the LEA64_32r instruction due to the super reg implicit uses. Though I'm also not sure those are needed since they were probably just created by an INSERT_SUBREG from a 32-bit input.
Differential Revision: https://reviews.llvm.org/D61472
llvm-svn: 361691
|
| |
|
|
|
|
|
|
|
|
| |
models. Add 256-bit fp xor to sandybridge zero idioms
This copies the Sandy Bridge zero idiom support to later CPUs, adding the AVX2 and AVX512F/VL instructions as appropriate.
Differential Revision: https://reviews.llvm.org/D62360
llvm-svn: 361690
|