This patch adds support for combining patterns such as (FMUL(FADD(1.0, x), y)) and (FMUL(FSUB(x, 1.0), y)) to their FMA equivalents.
This is particularly useful for linear interpolation patterns such as (FADD(FMUL(x, t), FMUL(y, FSUB(1.0, t)))).
Differential Revision: http://reviews.llvm.org/D13003
llvm-svn: 248210
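The algebra behind these combines can be checked with a small sketch. The `fma` helper below is a hypothetical stand-in (not an LLVM API), and exact rationals are used so that both sides can be compared for equality; with real floating-point values the fused and unfused forms may round differently. The FMA forms shown are one plausible rewriting, assumed for illustration rather than taken from the patch.

```python
from fractions import Fraction

def fma(a, b, c):
    # Hypothetical stand-in for a fused multiply-add: a * b + c.
    # With Fraction inputs the result is exact, so no rounding is modelled.
    return a * b + c

def lerp_naive(x, y, t):
    # FADD(FMUL(x, t), FMUL(y, FSUB(1.0, t)))
    return x * t + y * (1 - t)

def lerp_fma(x, y, t):
    # y * (1 - t) == fma(-t, y, y); the outer add then folds into a second fma.
    return fma(x, t, fma(-t, y, y))

x, y, t = Fraction(3, 7), Fraction(-5, 2), Fraction(1, 3)

# FMUL(FADD(1.0, x), y) == fma(x, y, y)
assert (1 + x) * y == fma(x, y, y)
# FMUL(FSUB(x, 1.0), y) == fma(x, y, -y)
assert (x - 1) * y == fma(x, y, -y)
# The interpolation pattern needs two fused operations instead of four.
assert lerp_naive(x, y, t) == lerp_fma(x, y, t)
```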
|
The vext pseudo-instruction takes the number of elements that need to be
extracted, not the number of bytes. Hence, use the number of elements
directly instead of scaling them with a factor.
Reviewers: Silviu Baranga, James Molloy
(not reflected in the differential revision)
Differential Revision: http://reviews.llvm.org/D12974
llvm-svn: 248208
|
Based on feedback for D13003.
llvm-svn: 248206
|
We're currently losing any fast-math flags when synthesizing fcmps for
min/max reductions. In LV, make sure we copy over the scalar inst's
flags. In LoopUtils, we know we only ever match patterns with
hasUnsafeAlgebra, so apply that to any synthesized ops.
llvm-svn: 248201
|
Because mod is always exact, this function should never have taken a rounding mode argument. The actual implementation still has issues, which I'll look at resolving in a subsequent patch.
llvm-svn: 248195
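To illustrate why no rounding mode is needed, here is a minimal sketch that uses Python's `math.fmod` as a stand-in for a floating-point mod operation (it is not the function changed by this commit). The helper `exact_remainder` is hypothetical and computes the remainder over exact rationals; the check is that the floating-point result matches it bit for bit, so nothing is ever rounded.

```python
import math
from fractions import Fraction

def exact_remainder(a, b):
    # Exact remainder over rationals, with the sign of the dividend
    # (the C fmod convention). Fraction(float) converts without rounding.
    fa, fb = abs(Fraction(a)), abs(Fraction(b))
    q = fa // fb               # truncated quotient of the magnitudes
    r = fa - q * fb
    return -r if a < 0 else r

samples = [(5.5, 2.0), (-7.25, 2.0), (1e20, 3.0), (0.1, 0.03)]
for a, b in samples:
    # The float result equals the exact rational remainder.
    assert Fraction(math.fmod(a, b)) == exact_remainder(a, b)
```

The remainder is always smaller in magnitude than the divisor and is a multiple of the divisor's unit in the last place, which is why it fits in the same format without rounding.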
|
llvm-svn: 248190
|
The definition of the DivergenceAnalysis pass was in a CPP
file, so users of the analysis could not get it
through "getAnalysis<>()".
This patch extracts the definition into a separate header that
users of the analysis can include to fetch the results.
Patch by Volkan Keles (vkeles@apple.com)
llvm-svn: 248186
|
This fixes problems where two nodes have persistent debug id 0 assigned.
llvm-svn: 248182
|
evaluate whether 'readonly' or 'readnone' apply to a given function.
This both reduces indentation and will make it easy to share the logic
with a new pass manager implementation.
llvm-svn: 248181
|
The ISD::FPOW and ISD::FSINCOS opcodes default to Legal, but there
is no legal instruction for those on SystemZ. This could cause
LLVM internal errors. Fixed by setting the operation action to
Expand for those opcodes.
Also added test cases for all other LLVM IR intrinsics that should
generate a library call. (Those already work correctly since the
default operation action is fine.)
llvm-svn: 248180
|
This was committed without the code review (http://reviews.llvm.org/D12937) being approved.
This reverts commit r248152.
llvm-svn: 248174
|
llvm-svn: 248172
|
This is more efficient for cases like D12965 where we already have widths.
llvm-svn: 248170
|
If storing multiple FP constants, some subset of the stores
would be replaced with integers due to visit order, so
MergeConsecutiveStores would only partially merge
these.
llvm-svn: 248169
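As background for why an FP constant store can be replaced by an integer store, the sketch below shows that a 32-bit float and an integer with the same bit pattern produce identical bytes in memory, and that two adjacent 32-bit stores are equivalent to one 64-bit store. Little-endian layout and the helper name `float_bits` are assumptions made for illustration; this is not the MergeConsecutiveStores code.

```python
import struct

def float_bits(value):
    # Bit pattern of an IEEE-754 single-precision float, as an unsigned integer.
    return int.from_bytes(struct.pack("<f", value), "little")

# Storing 1.0f writes the same bytes as storing the integer 0x3F800000.
assert float_bits(1.0) == 0x3F800000
assert struct.pack("<f", 1.0) == struct.pack("<I", 0x3F800000)

# Two consecutive float stores (1.0f, then 2.0f) match a single 64-bit
# integer store of the concatenated bit patterns.
merged = (float_bits(2.0) << 32) | float_bits(1.0)
assert struct.pack("<ff", 1.0, 2.0) == struct.pack("<Q", merged)
```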
|
llvm-svn: 248168
|
llvm-svn: 248166
|
No functional change intended.
Patch by Haicheng Wu <haicheng@codeaurora.org>!
http://reviews.llvm.org/D12887
PR24522
llvm-svn: 248164
|
Currently, the availability of DSP instructions (ACLE 6.4.7) is handled in a
hand-rolled tricky condition block in tools/clang/lib/Basic/Targets.cpp, with
a FIXME: attached.
This patch changes the handling of +t2dsp to be in line with other
architecture extensions.
Following review comments, also updating the description of FeatureDSPThumb2
in ARM.td.
Differential Revision: http://reviews.llvm.org/D12937
llvm-svn: 248152
|
Differential Revision: http://reviews.llvm.org/D12524
llvm-svn: 248147
|
Summary:
Also tightened up the test and made a trivial fix to prevent double-newline
after emitting .cpsetup directives.
Reviewers: vkalintiris
Subscribers: seanbruno, emaste, llvm-commits
Differential Revision: http://reviews.llvm.org/D12956
llvm-svn: 248143
|
extra times. NFC
llvm-svn: 248140
|
coding standards. NFC
llvm-svn: 248136
|
instead. NFC
llvm-svn: 248135
|
llvm-svn: 248131
|
Whitespace-only change.
llvm-svn: 248130
|
Added tests for intrinsics and encoding.
Differential Revision: http://reviews.llvm.org/D12593
llvm-svn: 248121
|
add scalar FP to Int conversion with truncation intrinsics
add scalar conversion FP32 from/to FP64 intrinsics
add rounding mode and SAE mode encoding for these intrinsics
Differential Revision: http://reviews.llvm.org/D12665
llvm-svn: 248117
|
Added tests for intrinsics and encoding.
Differential Revision: http://reviews.llvm.org/D12102
llvm-svn: 248116
|
Differential Revision: http://reviews.llvm.org/D12931
llvm-svn: 248115
|
and avx512
The operation action for i32 and i64 cannot be set to legal, as long double
needs custom lowering.
Patch by: mitch.l.bodart@intel.com
Differential Revision: http://reviews.llvm.org/D12372
llvm-svn: 248114
|
vshufi32x4
Added tests for intrinsics.
Differential Revision: http://reviews.llvm.org/D12525
llvm-svn: 248113
|
Only changes comments.
llvm-svn: 248112
|
vinserti64x4, vinserti64x2, vinserti32x8, vinserti32x4, vinsertf64x4, vinsertf64x2, vinsertf32x8, vinsertf32x4
Added tests for encoding, lowering and intrinsics.
Differential Revision: http://reviews.llvm.org/D11893
llvm-svn: 248111
|
clang-format a line which was poorly formatted. NFC.
llvm-svn: 248110
|
Because -indvars widens induction variables through arithmetic,
`NeverNegative` cannot be a property of the `WidenIV` (a `WidenIV`
manages information for all transitive uses of an IV being widened,
including uses of `-1 * IV`). Instead it must live on `NarrowIVDefUse`
which manages information for a specific def-use edge in the transitive
use list of an induction variable.
This change also adds a test case that demonstrates the problem with
r248045.
llvm-svn: 248107
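A small sketch of the problem described above, using 8-bit values widened to 16 bits. The helpers `sext` and `zext` are hypothetical and operate on raw bit patterns; they are not the IndVarSimplify code. The point is that a non-negative IV does not make a derived use such as `-1 * IV` non-negative, so the property has to be tracked per def-use edge.

```python
def zext(bits, narrow, wide):
    # Zero extension: keep the low `narrow` bits, upper bits are zero.
    return bits & ((1 << narrow) - 1)

def sext(bits, narrow, wide):
    # Sign extension: replicate the top bit of the narrow value.
    bits &= (1 << narrow) - 1
    if bits >> (narrow - 1):
        bits |= ((1 << wide) - 1) & ~((1 << narrow) - 1)
    return bits

iv = 5                      # a non-negative i8 induction variable value
neg_iv = (-1 * iv) & 0xFF   # the transitive use -1 * IV, wrapped to i8

# For the IV itself, both extensions agree.
assert sext(iv, 8, 16) == zext(iv, 8, 16) == 0x0005
# For the derived use they do not, so "never negative" cannot be
# assumed for every use reached from the IV.
assert sext(neg_iv, 8, 16) == 0xFFFB
assert zext(neg_iv, 8, 16) == 0x00FB
```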
|
Now that we have fast vector CTPOP implementations, we can use them to speed up vector CTTZ using the pattern (cttz(x) = ctpop((x & -x) - 1)).
Additionally, for AVX512CD, which provides lzcnt instructions, we can use the pattern (cttz_undef(x) = (width - 1) - ctlz(x & -x)).
Differential Revision: http://reviews.llvm.org/D12663
llvm-svn: 248091
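Both patterns can be checked on scalars with a short sketch. The function names and the explicit `width` parameter are assumptions for illustration; fixed-width wraparound is emulated with a mask because Python integers are unbounded.

```python
def ctpop(x, width):
    # Population count of the low `width` bits.
    return bin(x & ((1 << width) - 1)).count("1")

def ctlz(x, width):
    # Count of leading zero bits in a `width`-bit value (returns width for 0).
    x &= (1 << width) - 1
    return width - x.bit_length()

def cttz_via_ctpop(x, width):
    # cttz(x) = ctpop((x & -x) - 1); yields `width` when x == 0.
    mask = (1 << width) - 1
    lowest = x & -x & mask          # isolate the lowest set bit
    return ctpop((lowest - 1) & mask, width)

def cttz_undef_via_ctlz(x, width):
    # cttz_undef(x) = (width - 1) - ctlz(x & -x); only meaningful for x != 0.
    mask = (1 << width) - 1
    return (width - 1) - ctlz(x & -x & mask, width)

assert cttz_via_ctpop(0b10100, 8) == 2
assert cttz_undef_via_ctlz(0b10100, 8) == 2
assert cttz_via_ctpop(0, 8) == 8    # the ctpop form also covers zero
```

`x & -x` leaves only the lowest set bit, 2^k. Subtracting one turns it into k one-bits, which ctpop counts; alternatively, ctlz of 2^k is (width - 1) - k, which gives the second form.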
|
Use the SimplifyDemandedVectorEltsLow helper function introduced in D12680.
llvm-svn: 248089
|
getCFGStructurizerRegClass is not used for SI, so
move it into R600 specific stuff.
llvm-svn: 248087
|
llvm-svn: 248086
|
Differential Revision: http://reviews.llvm.org/D12924
llvm-svn: 248084
|
llvm-svn: 248082
|
(icmp eq (ashr C1, %V) -1) may have multiple answers if C1 is not a
power of two and has the sign bit set.
This fixes PR24873.
llvm-svn: 248074
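The ambiguity can be seen by enumerating shift amounts for 8-bit constants. This is a minimal sketch with hypothetical helper names, not the InstCombine code; it relies on Python's `>>` being an arithmetic shift for negative integers.

```python
def ashr(value, shift, width=8):
    # Interpret `value` as a signed `width`-bit integer and shift arithmetically.
    value &= (1 << width) - 1
    if value >> (width - 1):
        value -= 1 << width
    return value >> shift

def shifts_giving_minus_one(c1, width=8):
    # All shift amounts V for which (ashr C1, V) == -1.
    return [v for v in range(width) if ashr(c1, v, width) == -1]

# 0x80 is a power of two with the sign bit set: exactly one answer.
assert shifts_giving_minus_one(0x80) == [7]
# 0xFD (-3) has the sign bit set but is not a power of two: many answers.
assert shifts_giving_minus_one(0xFD) == [2, 3, 4, 5, 6, 7]
```

Because several values of %V satisfy the comparison in the second case, the compare cannot be folded to an equality test against a single shift amount.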
|
correctly
llvm-svn: 248073
|
Summary:
If an induction variable is provably non-negative, its sign extension is
equal to its zero extension. This means narrow uses like
icmp slt iNarrow %indvar, %rhs
can be widened into
icmp slt iWide zext(%indvar), sext(%rhs)
Reviewers: atrick, mcrosier, hfinkel
Subscribers: hfinkel, reames, llvm-commits
Differential Revision: http://reviews.llvm.org/D12745
llvm-svn: 248045
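The equivalence claimed in the summary can be checked exhaustively for a narrow type of 8 bits and a wide type of 16 bits. The helpers below are hypothetical models of `zext`, `sext` and `icmp slt` on raw bit patterns, written for illustration only.

```python
def to_signed(bits, width):
    # Reinterpret a `width`-bit pattern as a signed integer.
    bits &= (1 << width) - 1
    return bits - (1 << width) if bits >> (width - 1) else bits

def zext(bits, narrow, wide):
    return bits & ((1 << narrow) - 1)

def sext(bits, narrow, wide):
    return to_signed(bits, narrow) & ((1 << wide) - 1)

def icmp_slt(a, b, width):
    return to_signed(a, width) < to_signed(b, width)

# For every non-negative i8 indvar and every i8 rhs, the widened compare
# (zext on the indvar, sext on the rhs) gives the same result.
for indvar in range(0, 128):
    for rhs in range(256):
        narrow = icmp_slt(indvar, rhs, 8)
        wide = icmp_slt(zext(indvar, 8, 16), sext(rhs, 8, 16), 16)
        assert narrow == wide

# With a negative indvar the rewrite is wrong: -1 < 0, but 255 < 0 is false.
assert icmp_slt(0xFF, 0, 8) is True
assert icmp_slt(zext(0xFF, 8, 16), sext(0, 8, 16), 16) is False
```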
|
In if-conversion, there is a utility function MergeBlocks() that is used to merge blocks. However, when new edges are built in this function the edge weight is either not provided or not updated properly, leading to a modified CFG with incorrect edge weights. This patch corrects this issue.
Differential Revision: http://reviews.llvm.org/D12513
llvm-svn: 248030
|
later as that's all that is tested right now.
Fixes PR24858.
llvm-svn: 248027
|
FindAvailableLoadedValue()'s parameter MaxInstsToScan. (Complete version of r247497. See D12886)
llvm-svn: 248022
|
At least...a little bit.
llvm-svn: 248020
|
scaled by a probability to avoid precision issue.
In ARMBaseInstrInfo::isProfitableToIfCvt(), there is a simple cost model in which the number of cycles is scaled by a probability to estimate the cost. However, when the number of cycles is small (which is usually the case), precision is lost in the computation. To avoid this issue, this patch scales those cycles by 1024 (chosen to make the multiplication a little faster) before they are scaled by the probability. Other variables are also scaled up for the final comparison.
Differential Revision: http://reviews.llvm.org/D12742
llvm-svn: 248018
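The precision loss can be illustrated with a minimal sketch. It assumes the cost is computed in integer arithmetic with the probability given as a numerator/denominator pair; the function names and sample numbers are hypothetical and are not taken from ARMBaseInstrInfo.

```python
SCALE = 1024  # scaling factor named in the commit message

def cost_unscaled(cycles, prob_num, prob_den):
    # Integer arithmetic: small cycle counts truncate toward zero.
    return cycles * prob_num // prob_den

def cost_scaled(cycles, prob_num, prob_den):
    # Scale the cycles up first so the fractional part survives the division.
    return cycles * SCALE * prob_num // prob_den

# Two candidates that a small-cycle cost model cannot tell apart:
# 1 cycle taken with probability 3/8 versus 2 cycles with probability 1/8.
assert cost_unscaled(1, 3, 8) == 0
assert cost_unscaled(2, 1, 8) == 0

# After scaling, the first is correctly seen as more expensive.
assert cost_scaled(1, 3, 8) == 384
assert cost_scaled(2, 1, 8) == 256
```

Any quantity compared against a scaled cost has to be multiplied by the same factor, which matches the note that other variables are scaled up for the final comparison.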
|
They mostly clutter the output while it is still possible to see which
node has multiple users without them.
Differential Revision: http://reviews.llvm.org/D12569
llvm-svn: 248013
|