| Commit message (Collapse) | Author | Age | Files | Lines |
| ... | |
| |
|
|
|
|
| |
users with the APInt class methods. NFCI
llvm-svn: 299248
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
A common way to implement nearbyint is by fiddling with the floating
point environment and calling rint. This is used at least by the BSD
libm and musl. As such, canonicalizing the latter to the former will
create infinite loops for libm and generally pessimize performance, at
least when the generic C versions are used.
This change preserves the rint in the libcall translation and also
handles the domain truncation logic, so that rint with float argument
will be reduced to rintf etc.
llvm-svn: 299247
|
| |
|
|
|
|
|
|
|
|
| |
Add a new node to act as a fancy bitcast from f16 operations to
i32 that implicitly zero the high 16-bits of the result.
Alternatively could try making v2f16 legal and canonicalizing
on build_vectors.
llvm-svn: 299246
|
| |
|
|
|
|
|
|
|
| |
R600 uses higher AS number to access kernel parameters
Fixes: r298846
Differential Revision: https://reviews.llvm.org/D31520
llvm-svn: 299245
|
| |
|
|
|
|
|
|
| |
These are the same tests added for x86 with r299238,
but PPC doesn't specify all branches as cheap, so we
see different patterns in tests with branches.
llvm-svn: 299244
|
| |
|
|
|
|
| |
in the multiword case. Rewrite getHiBits to use the class method version of lshr instead of the one in APIntOps. NFCI
llvm-svn: 299243
|
| |
|
|
|
|
| |
corresponding methods on the APInt object should be used instead. NFC
llvm-svn: 299242
|
| |
|
|
| |
llvm-svn: 299241
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This feature enables folding of logical shift operations of up to 3 places into addressing mode on Kryo and Falkor that have a fastpath LSL.
Reviewers: mcrosier, rengolin, t.p.northover
Subscribers: junbuml, gberry, llvm-commits, aemerson
Differential Revision: https://reviews.llvm.org/D31113
llvm-svn: 299240
|
| |
|
|
| |
llvm-svn: 299238
|
| |
|
|
|
|
|
| |
This wasn't covering for the case where you have multiple latches and hence
the use of the same loop-id which needs to be mapped to the same loop-id.
llvm-svn: 299237
|
| |
|
|
| |
llvm-svn: 299235
|
| |
|
|
|
|
|
|
|
|
| |
the immediate encodings the frontend uses based on the _MM_HINT_T0/T1 constant values in clang's headers.
Our _MM_HINT_T0/T1 constant values are 3/2 which matches gcc, but not icc or Intel documentation. Interestingly gcc had this same bug on their implementation of the gather/scatter builtins at one point too.
Fixes PR32411.
llvm-svn: 299234
|
| |
|
|
|
|
| |
Requested on post commit code review.
llvm-svn: 299232
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: Currently the VP metadata was dropped when InstCombine converts a call to direct call. This patch converts the VP metadata to branch_weights so that its hotness is recorded.
Reviewers: eraman, davidxl
Reviewed By: davidxl
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D31344
llvm-svn: 299228
|
| |
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D27264
llvm-svn: 299227
|
| |
|
|
| |
llvm-svn: 299224
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Implementation of TargetInstrInfo::findCommutedOpIndices for MIPS target,
restricting commutativity to second and third operand only for
dpaadd_[su].df instructions therein.
Prior to this change, there were cases where the vector that is to be added
to the dot product of the other two could take a position other than the
first one in the instruction, generating false output in the destination
vector.
Such behavior has been noticed in the two functions generating v2i64 output
values so far. Other ones may exhibit such behavior as well, just not for
the vector operands which are present in the test at the moment.
Tests altered so that the function's first operand is a constant splat so
that it can be loaded with a ldi instruction, since that is the case in
which the erroneous instruction operand placement has occurred. We check
that the register which is present in the ldi instruction is placed as the
first operand in the corresponding dpadd instruction.
Patch by Stefan Maksimovic.
Differential Revision: https://reviews.llvm.org/D30827
llvm-svn: 299223
|
| |
|
|
| |
llvm-svn: 299222
|
| |
|
|
|
|
|
|
| |
ASHR and INSERT_VECTOR_ELT
Followup to D31311
llvm-svn: 299221
|
| |
|
|
|
|
|
|
|
| |
Since LOCR only accepts GR32 virtual registers, its operands must be copied
into this regclass in insertSelect(), when an LOCR is built. Otherwise, the
case where the source operand was GRX32 will produce invalid IR.
Review: Ulrich Weigand
llvm-svn: 299220
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently ComputeNumSignBits returns the minimum number of sign bits for all elements of vector data, when we may only be interested in one/some of the elements.
This patch adds a DemandedElts argument that allows us to specify the elements we actually care about. The original ComputeNumSignBits implementation calls with a DemandedElts demanding all elements to match current behaviour. Scalar types set this to 1.
I've only added support for BUILD_VECTOR and EXTRACT_VECTOR_ELT so far, all others will default to demanding all elements but can be updated in due course.
Followup to D25691.
Differential Revision: https://reviews.llvm.org/D31311
llvm-svn: 299219
|
| |
|
|
| |
llvm-svn: 299218
|
| |
|
|
|
|
| |
This will be used to avoid various call to basename in the asan tests.
llvm-svn: 299216
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
Even on older subtargets that lack vector support, there may be vector values
with just one element in the input program. These are converted during DAG
legalization to scalar values.
The pre-legalize SystemZ DAGCombiner methods should in this circumstance not
touch these nodes. This patch adds a check for this in
SystemZTargetLowering::combineEXTRACT_VECTOR_ELT().
Review: Ulrich Weigand
llvm-svn: 299213
|
| |
|
|
|
|
|
| |
Based on post-commit review comments by Chandler Carruth on
https://reviews.llvm.org/D31236. Thanks!
llvm-svn: 299211
|
| |
|
|
|
|
|
|
|
|
|
| |
This fixes tests that do things like
mkdir <dir>
cd <dir>
..
<cmd> *.foo
llvm-svn: 299209
|
| |
|
|
| |
llvm-svn: 299207
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The patch rL298481 was reverted due to crash on clang-with-lto-ubuntu build.
The reason of the crash was type mismatch between either a or b and RHS in the following situation:
LHS = sext(a +nsw b) > RHS.
This is quite rare, but still possible situation. Normally we need to cast all {a, b, RHS} to their widest type.
But we try to avoid creation of new SCEV that are not constants to avoid initiating recursive analysis that
can take a lot of time and/or cache a bad value for iterations number. To deal with this, in this patch we
reject this case and will not try to analyze it if the type of sum doesn't match with the type of RHS. In this
situation we don't need to create any non-constant SCEVs.
This patch also adds an assertion to the method IsProvedViaContext so that we could fail on it and not
go further into range analysis etc (because in some situations these analyzes succeed even when the passed
arguments have wrong types, what should not normally happen).
The patch also contains a fix for a problem with too narrow scope of the analysis caused by wrong
usage of predicates in recursive invocations.
The regression test on the said failure: test/Analysis/ScalarEvolution/implied-via-addition.ll
Reviewers: reames, apilipenko, anna, sanjoy
Reviewed By: sanjoy
Subscribers: mzolotukhin, mehdi_amini, llvm-commits
Differential Revision: https://reviews.llvm.org/D31238
llvm-svn: 299205
|
| |
|
|
| |
llvm-svn: 299203
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously compiler often extracted common immediates into specific register, e.g.:
```
%vreg0 = S_MOV_B32 0xff;
%vreg2 = V_AND_B32_e32 %vreg0, %vreg1
%vreg4 = V_AND_B32_e32 %vreg0, %vreg3
```
Because of this SDWA peephole failed to find SDWA convertible pattern. E.g. in previous example this could be converted into 2 SDWA src operands:
```
SDWA src: %vreg2 src_sel:BYTE_0
SDWA src: %vreg4 src_sel:BYTE_0
```
With this change peephole check if operand is either immediate or register that is copy of immediate.
llvm-svn: 299202
|
| |
|
|
|
|
|
|
|
|
| |
computeKnownBitsForTargetNode
Follow up to D25691, this sets up the plumbing necessary to support vector demanded elements support in known bits calculations in target nodes.
Differential Revision: https://reviews.llvm.org/D31249
llvm-svn: 299201
|
| |
|
|
| |
llvm-svn: 299197
|
| |
|
|
| |
llvm-svn: 299195
|
| |
|
|
| |
llvm-svn: 299194
|
| |
|
|
|
|
|
| |
Adding some test-cases demonstrating cases that need to be improved.
To be followed by patches that improve these cases.
llvm-svn: 299189
|
| |
|
|
|
|
|
|
|
|
| |
APIntOps::isShiftedMask is.
Did you know that 0 is a shifted mask? But 0x0000ff00 and 0x000000ff aren't? At least we get 0xff000000 right.
I only see one usage of this function in the code base today and its in InstCombine. I think its protected against 0 being misreported as a mask. I guess we just don't have tests for the missed cases.
llvm-svn: 299187
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Triggered by commit r298620: "[LV] Vectorize GEPs".
If we encounter a vector GEP with scalar arguments, we splat the scalar
into a vector of appropriate size before we scatter the argument.
Reviewers: arsenm, mehdi_amini, bkramer
Reviewed By: arsenm
Subscribers: bjope, mssimpso, wdng, llvm-commits
Differential Revision: https://reviews.llvm.org/D31416
llvm-svn: 299186
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Currently Go binding only has SetCurrentDebugLocation method.
I added GetCurrentDebugLocation method to IRBuilder instance.
I added this because I want to save current debug location, change debug location temporary and restore the saved one finally.
This is useful when source location jumps and goes back after while LLVM IR generation.
I also added tests for this to ir_test.go.
I confirmed that all test passed with this patch based on r298890
Patch by Ryuichi Hayashida!
Differential Revision: https://reviews.llvm.org/D31415
llvm-svn: 299185
|
| |
|
|
| |
llvm-svn: 299184
|
| |
|
|
| |
llvm-svn: 299183
|
| |
|
|
| |
llvm-svn: 299182
|
| |
|
|
| |
llvm-svn: 299180
|
| |
|
|
| |
llvm-svn: 299179
|
| |
|
|
| |
llvm-svn: 299177
|
| |
|
|
| |
llvm-svn: 299172
|
| |
|
|
|
|
| |
http://bb.pgr.jp/builders/i686-mingw32-RA-on-linux/builds/25073/steps/build_llvmclang/logs/stdio
llvm-svn: 299171
|
| |
|
|
| |
llvm-svn: 299169
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
InputFiles. NFCI.
Introduce symbol table data structures that can be potentially written to
disk, have the LTO library build those data structures using temporarily
constructed modules and redirect the LTO library implementation to go through
those data structures. This allows us to remove the LLVMContext and Modules
owned by InputFile.
With this change I measured a peak memory consumption decrease from 5.4GB to
2.8GB in a no-op incremental ThinLTO link of Chromium on Linux. The impact on
memory consumption is larger in COFF linkers where we are currently forced
to materialize all metadata in order to read linker options. Peak memory
consumption linking a large piece of Chromium for Windows with full LTO and
debug info decreases from >64GB (OOM) to 15GB.
Part of PR27551.
Differential Revision: https://reviews.llvm.org/D31364
llvm-svn: 299168
|
| |
|
|
|
|
| |
calling mem*/str* inside libFuzzer itself
llvm-svn: 299167
|