| Commit message (Collapse) | Author | Age | Files | Lines |
mode for fp->int conversion
When we need to do an fp->int conversion using x87 instructions, we need to temporarily change the rounding mode to 0b11 and perform a store. To do this we save the old value of the fpcw to the stack, then set the fpcw to 0xc7f, do the store, then restore fpcw. But the 0xc7f value forces the exception mask bits to 1. While this is what they would be in the default FP environment, as we move to support changing the FP environment, we shouldn't make this assumption.
This patch changes the code to explicitly OR 0xc00 with the old value so that only the rounding mode is changed. Unfortunately, this requires two stack temporaries instead of one: one to hold the old value and one to hold the new value. Without two stack temporaries we would need an additional GPR; we already need one to do the OR operation in. This is similar to what gcc and icc do for this operation, though they are both better at reusing the stack temporaries when there are multiple truncates in a function (or at least in a basic block).
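As a rough illustration of the difference between the two schemes, the following Python sketch models only the control-word arithmetic. The function names and the example control word 0x037E are hypothetical and not taken from the patch; the bit positions follow the x87 control word layout (exception masks in bits 0-5, rounding control in bits 10-11).

```python
# x87 control word layout:
#   bits 0-5   exception mask bits
#   bits 10-11 rounding control (0b11 = round toward zero)
RC_MASK = 0x0C00
EXCEPTION_MASKS = 0x003F


def fpcw_old_scheme(saved_fpcw: int) -> int:
    """Old lowering: ignore the saved value and load the constant 0xC7F."""
    return 0x0C7F


def fpcw_new_scheme(saved_fpcw: int) -> int:
    """New lowering: OR 0xC00 into the saved value, changing only the rounding bits."""
    return saved_fpcw | RC_MASK


# Hypothetical non-default environment: the default 0x037F with the
# invalid-operation exception (bit 0) unmasked by the program.
custom_fpcw = 0x037E

old = fpcw_old_scheme(custom_fpcw)
new = fpcw_new_scheme(custom_fpcw)

print(hex(old & EXCEPTION_MASKS))  # all six mask bits forced back on
print(hex(new & EXCEPTION_MASKS))  # the program's choice is preserved
```

With the constant, the unmasked exception is silently masked again for the duration of the store; with the OR, only the rounding control field differs from the saved value.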
Differential Revision: https://reviews.llvm.org/D57788
llvm-svn: 354178
|
Summary:
Update Flag when generating cc output.
Fixes PR40737.
Reviewers: rnk, nickdesaulniers, craig.topper, spatel
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D58283
llvm-svn: 354163
|
LowerFP_TO_INT. NFCI
These checks aren't needed on the call to FP_TO_INTHelper from the type legalizer for splitting i64. We always want to use X87 FIST/FISTT to memory there.
Moving up the SSE checks will allow this routine to focus on what it cares about and makes its return semantics cleaner.
llvm-svn: 354161
|
It seems there was some problem with using a .mir test. For some reason
doing '-stop-before=codegenprepare' and then '-start-before=codegenprepare'
on the output .mir file results in the NoVRegs Property after instruction
selection.
Recommitting the same test as an .ll file instead.
llvm-svn: 354160
|
shuffle mask (PR40730)
As detailed on PR40730, we are not correctly filling in the lane shuffle mask (D53148/rL344446) - we fill in for the correct src lane but don't add it to the correct mask element, so any reference to the correct element is likely to see an UNDEF mask index.
This allows constant folding to propagate UNDEFs prior to the lane mask being (correctly) lowered to vperm2f128.
This patch fixes the issue by fully populating the lane shuffle mask - this is more than is necessary (if we only filled in the required mask elements we might be able to match other shuffle instructions - broadcasts etc.), but it's the most cautious approach as this needs to be cherry-picked into the 8.0.0 release branch.
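To make "fully populating" concrete, here is a small Python sketch. It is not the X86 lowering code: the helper name, the UNDEF sentinel value, and the 4-element, 2-lane example are assumptions chosen for illustration.

```python
UNDEF = -1  # sentinel for an undefined mask index (illustrative choice)


def full_lane_mask(src_lanes, elts_per_lane):
    """Expand a per-lane source choice into an element mask with no UNDEF entries."""
    mask = []
    for src_lane in src_lanes:
        for elt in range(elts_per_lane):
            # Every element of the destination lane names the matching
            # element of the chosen source lane.
            mask.append(src_lane * elts_per_lane + elt)
    return mask


# Swap the two 128-bit lanes of a vector with four 64-bit elements.
populated = full_lane_mask([1, 0], elts_per_lane=2)

# A hand-written, partially filled mask for the same lane swap, of the
# kind where a later reference to an unfilled element would see UNDEF.
partial = [2, UNDEF, UNDEF, 1]

print(populated)         # [2, 3, 0, 1]
print(UNDEF in partial)  # True
```

A mask with every element defined leaves nothing for constant folding to treat as UNDEF, at the cost of ruling out matches that rely on unused elements.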
Differential Revision: https://reviews.llvm.org/D58237
llvm-svn: 354117
|
Add the opcode for ADDrr / t2ADDrr to the Opcode cache, as we did for
all other opcodes where the handling is otherwise the same between arm
mode and thumb2.
llvm-svn: 354115
|
Just like arm mode, but with different opcodes.
llvm-svn: 354113
|
This patch also introduces the emitAuipcInstPair helper, which is then used
for both emitLoadAddress and emitLoadLocalAddress.
Differential Revision: https://reviews.llvm.org/D55325
Patch by James Clarke.
llvm-svn: 354111
|
Differential Revision: https://reviews.llvm.org/D55279
Patch by James Clarke.
llvm-svn: 354110
|
ConvertTruncs is used to replace a trunc for an AND mask, however
this function wasn't working as expected. By performing the change
later, we can create a wide type integer mask instead of a narrow -1
value, which could then be simply removed (incorrectly). Because we
now perform this action later, it's necessary to cache the trunc type
before we perform the promotion.
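The hazard can be sketched in a few lines of Python. This is an arithmetic model only, with hypothetical helper names and an assumed i8-to-i32 promotion; it is not the pass itself.

```python
def all_ones(bits: int) -> int:
    """The value -1 interpreted as an unsigned integer of the given width."""
    return (1 << bits) - 1


def is_noop_and_mask(mask: int, type_bits: int) -> bool:
    """An AND with the all-ones value of its own type changes nothing and can be folded away."""
    return mask == all_ones(type_bits)


NARROW, WIDE = 8, 32     # assumed types: trunc from i32 to i8
x = 0x12345678
mask = all_ones(NARROW)  # 0xFF

# Built in the narrow type, the mask is -1 and the AND looks removable.
print(is_noop_and_mask(mask, NARROW))  # True
# Built in the wide type, the same bits form a real mask that must stay.
print(is_noop_and_mask(mask, WIDE))    # False
print(hex(x & mask))                   # 0x78, the truncated value
```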
Differential Revision: https://reviews.llvm.org/D57686
llvm-svn: 354108
|
Also use modifiesRegister instead of looping over operands.
llvm-svn: 354098
|
This reverts commit aa0b77d3395dc6ab91647138139c1a15a3aa088d.
This fails to pass the machine verifier:
http://green.lab.llvm.org/green/job/clang-stage1-cmake-RA-expensive/13579/
llvm-svn: 354096
|
Differential Revision: https://reviews.llvm.org/D57811
llvm-svn: 354085
|
This allows targets to specify the minimum alignment required for the
load/store.
llvm-svn: 354071
|
This is basically a pointer typed add, so shouldn't be any different.
This was assuming everything was an SGPR, which is not true.
Also clean up legality for GEP. I don't seem to be seeing the problem
mentioned in the comment on the hack that marks s64 as a legal pointer type.
llvm-svn: 354067
|
Reassociate adds to collect scalar operands in a single
instruction when possible. That will result in a scalar
add followed by vector instead of two vector adds, thus
better utilizing SALU.
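A toy model of the rewrite, in Python. The expression classes and the `uniform` flag (true for values that would live in scalar registers) are assumptions made for this sketch, which only covers the (v + s1) + s2 shape.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Val:
    name: str
    uniform: bool  # True: scalar operand; False: per-lane vector operand


@dataclass(frozen=True)
class Add:
    lhs: object
    rhs: object

    @property
    def uniform(self) -> bool:
        # An add is scalar only if both of its inputs are scalar.
        return self.lhs.uniform and self.rhs.uniform


def count_adds(expr):
    """Return (scalar_adds, vector_adds) for an expression tree."""
    if isinstance(expr, Val):
        return (0, 0)
    ls, lv = count_adds(expr.lhs)
    rs, rv = count_adds(expr.rhs)
    if expr.uniform:
        return (ls + rs + 1, lv + rv)
    return (ls + rs, lv + rv + 1)


def reassociate(expr):
    """Rewrite (v + s1) + s2 as v + (s1 + s2) when s1 and s2 are scalar."""
    if isinstance(expr, Add) and isinstance(expr.lhs, Add):
        inner, s2 = expr.lhs, expr.rhs
        if s2.uniform and not inner.uniform:
            if inner.rhs.uniform:
                return Add(inner.lhs, Add(inner.rhs, s2))
            if inner.lhs.uniform:
                return Add(inner.rhs, Add(inner.lhs, s2))
    return expr


v = Val("v", uniform=False)
s1 = Val("s1", uniform=True)
s2 = Val("s2", uniform=True)

before = Add(Add(v, s1), s2)
after = reassociate(before)

print(count_adds(before))  # (0, 2): two vector adds
print(count_adds(after))   # (1, 1): one scalar add, one vector add
```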
Differential Revision: https://reviews.llvm.org/D58220
llvm-svn: 354066
|
llvm-svn: 354065
|
llvm-svn: 354042
|
Review: Ulrich Weigand
https://reviews.llvm.org/D58240
llvm-svn: 354039
|
Select G_PHI for integers for MIPS32.
Differential Revision: https://reviews.llvm.org/D58183
llvm-svn: 354025
|
Select G_BR and G_BRCOND for MIPS32.
Unconditional branch G_BR does not have register operand,
for that reason we only add tests.
Since conditional branch G_BRCOND compares register to zero on MIPS32,
explicit extension must be performed on i1 condition in order to set
high bits to appropriate value.
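Why the extension matters can be shown with a small Python model of a 32-bit register. It assumes the extension is a zero-extension (an AND with 1) and uses a made-up register value with garbage in the high bits; neither detail is taken from the patch.

```python
MASK32 = 0xFFFFFFFF


def branch_taken_raw(reg: int) -> bool:
    """Compare the whole 32-bit register against zero, as the branch does."""
    return (reg & MASK32) != 0


def zext_i1(reg: int) -> int:
    """Keep only bit 0, clearing bits 1-31 (assumed zero-extension of the i1)."""
    return reg & 1


def branch_taken_extended(reg: int) -> bool:
    return branch_taken_raw(zext_i1(reg))


# i1 condition is false (bit 0 clear), but bits 1-31 hold leftover data.
reg = 0xFFFFFFFE

print(branch_taken_raw(reg))       # True: wrongly taken
print(branch_taken_extended(reg))  # False: follows the i1 value
```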
Differential Revision: https://reviews.llvm.org/D58182
llvm-svn: 354022
|
The Arm peephole optimiser code keeps track of both an MI and a SubAdd that can
be used to optimise away a CMP. In the rare case that both are found and not
ruled-out as valid, we could end up setting the flags on the wrong one.
Instead make sure we are using SubAdd if it exists, as it will be closer to the
CMP.
The testcase here is a little theoretical, with a dead def of cpsr. It should
hopefully show the point.
Differential Revision: https://reviews.llvm.org/D58176
llvm-svn: 354018
|
fistpl/fisttpl when SSE is enabled.
When SSE is enabled fp_to_sint with i16 is blindly promoted to i32, but that changes the behavior of f80 conversion.
Move the promotion of i16 to LowerFP_TO_INT so we can limit it based on the floating point type.
llvm-svn: 354003
|
llvm-svn: 353987
|
Summary:
memset lowering, fix argument types in memcpy lowering, and
test encodings. Depends on D57736.
Reviewers: aheejin
Subscribers: dschuff, sbc100, jgravelle-google, hiraditya, sunfish, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D57791
llvm-svn: 353986
|
Summary:
implements llvm intrinsics and clang intrinsics for
memory.init and data.drop.
Reviewers: aheejin
Subscribers: dschuff, sbc100, jgravelle-google, hiraditya, sunfish, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D57736
llvm-svn: 353983
|
This is a follow up to D48580 and D48581 which allows reserving
arbitrary general purpose registers with the exception of registers
with special purpose (X8, X16-X18, X29, X30) and registers used by LLVM
(X0, X19). This change also generalizes some of the existing logic to
rely entirely on values generated from tablegen.
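Read literally, the exclusion list above leaves the following registers available. The Python sketch below only encodes the two lists from this message; the function name is hypothetical and the real check is driven by tablegen-generated values rather than hard-coded sets.

```python
# Register numbers taken from the exclusion lists in the message above.
SPECIAL_PURPOSE = {8, 16, 17, 18, 29, 30}
USED_BY_LLVM = {0, 19}


def can_reserve(n: int) -> bool:
    """True if Xn may be reserved under the rule described in this message."""
    if not 0 <= n <= 30:
        raise ValueError(f"X{n} is not a general purpose register")
    return n not in SPECIAL_PURPOSE | USED_BY_LLVM


reservable = [f"X{n}" for n in range(31) if can_reserve(n)]
print(reservable)
```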
Differential Revision: https://reviews.llvm.org/D56305
llvm-svn: 353957
|
Same as arm mode, but slightly different opcodes.
llvm-svn: 353938
|
(add, sub)
Try to use 64-bit SLP vectorization. In addition to horizontal instrs
this change triggers optimizations for partial vector operations (for instance,
using low halves of 128-bit registers xmm0 and xmm1 to multiply <2 x float> by
<2 x float>).
Fixes llvm.org/PR32433
llvm-svn: 353923
|
on 64-bit targets to match what happens without avx512.
In 64-bit mode prior to avx512 we use Expand, but with avx512 we need to make f32/f64 conversions Legal so we use Custom and then do our own expansion for f80. But this seems to produce codegen differences relative to avx2. This patch corrects this.
llvm-svn: 353921
|
-Pull the final stack load creation from the two callers into the helper.
-Return a single SDValue instead of a std::pair.
-Remove the Replace flag which isn't really needed.
llvm-svn: 353920
|
Subtargets are a function level property, so ideally we would
eliminate everywhere that needs to check the global one. Rename the
function to try avoiding confusion.
llvm-svn: 353900
|
This was inhibiting inlining of library functions when clang was
invoking the inliner directly. This is covering a bit of a mess with
subtarget feature handling, and this shouldn't be a subtarget
feature. The behavior is different depending on whether you are using
a -mattr flag in clang, or llc, opt.
llvm-svn: 353899
|
Hopefully fixes buildbot problems.
llvm-svn: 353898
|
llvm-svn: 353892
|
llvm-svn: 353889
|
Fix the undefined behaviour introduced by my previous patch r353865 (left
shifting a potentially negative value), which was caught by the bots that run
UBSan.
llvm-svn: 353874
|
Fix for https://bugs.llvm.org/show_bug.cgi?id=39729.
Rather than adding just a case for v8i8 I'm setting cttz to expand
for all vector types.
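"Expand" means the operation is rewritten in terms of other operations instead of being selected directly. One well-known way to express count-trailing-zeros is popcount(~x & (x - 1)); the Python sketch below applies it per element to an assumed v8i8 value. It illustrates the identity only and is not necessarily the sequence the backend emits.

```python
def popcount(x: int) -> int:
    return bin(x).count("1")


def cttz_expanded(x: int, bits: int) -> int:
    """Count trailing zeros as popcount(~x & (x - 1)), limited to `bits` bits."""
    mask = (1 << bits) - 1
    x &= mask
    # ~x & (x - 1) sets exactly the bits below the lowest set bit of x.
    return popcount(~x & (x - 1) & mask)


def cttz_reference(x: int, bits: int) -> int:
    """Straightforward loop, used here only to cross-check the identity."""
    count = 0
    while count < bits and not (x >> count) & 1:
        count += 1
    return count


v8i8 = [0, 1, 2, 8, 0x80, 0xF0, 0x60, 0xFF]
result = [cttz_expanded(e, 8) for e in v8i8]
print(result)  # [8, 0, 1, 3, 7, 4, 5, 0]
```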
Differential Revision: https://reviews.llvm.org/D58008
llvm-svn: 353872
|
isFPImmLegal() has been extended to recognize certain FP immediates that can
be built with VGM (Vector Generate Mask).
These scalar FP immediates (that were previously loaded from the constant
pool) are now selected as VGMF/VGMG in Select().
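As a hedged illustration of which immediates might qualify, the Python sketch below assumes that a generated mask is a single contiguous run of one bits and tests whether an f32 bit pattern has that shape. The helper names are hypothetical, and the actual isFPImmLegal() conditions may be narrower or broader than this check.

```python
import struct


def f32_bits(value: float) -> int:
    """Bit pattern of a value stored as an IEEE-754 single."""
    return struct.unpack("<I", struct.pack("<f", value))[0]


def is_single_run(bits: int, width: int) -> bool:
    """True if `bits` is one contiguous run of 1s within `width` bits."""
    bits &= (1 << width) - 1
    if bits == 0:
        return False
    # Fill the trailing zeros, then add 1 to carry past the run; any bit
    # of the original value that survives lies above the run.
    return (((bits | (bits - 1)) + 1) & bits) == 0


for value in (1.0, 1.5, 3.0):
    print(value, hex(f32_bits(value)), is_single_run(f32_bits(value), 32))
```

Under this assumption 1.0 (0x3F800000) and 1.5 (0x3FC00000) are single runs, while 3.0 (0x40400000) has two separate set bits and would still need another materialization.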
Review: Ulrich Weigand
https://reviews.llvm.org/D58003
llvm-svn: 353867
|
llvm-svn: 353865
|
This teaches the IRTranslator to emit G_BSWAP when it runs into
Intrinsic::bswap. This allows us to select G_BSWAP for non-vector types in
AArch64.
Add a select-bswap.mir test, and add global isel checks to a couple existing
tests in test/CodeGen/AArch64.
This doesn't handle every bswap case, since some of these rely on known bits
stuff. This just lets us handle the naive case.
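For reference, the operation being selected reverses the byte order of a scalar integer. A minimal Python model (the function name is hypothetical, and it mirrors the intrinsic's requirement of an even number of bytes):

```python
def bswap(value: int, bits: int) -> int:
    """Reverse the byte order of a `bits`-wide unsigned integer."""
    if bits % 16 != 0:
        raise ValueError("bswap needs a type with an even number of bytes")
    masked = value & ((1 << bits) - 1)
    return int.from_bytes(masked.to_bytes(bits // 8, "little"), "big")


print(hex(bswap(0x11223344, 32)))  # 0x44332211
print(hex(bswap(0x1122, 16)))      # 0x2211
```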
Differential Revision: https://reviews.llvm.org/D58081
llvm-svn: 353861
|
A more limited version of rL352997 that had to be disabled in rL353198 - allow extension of any 128/256/512 bit vector that at least uses byte sized scalars.
llvm-svn: 353860
|
We could deal with it, but there's no real point.
llvm-svn: 353845
|
a single opcode using memory VT to distinguish. NFC
llvm-svn: 353798
|
MemIntrinsicSDNodes. Use the MemoryVT instead. NFCI
We already have the memory VT, we can just match from that during isel.
llvm-svn: 353797
|
llvm-svn: 353754
|
32-bit SSE targets.
We were using DstTy, but that represents the integer type we are converting to which is i64 in this
case. The FLD is part of an intermediate step to get from the SSE registers to the x87 registers.
If the floating point type is f32, the memory operand should reflect a 4 byte access not an 8 byte
access. The store we used to get from SSE to the stack is using the correct size.
While there, consistently use TheVT in place of Op.getOperand(0).getValueType() throughout the function.
llvm-svn: 353745
|
Add support for
- v4s16 <-> v4s32
- v2s64 <-> v2s32
And update tests that use them to show that we generate the correct
instructions.
Differential Revision: https://reviews.llvm.org/D57832
llvm-svn: 353732
|
The PowerPC code generator currently scalarizes vector truncates that would fit in a vector register, resulting in vector extracts, scalar operations, and vector merges. This patch custom lowers a vector truncate that would fit in a register to a vector shuffle instead.
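The equivalence behind this lowering can be checked with a short Python sketch. It assumes a <4 x i32> to <4 x i16> truncate and a little-endian element layout (a big-endian layout would select the odd indices instead); the function names and shuffle mask are illustrative, not the ones the patch generates.

```python
import struct


def trunc_scalarized(v4i32):
    """Element-by-element truncate: extract, mask, and re-merge."""
    return [x & 0xFFFF for x in v4i32]


def trunc_via_shuffle(v4i32):
    """Reinterpret the register as eight i16 elements and shuffle out the low halves."""
    raw = struct.pack("<4I", *v4i32)      # 128-bit register contents
    v8i16 = struct.unpack("<8H", raw)     # same bits viewed as <8 x i16>
    shuffle_mask = [0, 2, 4, 6]           # low half of each i32 (little-endian)
    return [v8i16[i] for i in shuffle_mask]


vec = [0x11112222, 0xAAAABBBB, 0x00010002, 0xFFFF0000]
print([hex(x) for x in trunc_via_shuffle(vec)])
```

Both functions produce the same elements, but the shuffle form keeps the whole operation in a vector register.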
Differential Revision: https://reviews.llvm.org/D56507
llvm-svn: 353724
|
This teaches the legalizer about G_FFLOOR, and lets us select G_FFLOOR in
AArch64.
It updates the existing floating point tests, and adds a select-floor.mir test.
Differential Revision: https://reviews.llvm.org/D57486
llvm-svn: 353722
|