| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
|
|
|
| |
llvm-svn: 288009
|
|
|
|
|
|
| |
I don't think isel selects these today, favoring adding the register to itself instead. But the load folding tables shouldn't be so concerned with what isel will use and just represent the relationships.
llvm-svn: 288007
|
|
|
|
| |
llvm-svn: 288004
|
|
|
|
|
|
|
|
|
|
| |
instruction's load size is smaller than the register size.
If we were to unfold these, the load size would be increased to the register size. This is not safe to do since the enlarged load can do things like cross a page boundary into a page that doesn't exist.
I probably missed some instructions, but this should be a large portion of them.
llvm-svn: 288001
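
The page-boundary concern above is plain address arithmetic. The sketch below is illustrative only: the 4096-byte page size and the helper name `crossesPage` are assumptions, not taken from the LLVM sources. It shows how an 8-byte load near the end of a page stays inside that page while a widened 16-byte load at the same address does not.

```cpp
#include <cassert>
#include <cstdint>

// Assumed page size for illustration; real systems may differ.
constexpr uint64_t kPageSize = 4096;

// Returns true if a load of `size` bytes starting at `addr` touches more
// than one page. `crossesPage` is a hypothetical helper, not an LLVM API.
inline bool crossesPage(uint64_t addr, uint64_t size) {
  assert(size > 0 && "a load must read at least one byte");
  uint64_t firstPage = addr / kPageSize;
  uint64_t lastPage = (addr + size - 1) / kPageSize;
  return firstPage != lastPage;
}
```

With these assumptions, `crossesPage(0x1FF8, 8)` is false and `crossesPage(0x1FF8, 16)` is true, which is the case where unfolding would turn a safe narrow load into one that can fault.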
|
|
|
|
| |
llvm-svn: 287995
|
|
|
|
|
|
|
|
| |
instructions that don't have a restriction.
Most of these are the SSE4.1 PMOVZX/PMOVSX instructions which all read less than 128 bits. The only other was PMOVUPD which by definition is an unaligned load.
llvm-svn: 287991
|
|
|
|
| |
llvm-svn: 287975
|
|
|
|
|
|
| |
folding tables.
llvm-svn: 287974
|
|
|
|
|
|
| |
tables.
llvm-svn: 287972
|
|
|
|
| |
llvm-svn: 287970
|
|
|
|
|
|
|
|
| |
tables for consistency.
Not sure this is truly needed but we had the floating point equivalents, the aligned equivalents, and the EVEX equivalents. So this just makes it complete.
llvm-svn: 287960
|
|
|
|
|
|
| |
alphabetical order. This is consistent with the older sections of the table. NFC
llvm-svn: 287956
|
|
|
|
| |
llvm-svn: 287937
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We did not support subregs in InlineSpiller::foldMemoryOperand() because targets
may not deal with them correctly.
This adds a target hook to let the spiller know that a target can handle
subregs, and actually enables it for x86 for the case of stack slot reloads.
This fixes PR30832.
Differential Revision: https://reviews.llvm.org/D26521
llvm-svn: 287792
|
|
|
|
|
|
|
|
|
|
| |
size as i128mem. Change all uses to use the i64mem version.
I'm sure this caused the load size to misprint in Intel syntax output. We were also inconsistent about which patterns used which instruction between VEX and EVEX.
There are two different reg/reg versions of movq, one from a GPR and one from the lower 64 bits of an XMM register. This changes the load folding table to use the single i64mem memory form for folding both cases. But we need to use TB_NO_REVERSE to prevent a duplicate entry in the unfolding table.
llvm-svn: 287622
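
A minimal sketch of why a no-reverse flag prevents the duplicate follows. It does not reproduce the real X86 tables: the `FoldEntry` struct, the `kNoReverse` constant (a stand-in for TB_NO_REVERSE), and the opcode names used with it are all hypothetical. Two register forms fold to the same memory form, and only the entry without the flag is recorded in the unfolding direction.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for TB_NO_REVERSE.
constexpr uint16_t kNoReverse = 1;

// One row of a simplified fold table: register form -> memory form.
struct FoldEntry {
  std::string RegOp;
  std::string MemOp;
  uint16_t Flags;
};

struct FoldTables {
  std::map<std::string, std::string> RegToMem; // folding direction
  std::map<std::string, std::string> MemToReg; // unfolding direction
};

// Build both directions, skipping flagged rows in the unfolding map so
// that each memory form maps back to exactly one register form.
inline FoldTables buildTables(const std::vector<FoldEntry> &Entries) {
  FoldTables T;
  for (const FoldEntry &E : Entries) {
    assert(!E.RegOp.empty() && !E.MemOp.empty());
    T.RegToMem[E.RegOp] = E.MemOp;
    if (!(E.Flags & kNoReverse))
      T.MemToReg.emplace(E.MemOp, E.RegOp);
  }
  return T;
}
```

Which of the two register forms carries the flag is a choice made in the real table; the sketch only shows the mechanism.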
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
VPERMI2(B/W/D/Q/PS/PD).
Summary:
The index and one of the table operands can be swapped by changing the opcode to the other version. Neither of these operands is the one that can load from memory so this can't be used to increase memory folding opportunities.
We need to handle the unmasked forms and the kz forms. Since the load operand isn't being commuted we can commute the load and broadcast instructions too.
Reviewers: igorb, delena, Ayal, Farhana, RKSimon
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D25652
llvm-svn: 287621
|
|
|
|
|
|
|
|
|
| |
This seems to fix PR30992.
- HasAVX512 ? X86::VMOVAPSZ128rm_NOVLX
+ HasAVX512 ? X86::VMOVUPSZ128rm_NOVLX
llvm-svn: 287532
|
|
|
|
|
|
| |
SSE and AVX.
llvm-svn: 287523
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
some bytes
We can replace "scalar" FP-bitwise-logic with other forms of bitwise-logic instructions.
Scalar SSE/AVX FP-logic instructions only exist in your imagination and/or the bowels of
compilers, but logically equivalent int, float, and double variants of bitwise-logic
instructions are reality in x86, and the float variant may be a shorter instruction
depending on which flavor (SSE or AVX) of vector ISA you have...so just prefer float all
the time.
This is a preliminary step towards solving PR6137:
https://llvm.org/bugs/show_bug.cgi?id=6137
Differential Revision:
https://reviews.llvm.org/D26712
llvm-svn: 287122
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
vcvtpd2dq/vcvttpd2dq/vcvtpd2ps and similar instructions.
- Don't print the 'x' suffix for the 128-bit reg/mem VEX encoded instructions in Intel syntax. This is consistent with the EVEX versions.
- Don't print the 'y' suffix for the 256-bit reg/reg VEX encoded instructions in Intel or AT&T syntax. This is consistent with the EVEX versions.
- Allow the 'x' and 'y' suffixes to be used for the reg/mem forms when we're assembling using Intel syntax.
- Allow the 'x' and 'y' suffixes on the reg/reg EVEX encoded instructions in Intel or AT&T syntax. This is consistent with what VEX was already allowing.
This should fix at least some of PR28850.
llvm-svn: 286787
|
|
|
|
|
|
|
|
|
| |
represents a relocatable immediate.", with a fix for 32-bit x86.
Teach X86InstrInfo::analyzeCompare() not to crash on CMP and SUB instructions
that take a global address operand.
llvm-svn: 286420
|
|
|
|
|
|
|
|
| |
Broadcast from memory instructions should be treated as moves. They can't be unfolded.
Fixes pr30693.
llvm-svn: 285998
|
|
|
|
|
|
|
|
| |
legacy intrinsics to select EVEX encoded instructions when available.
This removes a couple tablegen classes that become unused after this change. Another class gained an additional parameter to allow PMADDUBSW to specify a different result type from its input type.
llvm-svn: 285515
|
|
|
|
|
|
|
|
| |
intrinsics can select EVEX encoded instructions when available.
This requires a minor rename of the instructions due to the use of different tablegen classes and how the names are concatenated.
llvm-svn: 285501
|
|
|
|
| |
llvm-svn: 283690
|
|
|
|
| |
llvm-svn: 283689
|
|
|
|
| |
llvm-svn: 283687
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
commutation
MOVSD/MOVSS take a 128-bit register and a FR32/FR64 register input. The commutation code wasn't taking this into account, leading to verification errors.
This patch inserts a vreg copy mi to ensure that the registers are correct.
Fix for PR30607
Differential Revision: https://reviews.llvm.org/D25280
llvm-svn: 283539
|
|
|
|
|
|
|
|
|
| |
(PR26302)"
This is suspected to cause a miscompile in Chromium. Reverting while
investigating.
llvm-svn: 283329
|
|
|
|
|
|
| |
match MOV8rm.
llvm-svn: 283184
|
|
|
|
|
|
| |
I don't know for sure that we truly need this, but it's the only vector load that isn't rematerializable. Making it consistent allows it to not be a special case in the td files.
llvm-svn: 283083
|
|
|
|
|
|
|
|
|
|
|
|
| |
targets
Instead of selecting between MOVSD/MOVSS and BLENDPD/BLENDPS at shuffle lowering by subtarget this will help us select the instruction based on actual commutation requirements.
We could possibly add BLENDPD/BLENDPS -> MOVSD/MOVSS commutation and MOVSD/MOVSS memory folding using a similar approach if it proves useful.
I avoided adding AVX512 handling as I'm not sure when we should be making use of VBLENDPD/VBLENDPS on EVEX targets.
llvm-svn: 283037
|
|
|
|
| |
llvm-svn: 283004
|
|
|
|
|
|
|
|
|
|
|
| |
We can't use Jcc to leave a Win64 function in general, because that
confuses the unwinder. However, for "leaf" functions, that is, functions
where the return address is always on top of the stack and which don't
have unwind info, it's OK.
Differential Revision: https://reviews.llvm.org/D24836
llvm-svn: 282920
|
|
|
|
|
|
| |
without VLX to the isFrameLoadOpcode and isFrameStoreOpcode.
llvm-svn: 282842
|
|
|
|
|
|
|
|
| |
This adds new pseudo instructions that can be selected during register allocation to represent loads and stores of XMM/YMM registers when AVX512F is available, but VLX isn't. They will be converted to VEX encoded moves if the register turns out to be XMM0-15/YMM0-15. Otherwise either an EVEX VEXTRACT(store) or VBROADCAST(load) will be used.
Fixes one of the cases from PR29112.
llvm-svn: 282690
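
The selection rule described above can be sketched as a small decision function. This is an illustration under assumptions, not the actual expansion code: the `Expansion` enum and `expandPseudo` are hypothetical names, and the register is reduced to its index (0-31) within the XMM or YMM file.

```cpp
#include <cassert>

// Hypothetical labels for the three outcomes named in the commit message.
enum class Expansion { VexMove, EvexExtractStore, EvexBroadcastLoad };

// Decide how a pseudo load/store would be expanded once the allocated
// register is known, assuming AVX512F is available but VLX is not.
inline Expansion expandPseudo(unsigned RegIndex, bool IsStore) {
  assert(RegIndex < 32 && "XMM/YMM register index out of range");
  if (RegIndex < 16)
    return Expansion::VexMove; // XMM0-15 / YMM0-15 are VEX-encodable
  return IsStore ? Expansion::EvexExtractStore
                 : Expansion::EvexBroadcastLoad;
}
```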
|
|
|
|
|
|
| |
domain fixing table.
llvm-svn: 282687
|
|
|
|
| |
llvm-svn: 282684
|
|
|
|
|
|
| |
will not return a value greater than 32. I think it theoretically could be 64 for AVX-512.
llvm-svn: 282471
|
|
|
|
|
|
| |
TargetRegisterInfo::getMatchingSuperReg.
llvm-svn: 282359
|
|
|
|
| |
llvm-svn: 282357
|
|
|
|
|
|
| |
hasUndefRegUpdate.
llvm-svn: 282356
|
|
|
|
|
|
| |
floating point. We can use patterns to point to the other instructions instead.
llvm-svn: 282355
|
|
|
|
|
|
|
|
|
|
| |
VPTERNLOG is a ternary instruction with an immediate specifying the logical operation to perform. For each bit position in the 3 source vectors the bit from each source is concatenated together and the resulting 3-bit value is used to select a bit in the immediate. This bit value is written to the result vector.
We can commute this by swapping operands and modifying the immediate. To modify the immediate we need to swap two pairs of bits. The pairs correspond to the locations in the immediate where the commuted operands' bits have opposite values and the uncommuted operand has the same value. Bits 0 and 7 will never be swapped since the relevant bits from all sources are the same value.
This refactors and reuses parts of the FMA3 commuting code, since FMA3 is also a three-operand instruction.
llvm-svn: 282132
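
The immediate adjustment described above can be sketched in a few lines. This is a standalone illustration, not the code from the patch: the helper names and the `Swap` enum are hypothetical, and the sketch assumes the 3-bit index is formed with the first source as the high bit and the third source as the low bit.

```cpp
#include <cassert>
#include <cstdint>

// One result bit of a ternary-logic operation: the three source bits are
// concatenated into an index that selects a bit of the immediate.
// Assumption: `a` is the high index bit, `c` the low one.
inline bool ternlogBit(uint8_t Imm, bool A, bool B, bool C) {
  unsigned Idx = (static_cast<unsigned>(A) << 2) |
                 (static_cast<unsigned>(B) << 1) | static_cast<unsigned>(C);
  return (Imm >> Idx) & 1u;
}

// Exchange bits I and J of the immediate.
inline uint8_t swapImmBits(uint8_t Imm, unsigned I, unsigned J) {
  assert(I < 8 && J < 8);
  unsigned BI = (Imm >> I) & 1u, BJ = (Imm >> J) & 1u;
  if (BI != BJ)
    Imm = static_cast<uint8_t>(Imm ^ ((1u << I) | (1u << J)));
  return Imm;
}

// Which two of the three sources are being commuted (hypothetical enum).
enum class Swap { AB, AC, BC };

// New immediate that gives the same result after the two sources are
// swapped. Each case swaps the two index pairs where the commuted
// operands differ and the third operand is held fixed.
inline uint8_t commuteTernlogImm(uint8_t Imm, Swap S) {
  switch (S) {
  case Swap::AB: return swapImmBits(swapImmBits(Imm, 2, 4), 3, 5);
  case Swap::AC: return swapImmBits(swapImmBits(Imm, 1, 4), 3, 6);
  case Swap::BC: return swapImmBits(swapImmBits(Imm, 1, 2), 5, 6);
  }
  return Imm;
}
```

For example, under the stated bit-order assumption the immediate 0xCA encodes "A ? B : C"; commuting B and C yields 0xAC, which encodes "A ? C : B". Index 0 (all sources 0) and index 7 (all sources 1) are unaffected by any swap, matching the note that bits 0 and 7 never move.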
|
|
|
|
|
|
|
|
| |
XMM16-XMM31 or YMM16-YMM31 are the source or dest of the copy and VLX is not supported.
This can happen with SUBREG_TO_REG of ZMM16-ZMM31. Fixes PR30430.
llvm-svn: 281959
|
|
|
|
| |
llvm-svn: 281535
|
|
|
|
|
|
|
| |
analyzeBranch was renamed to use lowercase first; rename
the related set to match.
llvm-svn: 281506
|
|
|
|
|
|
|
|
|
| |
The main change is to return the code size from
InsertBranch/RemoveBranch.
Patch mostly by Tim Northover
llvm-svn: 281505
|
|
|
|
|
|
|
|
|
|
|
| |
r280832 added 32-bit support for emitting conditional tail-calls, but
dropped imp-used parameter registers. This went unnoticed until
r281113, which added 64-bit support, as this is only exposed with
parameter passing via registers.
Don't drop the imp-used parameters.
llvm-svn: 281223
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Now that MachineBasicBlock::reverse_instr_iterator knows when it's at
the end (since r281168 and r281170), implement
MachineBasicBlock::reverse_iterator directly on top of an
ilist::reverse_iterator by adding an IsReverse template parameter to
MachineInstrBundleIterator. This replaces another hard-to-reason-about
use of std::reverse_iterator on list iterators, matching the changes for
ilist::reverse_iterator from r280032 (see the "out of scope" section at
the end of that commit message). MachineBasicBlock::reverse_iterator
now has a handle to the current node and has obvious invalidation
semantics.
r280032 has a more detailed explanation of how list-style reverse
iterators (invalidated when the pointed-at node is deleted) are
different from vector-style reverse iterators like std::reverse_iterator
(invalidated on every operation). A great motivating example is this
commit's changes to lib/CodeGen/DeadMachineInstructionElim.cpp.
Note: If your out-of-tree backend deletes instructions while iterating
on a MachineBasicBlock::reverse_iterator or converts between
MachineBasicBlock::iterator and MachineBasicBlock::reverse_iterator,
you'll need to update your code in similar ways to r280032. The
following table might help:
[Old]                              ==> [New]
delete &*RI, RE = end()                delete &*RI++
RI->erase(), RE = end()                RI++->erase()
reverse_iterator(I)                    std::prev(I).getReverse()
reverse_iterator(I)                    ++I.getReverse()
--reverse_iterator(I)                  I.getReverse()
reverse_iterator(std::next(I))         I.getReverse()
RI.base()                              std::prev(RI).getReverse()
RI.base()                              ++RI.getReverse()
--RI.base()                            RI.getReverse()
std::next(RI).base()                   RI.getReverse()
(For more details, have a look at r280032.)
llvm-svn: 281172
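
For readers unfamiliar with the off-by-one behavior of std::reverse_iterator that motivates this change, the following sketch uses only the C++ standard library on a std::list. It is not LLVM code, and `eraseEvensInReverse` is a hypothetical helper. It shows the `base()` bookkeeping needed to erase while iterating in reverse, which is the kind of conversion the table above replaces with `getReverse()`.

```cpp
#include <cassert>
#include <iterator>
#include <list>

// Erase even values while walking the list back to front using
// std::reverse_iterator. A reverse iterator RI refers to the element
// *before* RI.base(), so the node to erase is std::next(RI).base().
inline void eraseEvensInReverse(std::list<int> &L) {
  for (auto RI = L.rbegin(); RI != L.rend();) {
    if (*RI % 2 == 0) {
      // erase() returns the forward iterator following the removed node;
      // wrapping it yields a reverse iterator to the node preceding it,
      // which is the next element in reverse order.
      auto Next = L.erase(std::next(RI).base());
      RI = std::list<int>::reverse_iterator(Next);
    } else {
      ++RI;
    }
  }
}
```

A list-style reverse iterator that holds a handle to the current node avoids this adjustment, which is the property the commit message describes as "obvious invalidation semantics".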
|