| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
|
|
|
|
|
|
|
|
|
|
|
| |
This patch brings back the MOV64r0 pseudo instruction for zeroing a 64-bit register. This replaces the SUBREG_TO_REG MOV32r0 sequence we use today. Post register allocation we will rewrite the MOV64r0 to a 32-bit xor with an implicit def of the 64-bit register similar to what we do for the various XMM/YMM/ZMM zeroing pseudos.
My main motivation is to enable the spill optimization in foldMemoryOperandImpl. As we were seeing some code that repeatedly did "xor eax, eax; store eax;" to spill several registers with a new xor for each store. With this optimization enabled we get a store of a 0 immediate instead of an xor. Though I admit the ideal solution would be one xor where there are multiple spills. I don't believe we have a test case that shows this optimization in here. I'll see if I can try to reduce one from the code were looking at.
There's definitely some other machine CSE(and maybe other passes) behavior changes exposed by this patch. So it seems like there might be some other deficiencies in SUBREG_TO_REG handling.
Differential Revision: https://reviews.llvm.org/D52757
llvm-svn: 345165
|
|
|
|
|
|
|
|
|
|
|
|
| |
analyzeBranch()/insertBranch() etc. do not properly deal with an undef
flag on the eflags input and used to produce invalid MIR. I don't see
this ever affecting real world inputs (I don't think it is possible to
produce undef flags with llvm IR), so I simply changed the code to bail
out in this case.
rdar://42122367
llvm-svn: 344970
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This rebases and recommits r343520. hwasan should be fixed now and this
shouldn't break the tests anymore.
Spill/reload instructions are artificially generated by the compiler and
have no relation to the original source code. So the best thing to do is
not attach any debug location to them (instead of just taking the next
debug location we find on following instructions).
Differential Revision: https://reviews.llvm.org/D52125
llvm-svn: 343895
|
|
|
|
|
|
|
|
| |
instructions"
This reverts r343520 due to breakage of HWASan tests on Android.
llvm-svn: 343616
|
|
|
|
|
|
|
|
|
|
|
| |
Spill/reload instructions are artificially generated by the compiler and
have no relation to the original source code. So the best thing to do is
not attach any debug location to them (instead of just taking the next
debug location we find on following instructions).
Differential Revision: https://reviews.llvm.org/D52125
llvm-svn: 343520
|
|
|
|
|
|
|
|
| |
will stop making us reach the other report_fatal_error in this function.
There's a conditional report_fatal_error just above this llvm_unreachable. The optimizer when seeing the unreachable removes the conditional and just makes any other error trigger the existing report_fatal_error.
llvm-svn: 343428
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This removes the FrameAccess struct that was added to the interface
in D51537, since the PseudoValue from the MachineMemoryOperand
can be safely casted to a FixedStackPseudoSourceValue.
Reviewers: MatzeB, thegameg, javed.absar
Reviewed By: thegameg
Differential Revision: https://reviews.llvm.org/D51617
llvm-svn: 341454
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
For instructions that spill/fill to and from multiple frame-indices
in a single instruction, hasStoreToStackSlot and hasLoadFromStackSlot
should return an array of accesses, rather than just the first encounter
of such an access.
This better describes FI accesses for AArch64 (paired) LDP/STP
instructions.
Reviewers: t.p.northover, gberry, thegameg, rengolin, javed.absar, MatzeB
Reviewed By: MatzeB
Differential Revision: https://reviews.llvm.org/D51537
llvm-svn: 341301
|
|
|
|
|
|
|
|
|
|
|
| |
..Move all target-dependent checks into new isCopyInstrImpl method.
This change allows us to treat MoveReg-type instructions and generic
COPY instruction in the same way
Differential Revision: https://reviews.llvm.org/D49913
llvm-svn: 341072
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
imported from a dll
Variables declared with the dllimport attribute are accessed via a
stub variable named __imp_<var>. In MinGW configurations, variables that
aren't declared with a dllimport attribute might still end up imported
from another DLL with runtime pseudo relocs.
For x86_64, this avoids the risk that the target is out of range
for a 32 bit PC relative reference, in case the target DLL is loaded
further than 4 GB from the reference. It also avoids having to make the
text section writable at runtime when doing the runtime fixups, which
makes it worthwhile to do for i386 as well.
Add stub variables for all dso local data references where a definition
of the variable isn't visible within the module, since the DLL data
autoimporting might make them imported even though they are marked as
dso local within LLVM.
Don't do this for variables that actually are defined within the same
module, since we then know for sure that it actually is dso local.
Don't do this for references to functions, since there's no need for
runtime pseudo relocations for autoimporting them; if a function from
a different DLL is called without the appropriate dllimport attribute,
the call just gets routed via a thunk instead.
GCC does something similar since 4.9 (when compiling with -mcmodel=medium
or large; from that version, medium is the default code model for x86_64
mingw), but only for x86_64.
Differential Revision: https://reviews.llvm.org/D51288
llvm-svn: 340942
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
a generically extensible collection of extra info attached to
a `MachineInstr`.
The primary change here is cleaning up the APIs used for setting and
manipulating the `MachineMemOperand` pointer arrays so chat we can
change how they are allocated.
Then we introduce an extra info object that using the trailing object
pattern to attach some number of MMOs but also other extra info. The
design of this is specifically so that this extra info has a fixed
necessary cost (the header tracking what extra info is included) and
everything else can be tail allocated. This pattern works especially
well with a `BumpPtrAllocator` which we use here.
I've also added the basic scaffolding for putting interesting pointers
into this, namely pre- and post-instruction symbols. These aren't used
anywhere yet, they're just there to ensure I've actually gotten the data
structure types correct. I'll flesh out support for these in
a subsequent patch (MIR dumping, parsing, the works).
Finally, I've included an optimization where we store any single pointer
inline in the `MachineInstr` to avoid the allocation overhead. This is
expected to be the overwhelmingly most common case and so should avoid
any memory usage growth due to slightly less clever / dense allocation
when dealing with >1 MMO. This did require several ergonomic
improvements to the `PointerSumType` to reasonably support the various
usage models.
This also has a side effect of freeing up 8 bits within the
`MachineInstr` which could be repurposed for something else.
The suggested direction here came largely from Hal Finkel. I hope it was
worth it. ;] It does hopefully clear a path for subsequent extensions
w/o nearly as much leg work. Lots of thanks to Reid and Justin for
careful reviews and ideas about how to do all of this.
Differential Revision: https://reviews.llvm.org/D50701
llvm-svn: 339940
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
`MachineMemOperand` pointers attached to `MachineSDNodes` and instead
have the `SelectionDAG` fully manage the memory for this array.
Prior to this change, the memory management was deeply confusing here --
The way the MI was built relied on the `SelectionDAG` allocating memory
for these arrays of pointers using the `MachineFunction`'s allocator so
that the raw pointer to the array could be blindly copied into an
eventual `MachineInstr`. This creates a hard coupling between how
`MachineInstr`s allocate their array of `MachineMemOperand` pointers and
how the `MachineSDNode` does.
This change is motivated in large part by a change I am making to how
`MachineFunction` allocates these pointers, but it seems like a layering
improvement as well.
This would run the risk of increasing allocations overall, but I've
implemented an optimization that should avoid that by storing a single
`MachineMemOperand` pointer directly instead of allocating anything.
This is expected to be a net win because the vast majority of uses of
these only need a single pointer.
As a side-effect, this makes the API for updating a `MachineSDNode` and
a `MachineInstr` reasonably different which seems nice to avoid
unexpected coupling of these two layers. We can map between them, but we
shouldn't be *surprised* at where that occurs. =]
Differential Revision: https://reviews.llvm.org/D50680
llvm-svn: 339740
|
|
|
|
|
|
|
|
|
|
| |
of wrapping it in a SUBREG_TO_REG.
Now we switch to the subregister in expandPostRAPseudos where we already switched the opcode.
This simplifies a few isel patterns that used the pseudo directly. And magically seems to have improved our ability to CSE it in the undef-label.ll test.
llvm-svn: 339496
|
|
|
|
|
|
|
|
|
|
|
|
| |
domain fixing purposes
These instructions perform the same operation, but the semantic of which operand is destroyed is reversed. If the same register is used as both operands we can change the execution domain without worrying about this difference.
Unfortunately, this really only works in cases where the input register is killed by the instruction. If its not killed, the two address isntruction pass inserts a copy that will become a move instruction. This makes the instruction use different physical registers that contain the same data at the time the unpck/movhlps executes. I've considered using a unary pseudo instruction with tied operand to trick the two address instruction pass. We could then expand the pseudo post regalloc to get the same physical register on both inputs.
Differential Revision: https://reviews.llvm.org/D50157
llvm-svn: 338735
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The machine verifier asserts with:
Assertion failed: (isMBB() && "Wrong MachineOperand accessor"), function getMBB, file ../include/llvm/CodeGen/MachineOperand.h, line 542.
It calls analyzeBranch which tries to call getMBB if the opcode is
JMP_1, but in this case we do:
JMP_1 @OUTLINED_FUNCTION
I believe we have to use TAILJMPd64 instead of JMP_1 since JMP_1 is used
with brtarget8.
Differential Revision: https://reviews.llvm.org/D49299
llvm-svn: 338237
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Just some gardening here.
Similar to how we moved call information into Candidates, this moves outlined
frame information into OutlinedFunction. This allows us to remove
TargetCostInfo entirely.
Anywhere where we returned a TargetCostInfo struct, we now return an
OutlinedFunction. This establishes OutlinedFunctions as more of a general
repeated sequence, and Candidates as occurrences of those repeated sequences.
llvm-svn: 337848
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
TCRETURNr* and fix latent bugs with register class updates.
Summary:
Enabling this fully exposes a latent bug in the instruction folding: we
never update the register constraints for the register operands when
fusing a load into another operation. The fused form could, in theory,
have different register constraints on its operands. And in fact,
TCRETURNm* needs its memory operands to use tailcall compatible
registers.
I've updated the folding code to re-constrain all the registers after
they are mapped onto their new instruction.
However, we still can't enable folding in the general case from
TCRETURNr* to TCRETURNm* because doing so may require more registers to
be available during the tail call. If the call itself uses all but one
register, and the folded load would require both a base and index
register, there will not be enough registers to allocate the tail call.
It would be better, IMO, to teach the register allocator to *unfold*
TCRETURNm* when it runs out of registers (or specifically check the
number of registers available during the TCRETURNr*) but I'm not going
to try and solve that for now. Instead, I've just blocked the forward
folding from r -> m, leaving LLVM free to unfold from m -> r as that
doesn't introduce new register pressure constraints.
The down side is that I don't have anything that will directly exercise
this. Instead, I will be immediately using this it my SLH patch. =/
Still worse, without allowing the TCRETURNr* -> TCRETURNm* fold, I don't
have any tests that demonstrate the failure to update the memory operand
register constraints. This patch still seems correct, but I'm nervous
about the degree of testing due to this.
Suggestions?
Reviewers: craig.topper
Subscribers: sanjoy, mcrosier, hiraditya, llvm-commits
Differential Revision: https://reviews.llvm.org/D49717
llvm-svn: 337845
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Before this, TCI contained all the call information for each Candidate.
This moves that information onto the Candidates. As a result, each Candidate
can now supply how it ought to be called. Thus, Candidates will be able to,
say, call the same function in cheaper ways when possible. This also removes
that information from TCI, since it's no longer used there.
A follow-up patch for the AArch64 outliner will demonstrate this.
llvm-svn: 337840
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
models"
Don't try to generate large PIC code for non-ELF targets. Neither COFF
nor MachO have relocations for large position independent code, and
users have been using "large PIC" code models to JIT 64-bit code for a
while now. With this change, if they are generating ELF code, their
JITed code will truly be PIC, but if they target MachO or COFF, it will
contain 64-bit immediates that directly reference external symbols. For
a JIT, that's perfectly fine.
llvm-svn: 337740
|
|
|
|
|
|
|
|
| |
using VMOVLPS with a modified address.
This required an annoying amount of tablegen multiclass changes to make only VUNPCKHPDZ128rr commutable.
llvm-svn: 337357
|
|
|
|
|
|
|
|
|
|
|
|
| |
operations with AVX512F, but not AVX512DQ.
AVX512F only has integer domain logic instructions. AVX512DQ added FP domain logic instructions.
Execution domain fixing runs before EVEX->VEX. So if we have AVX512F and not AVX512DQ we fail to do execution domain switching of the logic operations. This leads to mismatches in execution domain and more test differences.
This patch adds custom domain fixing that switches EVEX integer logic operations to VEX fp logic operations if XMM16-31 are not used.
llvm-svn: 337137
|
|
|
|
|
|
|
|
| |
The code tried to find the immediate by using getNumOperands() on the MachineInstr, but there might be implicit-defs after the immediate that get counted.
Instead use getNumOperands() from the instruction description which will only count the operands that are defined in the td file.
llvm-svn: 337088
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
These changes cover the PR#31399.
Now the ffs(x) function is lowered to (x != 0) ? llvm.cttz(x) + 1 : 0
and it corresponds to the following llvm code:
%cnt = tail call i32 @llvm.cttz.i32(i32 %v, i1 true)
%tobool = icmp eq i32 %v, 0
%.op = add nuw nsw i32 %cnt, 1
%add = select i1 %tobool, i32 0, i32 %.op
and x86 asm code:
bsfl %edi, %ecx
addl $1, %ecx
testl %edi, %edi
movl $0, %eax
cmovnel %ecx, %eax
In this case the 'test' instruction can't be eliminated because
the 'add' instruction modifies the EFLAGS, namely, ZF flag
that is set by the 'bsf' instruction when 'x' is zero.
We now produce the following code:
bsfl %edi, %ecx
movl $-1, %eax
cmovnel %ecx, %eax
addl $1, %eax
Patch by Ivan Kulagin
Reviewers: davide, craig.topper, spatel, RKSimon
Reviewed By: craig.topper
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D48765
llvm-svn: 336768
|
|
|
|
|
|
|
|
|
|
| |
BLEND under optsize when the immediate allows it.
Isel currently emits movss/movsd a lot of the time and an accidental double commute turns it into a blend.
Ideally we'd select blend directly in isel under optspeed and not rely on the double commute to create blend.
llvm-svn: 336731
|
|
|
|
|
|
|
|
| |
getOutlininingCandidateInfo -> getOutliningCandidateInfo
Differential Revision: https://reviews.llvm.org/D48867
llvm-svn: 336285
|
|
|
|
|
|
|
|
|
|
| |
search.
I separated out the rounding and broadcast groups into their own tables because it made the ordering in the main table easier.
Further splitting of the tables might make it possible to directly index using bits from the TSFlags, but its probably not worth it right now.
llvm-svn: 336075
|
|
|
|
|
|
|
|
| |
X86InstrInfo::commuteInstructionImpl.
findCommutedOpIndices does the pre-checking for whether commuting is possible. There should be no reason left to fail in commuteInstructionImpl. There was a missing pre-check that I've added there and changed the check to an assert in commuteInstructionImpl.
llvm-svn: 336070
|
|
|
|
|
|
|
|
|
|
|
|
| |
it a ManagedStatic.
Also move the static folding tables, their search functions and the new class into new cpp/h files.
The unfolding table is effectively static data. It's just a different ordering and a subset of the static folding tables.
By putting it in a separate ManagedStatic we ensure we only have one copy instead of one per X86InstrInfo object. This way also makes it only get initialized when really needed.
llvm-svn: 336056
|
|
|
|
|
|
|
|
|
|
| |
getFMA3Group free function. NFCI
The class only exists to hold a DenseMap and is only created as a ManagedStatic. It used to expose a single static method that outside code was expected to use.
This patch moves that static function out of the class and moves it implementation into the cpp file. It can now access the ManagedStatic directly by name without the need for the other static method that accessed the ManagedStatic.
llvm-svn: 336055
|
|
|
|
|
|
|
|
|
|
| |
Previously we used a DenseMap which is costly to set up due to multiple full table rehashes as the size increases and causes the table to be reallocated.
This patch changes the table to a vector of structs. We now walk the reg->mem tables and push new entries in the mem->reg table for each row not marked TB_NO_REVERSE. Once all the table entries have been created, we sort the vector. Then we can use a binary search for lookups.
Differential Revision: https://reviews.llvm.org/D48585
llvm-svn: 335994
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
code models""
Reverting because this is causing failures in the LLDB test suite on
GreenDragon.
LLVM ERROR: unsupported relocation with subtraction expression, symbol
'__GLOBAL_OFFSET_TABLE_' can not be undefined in a subtraction
expression
llvm-svn: 335894
|
|
|
|
|
|
|
| |
These are all benign races and only visible in !NDEBUG. tsan complains
about it, but a simple atomic bool is sufficient to make it happy.
llvm-svn: 335823
|
|
|
|
|
|
|
| |
This is a benign race, but tsan likes to complain about it. Just make it
happy.
llvm-svn: 335788
|
|
|
|
|
|
|
|
|
|
| |
X86InstrFMA3Group.
Nothing was using this relationship. By splitting them we no longer need to worry about register or memory entries being empty in a group.
The memory folding tables in X86InstrInfo.cpp can be used to access this relationship if needed.
llvm-svn: 335694
|
|
|
|
|
|
| |
r335501.
llvm-svn: 335517
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
models"
The large code model allows code and data segments to exceed 2GB, which
means that some symbol references may require a displacement that cannot
be encoded as a displacement from RIP. The large PIC model even relaxes
the assumption that the GOT itself is within 2GB of all code. Therefore,
we need a special code sequence to materialize it:
.LtmpN:
leaq .LtmpN(%rip), %rbx
movabsq $_GLOBAL_OFFSET_TABLE_-.LtmpN, %rax # Scratch
addq %rax, %rbx # GOT base reg
From that, non-local references go through the GOT base register instead
of being PC-relative loads. Local references typically use GOTOFF
symbols, like this:
movq extern_gv@GOT(%rbx), %rax
movq local_gv@GOTOFF(%rbx), %rax
All calls end up being indirect:
movabsq $local_fn@GOTOFF, %rax
addq %rbx, %rax
callq *%rax
The medium code model retains the assumption that the code segment is
less than 2GB, so calls are once again direct, and the RIP-relative
loads can be used to access the GOT. Materializing the GOT is easy:
leaq _GLOBAL_OFFSET_TABLE_(%rip), %rbx # GOT base reg
DSO local data accesses will use it:
movq local_gv@GOTOFF(%rbx), %rax
Non-local data accesses will use RIP-relative addressing, which means we
may not always need to materialize the GOT base:
movq extern_gv@GOTPCREL(%rip), %rax
Direct calls are basically the same as they are in the small code model:
They use direct, PC-relative addressing, and the PLT is used for calls
to non-local functions.
This patch adds reasonably comprehensive testing of LEA, but there are
lots of interesting folding opportunities that are unimplemented.
I restricted the MCJIT/eh-lg-pic.ll test to Linux, since the large PIC
code model is not implemented for MachO yet.
Differential Revision: https://reviews.llvm.org/D47211
llvm-svn: 335508
|
|
|
|
|
|
|
|
|
|
|
|
| |
reg->mem DenseMaps in favor of binary search.
With the static tables sorted we can binary search them directly for reg->mem lookups. This removes 6 DenseMaps that had to be created when X86InstrInfo is constructed.
We still have one Mem->Reg DenseMap for the reverse direction. This is created just as before by walking the reg->mem arrays to populate it.
Differential Revision: https://reviews.llvm.org/D48527
llvm-svn: 335501
|
|
|
|
|
|
|
|
|
|
|
|
| |
findThreeSrcCommutedOpIndices. Remove uncommutable returns from getThreeSrcCommuteCase/getFMA3OpcodeToCommuteOperands.
We should be blocking the operand while we are in the routine that tries to find commutable operand indices. Doing it later means we might have missed out on another valid set of operands we could have commuted.
The intrinsic case was the only case that could really prevent commuting in getFMA3OpcodeToCommuteOperands. All the other cases in getThreeSrcCommuteCase were not reachable conditions as they were protected by findThreeSrcCommutedOpIndices.
With that abort case pushed earlier, we can remove all the abort checks and replace with asserts.
llvm-svn: 335446
|
|
|
|
|
|
| |
a Z to match other EVEX instructions.
llvm-svn: 335428
|
|
|
|
|
|
| |
MCJIT can't handle R_X86_64_GOT64 yet.
llvm-svn: 335300
|
|
|
|
| |
llvm-svn: 335298
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
The large code model allows code and data segments to exceed 2GB, which
means that some symbol references may require a displacement that cannot
be encoded as a displacement from RIP. The large PIC model even relaxes
the assumption that the GOT itself is within 2GB of all code. Therefore,
we need a special code sequence to materialize it:
.LtmpN:
leaq .LtmpN(%rip), %rbx
movabsq $_GLOBAL_OFFSET_TABLE_-.LtmpN, %rax # Scratch
addq %rax, %rbx # GOT base reg
From that, non-local references go through the GOT base register instead
of being PC-relative loads. Local references typically use GOTOFF
symbols, like this:
movq extern_gv@GOT(%rbx), %rax
movq local_gv@GOTOFF(%rbx), %rax
All calls end up being indirect:
movabsq $local_fn@GOTOFF, %rax
addq %rbx, %rax
callq *%rax
The medium code model retains the assumption that the code segment is
less than 2GB, so calls are once again direct, and the RIP-relative
loads can be used to access the GOT. Materializing the GOT is easy:
leaq _GLOBAL_OFFSET_TABLE_(%rip), %rbx # GOT base reg
DSO local data accesses will use it:
movq local_gv@GOTOFF(%rbx), %rax
Non-local data accesses will use RIP-relative addressing, which means we
may not always need to materialize the GOT base:
movq extern_gv@GOTPCREL(%rip), %rax
Direct calls are basically the same as they are in the small code model:
They use direct, PC-relative addressing, and the PLT is used for calls
to non-local functions.
This patch adds reasonably comprehensive testing of LEA, but there are
lots of interesting folding opportunities that are unimplemented.
Reviewers: chandlerc, echristo
Subscribers: hiraditya, llvm-commits
Differential Revision: https://reviews.llvm.org/D47211
llvm-svn: 335297
|
|
|
|
|
|
|
|
|
|
| |
I don't believe there is any real reason to have separate X86 specific opcodes for vector compares. Setcc has the same behavior just uses a different encoding for the condition code.
I had to change the CondCodeAction for SETLT and SETLE to prevent some transforms from changing SETGT lowering.
Differential Revision: https://reviews.llvm.org/D43608
llvm-svn: 335173
|
|
|
|
|
|
|
|
|
|
|
|
| |
insertOutlinerEpilogue
insertOutlinerPrologue was not used by any target, and prologue-esque code was
beginning to appear in insertOutlinerEpilogue. Refactor that into one function,
buildOutlinedFrame.
This just removes insertOutlinerPrologue and renames insertOutlinerEpilogue.
llvm-svn: 335076
|
|
|
|
|
|
|
|
| |
instead of proxying through X86InstrFMA3Info.
These increases the size of the static tables, but is closer to what we would get if used the autogenerated table directly. This reduces the remaining large deltas between what's in the manual table and what's in the autogenerated table.
llvm-svn: 334915
|
|
|
|
|
|
|
|
| |
tables.
Including more additions for NotMemoryFoldable to remove some entries from the autogenerated table.
llvm-svn: 334898
|
|
|
|
|
|
| |
Not sure any of these matter today because I don't think we ever produce them with IMPLICIT_DEF as an input. But by listing them we don't be suprised in the future.
llvm-svn: 334867
|
|
|
|
|
|
| |
An earlier commit prevented folds from the peephole pass by checking for IMPLICIT_DEF. But later in the pipeline IMPLICIT_DEF just becomes and Undef flag on the input register so we need to check for that case too.
llvm-svn: 334848
|
|
|
|
|
|
|
|
| |
have an undefined register update."
There's a typo causing the build to fail.
llvm-svn: 334803
|
|
|
|
|
|
|
|
| |
register update.
We want to keep the load unfolded so we can use the same register for both sources to avoid a false dependency.
llvm-svn: 334802
|