path: root/llvm/lib/Target/X86/X86InstrInfo.h
Commit message | Author | Age | Files | Lines
* [DebugInfo] Make describeLoadedValue() reg aware | David Stenberg | 2019-12-09 | 1 | -2/+2
  Summary:
  Currently the describeLoadedValue() hook is assumed to describe the value of the instruction's first explicit define. The hook will not be called for instructions with more than one explicit define.

  This commit adds a register parameter to the describeLoadedValue() hook, and invokes the hook for all registers in the worklist. This will allow us to, for example, describe instructions which produce more than two parameters' values; e.g. Hexagon's various combine instructions.

  This also fixes situations in our downstream target where we may pass smaller parameters in the high part of a register. If such a parameter's value is produced by a larger copy instruction, we can't describe the call site value using the super-register, and we instead need to know which sub-register should be used.

  This also allows us to handle cases like this:

    $ebx = [...]
    $rdi = MOVSX64rr32 $ebx
    $esi = MOV32rr $edi
    CALL64pcrel32 @call

  The hook will first be invoked for the MOV32rr instruction, which will say that @call's second parameter (passed in $esi) is described by $edi. As $edi is not preserved, it will be added to the worklist. When we get to the MOVSX64rr32 instruction, we need to describe two values: the sign-extended value of $ebx -> $rdi for the first parameter, and $ebx -> $edi for the second parameter, which is now possible.

  This commit modifies the dbgcall-site-lea-interpretation.mir test case. In the test case, the values of some 32-bit parameters were produced with LEA64r. Perhaps we can handle such cases in general by emitting expressions that AND out the lower 32 bits, but I have not been able to find a case where a LEA64r is used for a 32-bit parameter instead of LEA64_32 from C code.

  I have not found a case where it would be useful to describe parameters using implicit defines, so in this patch the hook is still only invoked for explicit defines of forwarding registers.

  Reviewers: djtodoro, NikolaPrica, aprantl, vsk
  Reviewed By: djtodoro, vsk
  Subscribers: ormris, hiraditya, llvm-commits
  Tags: #debug-info, #llvm
  Differential Revision: https://reviews.llvm.org/D70431
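  A minimal sketch of what the reg-aware hook described above might look like on the TargetInstrInfo side. The signature is inferred from the commit message; the ParamLoadedValue name stands for the (machine operand, DIExpression) pair the hook returns and should be treated as an assumption, not the verbatim in-tree declaration:

    // Describe the value that Reg holds after MI executes, if MI is an
    // instruction this target knows how to interpret for call-site debug
    // info. Returns None when the value cannot be described.
    virtual Optional<ParamLoadedValue>
    describeLoadedValue(const MachineInstr &MI, Register Reg) const;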
* Revert "[DebugInfo] Make describeLoadedValue() reg aware"David Stenberg2019-12-091-2/+2
| | | | | This reverts commit 3cd93a4efcdeabeb20cb7bec9fbddcb540d337a1. I'll recommit with a well-formatted arcanist commit message.
* [DebugInfo] Make describeLoadedValue() reg aware | David Stenberg | 2019-12-09 | 1 | -2/+2
  Currently the describeLoadedValue() hook is assumed to describe the value of the instruction's first explicit define. The hook will not be called for instructions with more than one explicit define.

  This commit adds a register parameter to the describeLoadedValue() hook, and invokes the hook for all registers in the worklist. This will allow us to, for example, describe instructions which produce more than two parameters' values; e.g. Hexagon's various combine instructions.

  This also fixes a case in our downstream target where we may pass smaller parameters in the high part of a register. If such a parameter's value is produced by a larger copy instruction, we can't describe the call site value using the super-register, and we instead need to know which sub-register should be used.

  This also allows us to handle cases like this:

    $ebx = [...]
    $rdi = MOVSX64rr32 $ebx
    $esi = MOV32rr $edi
    CALL64pcrel32 @call

  The hook will first be invoked for the MOV32rr instruction, which will say that @call's second parameter (passed in $esi) is described by $edi. As $edi is not preserved, it will be added to the worklist. When we get to the MOVSX64rr32 instruction, we need to describe two values: the sign-extended value of $ebx -> $rdi for the first parameter, and $ebx -> $edi for the second parameter, which is now possible.

  This commit modifies the dbgcall-site-lea-interpretation.mir test case. In the test case, the values of some 32-bit parameters were produced with LEA64r. Perhaps we can handle such cases in general by emitting expressions that AND out the lower 32 bits, but I have not been able to find a case where a LEA64r is used for a 32-bit parameter instead of LEA64_32 from C code.

  I have not found a case where it would be useful to describe parameters using implicit defines, so in this patch the hook is still only invoked for explicit defines of forwarding registers.
* Use MCRegister in copyPhysReg | Matt Arsenault | 2019-11-11 | 1 | -1/+1
* Reland: [TII] Use optional destination and source pair as a return value; NFC | Djordje Todorovic | 2019-11-08 | 1 | -4/+4
  Refactor the isCopyInstrImpl, isCopyInstr and isAddImmediate methods to return an optional machine operand pair of destination and source registers.

  Patch by Nikola Prica

  Differential Revision: https://reviews.llvm.org/D69622
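  As a rough illustration of the refactoring, the optional destination/source pair can be modelled as a small struct returned by the copy hook. This is a sketch based on the description above; the struct and member names are assumptions:

    struct DestSourcePair {
      const MachineOperand *Destination;
      const MachineOperand *Source;
    };

    // Target-specific part; the generic isCopyInstr() wrapper first handles
    // plain COPY instructions and then defers to this hook.
    virtual Optional<DestSourcePair>
    isCopyInstrImpl(const MachineInstr &MI) const;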
* Revert rG57ee0435bd47f23f3939f402914c231b4f65ca5e - [TII] Use optional destination and source pair as a return value; NFC | Simon Pilgrim | 2019-10-31 | 1 | -4/+4
  This is breaking MSVC builds: http://lab.llvm.org:8011/builders/llvm-clang-x86_64-expensive-checks-win/builds/20375
* [TII] Use optional destination and source pair as a return value; NFC | Djordje Todorovic | 2019-10-31 | 1 | -4/+4
  Refactor the isCopyInstrImpl, isCopyInstr and isAddImmediate methods to return an optional machine operand pair of destination and source registers.

  Patch by Nikola Prica

  Differential Revision: https://reviews.llvm.org/D69622
* Prune two MachineInstr.h includes, fix up deps | Reid Kleckner | 2019-10-19 | 1 | -1/+1
  MachineInstr.h included AliasAnalysis.h, which includes a world of IR constructs mostly unneeded in CodeGen. Prune it. Same for DebugInfoMetadata.h.

  Noticed with -ftime-trace.

  llvm-svn: 375311
* [TargetInstrInfo] Let findCommutedOpIndices take const MachineInstr& | Simon Pilgrim | 2019-09-25 | 1 | -1/+1
  Neither the base implementation of findCommutedOpIndices nor any in-tree target modifies the instruction passed in and there is no reason why they would in the future.

  Committed on behalf of @hvdijk (Harald van Dijk)

  Differential Revision: https://reviews.llvm.org/D66138

  llvm-svn: 372882
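  The resulting X86 override simply gains const on the MachineInstr parameter; roughly (a sketch, with the index parameter names assumed):

    // Neither X86 nor the base implementation modifies MI, so it can be
    // taken by const reference.
    bool findCommutedOpIndices(const MachineInstr &MI, unsigned &SrcOpIdx1,
                               unsigned &SrcOpIdx2) const override;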
* [X86] Enable commuting of EVEX VCMP for all immediate values during isel. | Craig Topper | 2019-09-17 | 1 | -0/+3
  llvm-svn: 372065
* [X86] Merge X86InstrInfo::loadRegFromAddr/storeRegToAddr into their only call site. | Craig Topper | 2019-08-30 | 1 | -12/+0
  I'm looking at unfolding broadcast loads on AVX512 which will require refactoring this code to select broadcast opcodes instead of regular load/stores in some cases. Merging them to avoid further complicating their interfaces.

  llvm-svn: 370484
* [X86][Btver2] Fix latency and throughput of CMPXCHG instructions. | Andrea Di Biagio | 2019-08-20 | 1 | -0/+4
  On Jaguar, CMPXCHG has a latency of 11cy, and a maximum throughput of 0.33 IPC. Throughput is capped at 0.33 because of the implicit in/out dependency on register EAX. In the case of repeated non-atomic CMPXCHG with the same memory location, store-to-load forwarding occurs and values for subsequent loads are quickly forwarded from the store buffer.

  Interestingly, the functionality in LLVM that computes the reciprocal throughput doesn't seem to know about RMW instructions. That functionality only looks at the "consumed resource cycles" for the throughput computation. It should be fixed/improved by a future patch. In particular, for RMW instructions, that logic should also take into account the write latency of in/out register operands.

  An atomic CMPXCHG has a latency of ~17cy. Throughput is also limited to ~17cy/inst due to cache locking, which prevents other memory uOPs from starting execution before the "lock releasing" store uOP.

  CMPXCHG8rr and CMPXCHG8rm are treated specially because they decode to one less macro opcode. Their latency tends to be the same as the other RR/RM variants. RR variants are relatively fast at 3cy (but still microcoded - 5 macro opcodes).

  CMPXCHG8B is 11cy and unfortunately doesn't seem to benefit from store-to-load forwarding. That means throughput is clearly limited by the in/out dependency on GPR registers. The uOP composition is sadly unknown (due to the lack of PMCs for the Integer pipes). I have reused the same mix of consumed resources from the other CMPXCHG instructions for CMPXCHG8B too. LOCK CMPXCHG8B is instead 18 cycles.

  CMPXCHG16B is 32 cycles, and up to 38 cycles when the LOCK prefix is specified. Due to the in/out dependencies, throughput is limited to 1 instruction every 32 (or 38) cycles depending on whether the LOCK prefix is specified or not.

  I wouldn't be surprised if the microcode for CMPXCHG16B is similar to 2x the microcode from CMPXCHG8B. So, I have speculatively set the JALU01 consumption to 2x the resource cycles used for CMPXCHG8B.

  The two new hasLockPrefix() functions are used by the btver2 scheduling model to check if a MCInst/MachineInstr has a LOCK prefix. Calls to hasLockPrefix() have been encoded in predicates of variant scheduling classes that describe the latency/throughput of CMPXCHG.

  Differential Revision: https://reviews.llvm.org/D66424

  llvm-svn: 369365
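  A sketch of what a hasLockPrefix() helper like the ones mentioned above could look like, assuming the LOCK prefix is recorded as a TSFlags bit named X86II::LOCK (the flag name is an assumption; the MCInst variant would read the flags off the MCInstrDesc in the same way):

    // True if the instruction carries a LOCK prefix; usable from the
    // scheduling-class predicates on the MachineInstr path.
    static bool hasLockPrefix(const MachineInstr &MI) {
      return MI.getDesc().TSFlags & X86II::LOCK; // X86II::LOCK: assumed flag name
    }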
* [X86] Use Register/MCRegister in more places in X86 | Craig Topper | 2019-08-16 | 1 | -1/+1
  This was a quick pass through some obvious places. I haven't tried the clang-tidy check.

  I also replaced the zeroes in getX86SubSuperRegister with X86::NoRegister which is the real sentinel name.

  Differential Revision: https://reviews.llvm.org/D66363

  llvm-svn: 369151
* Reland "[DwarfDebug] Dump call site debug info"Djordje Todorovic2019-07-311-0/+3
| | | | | | | | | The build failure found after the rL365467 has been resolved. Differential Revision: https://reviews.llvm.org/D60716 llvm-svn: 367446
* Revert "[DwarfDebug] Dump call site debug info"Djordje Todorovic2019-07-121-3/+0
| | | | | | | | A build failure was found on the SystemZ platform. This reverts commit 9e7e73578e54cd22b3c7af4b54274d743b6607cc. llvm-svn: 365886
* [DwarfDebug] Dump call site debug info | Djordje Todorovic | 2019-07-09 | 1 | -0/+3
  Dump the DWARF information about call sites and call site parameters into debug info sections. The patch also provides an interface for the interpretation of instructions that could load values of call site parameters, in order to generate DWARF about the call site parameters.

  ([13/13] Introduce the debug entry values.)

  Co-authored-by: Ananth Sowda <asowda@cisco.com>
  Co-authored-by: Nikola Prica <nikola.prica@rt-rk.com>
  Co-authored-by: Ivan Baev <ibaev@cisco.com>

  Differential Revision: https://reviews.llvm.org/D60716

  llvm-svn: 365467
* [SystemZ, RegAlloc] Favor 3-address instructions during instruction selection. | Jonas Paulsson | 2019-06-08 | 1 | -1/+2
  This patch aims to reduce spilling and register moves by using the 3-address versions of instructions per default instead of the 2-address equivalent ones. Both spilling and register moves generally seem to improve noticeably.

  Regalloc hints are passed to increase conversions to 2-address instructions, which are done in SystemZShortenInst.cpp (after regalloc).

  Since the SystemZ reg/mem instructions are 2-address (dst and lhs regs are the same), foldMemoryOperandImpl() can no longer trivially fold a spilled source register since the reg/reg instruction is now 3-address. In order to remedy this, new 3-address pseudo memory instructions are used to perform the folding only when the dst and lhs virtual registers are known to be allocated to the same physreg. In order to not let MachineCopyPropagation run and change registers on these transformed instructions (making it 3-address), a new target pass called SystemZPostRewrite.cpp is run just after VirtRegRewriter, that immediately lowers the pseudo to a target instruction.

  If it would have been possible to insert a COPY instruction and change a register operand (convert to 2-address) in foldMemoryOperandImpl() while trusting that the caller (e.g. InlineSpiller) would update/repair the involved LiveIntervals, the solution involving pseudo instructions would not have been needed. This is perhaps a potential improvement (see Phabricator post).

  Common code changes:
  * A new hook TargetPassConfig::addPostRewrite() is utilized to be able to run a target pass immediately before MachineCopyPropagation.
  * VirtRegMap is passed as an argument to foldMemoryOperand().

  Review: Ulrich Weigand, Quentin Colombet
  https://reviews.llvm.org/D60888

  llvm-svn: 362868
* [CodeGen] Add "const" to MachineInstr::mayAlias | Bjorn Pettersson | 2019-04-19 | 1 | -1/+2
  Summary:
  The basic idea here is to make it possible to use MachineInstr::mayAlias also when the MachineInstr is const (or the "Other" MachineInstr is const).

  The addition of const in MachineInstr::mayAlias then rippled down to the need for adding const in several other places, such as TargetTransformInfo::getMemOperandWithOffset.

  Reviewers: hfinkel
  Reviewed By: hfinkel
  Subscribers: hfinkel, MatzeB, arsenm, jvesely, nhaehnle, hiraditya, javed.absar, llvm-commits
  Tags: #llvm
  Differential Revision: https://reviews.llvm.org/D60856

  llvm-svn: 358744
* [X86] Merge the different Jcc instructions for each condition code into single instructions that store the condition code as an operand. | Craig Topper | 2019-04-05 | 1 | -7/+4
  Summary:
  This avoids needing an isel pattern for each condition code. And it removes translation switches for converting between Jcc instructions and condition codes.

  Now the printer, encoder and disassembler take care of converting the immediate. We use InstAliases to handle the assembly matching. But we print using the asm string in the instruction definition. The instruction itself is marked IsCodeGenOnly=1 to hide it from the assembly parser.

  Reviewers: spatel, lebedev.ri, courbet, gchatelet, RKSimon
  Reviewed By: RKSimon
  Subscribers: MatzeB, qcolombet, eraman, hiraditya, arphaman, llvm-commits
  Tags: #llvm
  Differential Revision: https://reviews.llvm.org/D60228

  llvm-svn: 357802
* [X86] Merge the different SETcc instructions for each condition code into single instructions that store the condition code as an operand. | Craig Topper | 2019-04-05 | 1 | -4/+3
  Summary:
  This avoids needing an isel pattern for each condition code. And it removes translation switches for converting between SETcc instructions and condition codes.

  Now the printer, encoder and disassembler take care of converting the immediate. We use InstAliases to handle the assembly matching. But we print using the asm string in the instruction definition. The instruction itself is marked IsCodeGenOnly=1 to hide it from the assembly parser.

  Reviewers: andreadb, courbet, RKSimon, spatel, lebedev.ri
  Reviewed By: andreadb
  Subscribers: hiraditya, lebedev.ri, llvm-commits
  Tags: #llvm
  Differential Revision: https://reviews.llvm.org/D60138

  llvm-svn: 357801
* [X86] Merge the different CMOV instructions for each condition code into single instructions that store the condition code as an immediate. | Craig Topper | 2019-04-05 | 1 | -37/+3
  Summary:
  Reorder the condition code enum to match their encodings. Move it to MC layer so it can be used by the scheduler models.

  This avoids needing an isel pattern for each condition code. And it removes translation switches for converting between CMOV instructions and condition codes.

  Now the printer, encoder and disassembler take care of converting the immediate. We use InstAliases to handle the assembly matching. But we print using the asm string in the instruction definition. The instruction itself is marked IsCodeGenOnly=1 to hide it from the assembly parser.

  This does complicate the scheduler models a little since we can't assign the A and BE instructions to a separate class now.

  I plan to make similar changes for SETcc and Jcc.

  Reviewers: RKSimon, spatel, lebedev.ri, andreadb, courbet
  Reviewed By: RKSimon
  Subscribers: gchatelet, hiraditya, kristina, lebedev.ri, jdoerfert, llvm-commits
  Tags: #llvm
  Differential Revision: https://reviews.llvm.org/D60041

  llvm-svn: 357800
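  To illustrate the reordered enum and the merged form: the condition-code values now mirror the hardware encoding, and consumers read a single immediate instead of switching over dozens of opcodes. The helper below is only a sketch; in particular, the assumption that the condition immediate sits in the last explicit operand position is mine, not taken from the commit:

    // Condition codes reordered to match the x86 encoding.
    enum CondCode { COND_O = 0, COND_NO, COND_B, COND_AE, COND_E, COND_NE /* ... */ };

    static CondCode getCondFromCMov(const MachineInstr &MI) {
      // Assumes the condition immediate is the last explicit operand.
      return static_cast<CondCode>(
          MI.getOperand(MI.getDesc().getNumOperands() - 1).getImm());
    }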
* [X86] Enable 8-bit OR with disjoint bits to convert to LEA | Craig Topper | 2019-03-05 | 1 | -1/+2
  We already support 8-bit adds in convertToThreeAddress. But we can also support 8-bit OR if the bits are disjoint. We already do this for 16/32/64.

  Differential Revision: https://reviews.llvm.org/D58863

  llvm-svn: 355423
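  The transform rests on a simple identity: when the operands share no set bits, OR and ADD compute the same value, so the existing ADD-to-LEA path applies. A standalone illustration (the helper name is mine, not the in-tree check):

    // If (A & B) == 0, every bit position receives at most one 1, so there
    // are no carries and A | B == A + B. The 8-bit OR can then be treated
    // like an 8-bit ADD and converted to an LEA.
    static bool orLowersToAdd(uint8_t A, uint8_t B) { return (A & B) == 0; }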
* X86: Replace isSafeToClobberEFLAGS implementation | Matt Arsenault | 2019-02-15 | 1 | -1/+4
  Also use modifiesRegister instead of looping over operands.

  llvm-svn: 354098
* Update the file headers across all of the LLVM projects in the monorepo to reflect the new license. | Chandler Carruth | 2019-01-19 | 1 | -4/+3
  We understand that people may be surprised that we're moving the header entirely to discuss the new license. We checked this carefully with the Foundation's lawyer and we believe this is the correct approach.

  Essentially, all code in the project is now made available by the LLVM project under our new license, so you will see that the license headers include that license only. Some of our contributors have contributed code under our old license, and accordingly, we have retained a copy of our old license notice in the top-level files in each project and repository.

  llvm-svn: 351636
* [x86] allow 8-bit adds to be promoted by convertToThreeAddress() to form LEA | Sanjay Patel | 2018-12-12 | 1 | -2/+2
  This extends the code that handles 16-bit add promotion to form LEA to also allow 8-bit adds. That allows us to combine add ops with register moves and save some instructions. This is another step towards allowing add truncation in generic DAGCombiner (see D54640).

  Differential Revision: https://reviews.llvm.org/D55494

  llvm-svn: 348946
* [x86] clean up code for converting 16-bit ops to LEA; NFC | Sanjay Patel | 2018-12-11 | 1 | -0/+3
  As discussed in D55494, we want to extend this to handle 8-bit ops too, but that could be extended further to enable this on 32-bit systems too.

  llvm-svn: 348851
* [x86] don't try to convert add with undef operands to LEA | Sanjay Patel | 2018-12-09 | 1 | -1/+1
  The existing code tries to handle an undef operand while transforming an add to an LEA, but it's incomplete because we will crash on the i16 test with the debug output shown below. It's better to just give up instead. Really, GlobalIsel should have folded these before we could get into trouble.

    # Machine code for function add_undef_i16: NoPHIs, TracksLiveness, Legalized, RegBankSelected, Selected

    bb.0 (%ir-block.0):
      liveins: $edi
      %1:gr32 = COPY killed $edi
      %0:gr16 = COPY %1.sub_16bit:gr32
      %5:gr64_nosp = IMPLICIT_DEF
      %5.sub_16bit:gr64_nosp = COPY %0:gr16
      %6:gr64_nosp = IMPLICIT_DEF
      %6.sub_16bit:gr64_nosp = COPY %2:gr16
      %4:gr32 = LEA64_32r killed %5:gr64_nosp, 1, killed %6:gr64_nosp, 0, $noreg
      %3:gr16 = COPY killed %4.sub_16bit:gr32
      $ax = COPY killed %3:gr16
      RET 0, implicit killed $ax

    # End machine code for function add_undef_i16.

    *** Bad machine code: Reading virtual register without a def ***
    - function:    add_undef_i16
    - basic block: %bb.0 (0x7fe6cd83d940)
    - instruction: %6.sub_16bit:gr64_nosp = COPY %2:gr16
    - operand 1:   %2:gr16
    LLVM ERROR: Found 1 machine code errors.

  Differential Revision: https://reviews.llvm.org/D54710

  llvm-svn: 348722
* [CodeGen][NFC] Make `TII::getMemOpBaseImmOfs` return a base operand | Francis Visoiu Mistrih | 2018-11-28 | 1 | -3/+3
  Currently, instructions doing memory accesses through a base operand that is not a register can not be analyzed using `TII::getMemOpBaseRegImmOfs`.

  This means that functions such as `TII::shouldClusterMemOps` will bail out on instructions using an FI as a base instead of a register.

  The goal of this patch is to refactor all this to return a base operand instead of a base register. Then in a separate patch, I will add FI support to the mem op clustering in the MachineScheduler.

  Differential Revision: https://reviews.llvm.org/D54846

  llvm-svn: 347746
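  After the refactoring, callers receive the base as a MachineOperand, which can be either a register or a frame index. A sketch of the resulting hook shape (the exact name and parameter order are assumptions based on the description above):

    // Returns true if MI is a memory access whose address decomposes into a
    // base operand (register or frame index) plus a constant offset.
    virtual bool getMemOperandWithOffset(const MachineInstr &MI,
                                         const MachineOperand *&BaseOp,
                                         int64_t &Offset,
                                         const TargetRegisterInfo *TRI) const;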
* [TableGen] Refactor macro names (NFC) | Evandro Menezes | 2018-11-27 | 1 | -1/+1
  Make the names for the macros for `TargetInstrInfo` uniform.

  llvm-svn: 347706
* [tblgen][PredicateExpander] Add the ability to describe more complex constraints on instruction operands. | Andrea Di Biagio | 2018-10-31 | 1 | -0/+3
  Before this patch, class PredicateExpander only knew how to expand simple predicates that performed checks on instruction operands. In particular, the new scheduling predicate syntax was not rich enough to express checks like this one:

    Foo(MI->getOperand(0).getImm()) == ExpectedVal;

  Here, the immediate operand value at index zero is passed in input to function Foo, and ExpectedVal is compared against the value returned by function Foo.

  While this predicate pattern doesn't show up in any X86 model, it shows up in other upstream targets. So, being able to support those predicates is fundamental if we want to be able to modernize all the scheduling models upstream.

  With this patch, we allow users to specify if a register/immediate operand value needs to be passed in input to a function as part of the predicate check. Now, register/immediate operand checks all derive from base class CheckOperandBase.

  This patch also changes where TIIPredicate definitions are expanded by the instruction info emitter. Before, definitions were expanded in class XXXGenInstrInfo (where XXX is a target name). With the introduction of this new syntax, we may want to have TIIPredicates expanded directly in XXXInstrInfo. That is because functions used by the new operand predicates may only exist in the derived class (i.e. XXXInstrInfo).

  This patch is a non functional change for the existing scheduling models. In future, we will be able to use this richer syntax to better describe complex scheduling predicates, and expose them to llvm-mca.

  Differential Revision: https://reviews.llvm.org/D53880

  llvm-svn: 345714
* Make TargetInstrInfo::isCopyInstr return true for regular COPY-instructions | Alexander Ivchenko | 2018-08-30 | 1 | -2/+6
  Move all target-dependent checks into the new isCopyInstrImpl method. This change allows us to treat MoveReg-type instructions and generic COPY instructions in the same way.

  Differential Revision: https://reviews.llvm.org/D49913

  llvm-svn: 341072
* [MinGW] [X86] Add stubs for references to data variables that might end up imported from a dll | Martin Storsjo | 2018-08-29 | 1 | -0/+1
  Variables declared with the dllimport attribute are accessed via a stub variable named __imp_<var>. In MinGW configurations, variables that aren't declared with a dllimport attribute might still end up imported from another DLL with runtime pseudo relocs.

  For x86_64, this avoids the risk that the target is out of range for a 32 bit PC relative reference, in case the target DLL is loaded further than 4 GB from the reference. It also avoids having to make the text section writable at runtime when doing the runtime fixups, which makes it worthwhile to do for i386 as well.

  Add stub variables for all dso local data references where a definition of the variable isn't visible within the module, since the DLL data autoimporting might make them imported even though they are marked as dso local within LLVM.

  Don't do this for variables that actually are defined within the same module, since we then know for sure that it actually is dso local.

  Don't do this for references to functions, since there's no need for runtime pseudo relocations for autoimporting them; if a function from a different DLL is called without the appropriate dllimport attribute, the call just gets routed via a thunk instead.

  GCC does something similar since 4.9 (when compiling with -mcmodel=medium or large; from that version, medium is the default code model for x86_64 mingw), but only for x86_64.

  Differential Revision: https://reviews.llvm.org/D51288

  llvm-svn: 340942
* [MI] Change the array of `MachineMemOperand` pointers to be a generically extensible collection of extra info attached to a `MachineInstr`. | Chandler Carruth | 2018-08-16 | 1 | -4/+2
  The primary change here is cleaning up the APIs used for setting and manipulating the `MachineMemOperand` pointer arrays so that we can change how they are allocated.

  Then we introduce an extra info object that uses the trailing object pattern to attach some number of MMOs but also other extra info. The design of this is specifically so that this extra info has a fixed necessary cost (the header tracking what extra info is included) and everything else can be tail allocated. This pattern works especially well with a `BumpPtrAllocator` which we use here.

  I've also added the basic scaffolding for putting interesting pointers into this, namely pre- and post-instruction symbols. These aren't used anywhere yet, they're just there to ensure I've actually gotten the data structure types correct. I'll flesh out support for these in a subsequent patch (MIR dumping, parsing, the works).

  Finally, I've included an optimization where we store any single pointer inline in the `MachineInstr` to avoid the allocation overhead. This is expected to be the overwhelmingly most common case and so should avoid any memory usage growth due to slightly less clever / dense allocation when dealing with >1 MMO. This did require several ergonomic improvements to the `PointerSumType` to reasonably support the various usage models.

  This also has a side effect of freeing up 8 bits within the `MachineInstr` which could be repurposed for something else.

  The suggested direction here came largely from Hal Finkel. I hope it was worth it. ;] It does hopefully clear a path for subsequent extensions w/o nearly as much leg work. Lots of thanks to Reid and Justin for careful reviews and ideas about how to do all of this.

  Differential Revision: https://reviews.llvm.org/D50701

  llvm-svn: 339940
* [MachineOutliner][NFC] Move target frame info into OutlinedFunction | Jessica Paquette | 2018-07-24 | 1 | -2/+2
  Just some gardening here.

  Similar to how we moved call information into Candidates, this moves outlined frame information into OutlinedFunction. This allows us to remove TargetCostInfo entirely.

  Anywhere where we returned a TargetCostInfo struct, we now return an OutlinedFunction. This establishes OutlinedFunctions as more of a general repeated sequence, and Candidates as occurrences of those repeated sequences.

  llvm-svn: 337848
* [MachineOutliner][NFC] Make Candidates own their call information | Jessica Paquette | 2018-07-24 | 1 | -1/+1
  Before this, TCI contained all the call information for each Candidate.

  This moves that information onto the Candidates. As a result, each Candidate can now supply how it ought to be called. Thus, Candidates will be able to, say, call the same function in cheaper ways when possible. This also removes that information from TCI, since it's no longer used there. A follow-up patch for the AArch64 outliner will demonstrate this.

  llvm-svn: 337840
* [MachineOutliner] Fix typo in getOutliningCandidateInfo function name | Yvan Roux | 2018-07-04 | 1 | -1/+1
  getOutlininingCandidateInfo -> getOutliningCandidateInfo

  Differential Revision: https://reviews.llvm.org/D48867

  llvm-svn: 336285
* [X86] Move the memory unfolding table creation into its own class and make it a ManagedStatic. | Craig Topper | 2018-07-01 | 1 | -26/+0
  Also move the static folding tables, their search functions and the new class into new cpp/h files.

  The unfolding table is effectively static data. It's just a different ordering and a subset of the static folding tables. By putting it in a separate ManagedStatic we ensure we only have one copy instead of one per X86InstrInfo object. This way also makes it only get initialized when really needed.

  llvm-svn: 336056
* [X86] Use a std::vector for the memory unfolding table. | Craig Topper | 2018-06-29 | 1 | -3/+20
  Previously we used a DenseMap, which is costly to set up: as the size increases it performs multiple full rehashes and reallocations of the table.

  This patch changes the table to a vector of structs. We now walk the reg->mem tables and push new entries into the mem->reg table for each row not marked TB_NO_REVERSE. Once all the table entries have been created, we sort the vector. Then we can use a binary search for lookups.

  Differential Revision: https://reviews.llvm.org/D48585

  llvm-svn: 335994
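  A self-contained sketch of the sort-then-binary-search scheme described above (struct and function names are illustrative, not the in-tree ones):

    #include <algorithm>
    #include <cstdint>
    #include <vector>

    // One row of the mem->reg unfolding table; rows are kept sorted by MemOp
    // so lookups can use binary search instead of a DenseMap.
    struct MemoryUnfoldTableEntry {
      uint16_t MemOp;  // memory-form opcode (search key)
      uint16_t RegOp;  // register-form opcode to unfold to
      uint16_t Flags;
      bool operator<(const MemoryUnfoldTableEntry &RHS) const {
        return MemOp < RHS.MemOp;
      }
    };

    const MemoryUnfoldTableEntry *
    lookupUnfoldTable(const std::vector<MemoryUnfoldTableEntry> &Table,
                      uint16_t MemOpcode) {
      MemoryUnfoldTableEntry Key{MemOpcode, 0, 0};
      auto I = std::lower_bound(Table.begin(), Table.end(), Key);
      return (I != Table.end() && I->MemOp == MemOpcode) ? &*I : nullptr;
    }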
* [MachineOutliner] Define MachineOutliner support in TargetOptions | Jessica Paquette | 2018-06-28 | 1 | -3/+0
  Targets should be able to define whether or not they support the outliner without the outliner being added to the pass pipeline. Before this, the outliner pass would be added, and ask the target whether or not it supports the outliner.

  After this, it's possible to query the target in TargetPassConfig, before the outliner pass is created. This ensures that passing -enable-machine-outliner will not modify the pass pipeline of any target that does not support it.

  https://reviews.llvm.org/D48683

  llvm-svn: 335887
* [X86] Sort the static memory folding tables by reg opcode. Remove the reg->mem DenseMaps in favor of binary search. | Craig Topper | 2018-06-25 | 1 | -14/+1
  With the static tables sorted we can binary search them directly for reg->mem lookups. This removes 6 DenseMaps that had to be created when X86InstrInfo is constructed.

  We still have one Mem->Reg DenseMap for the reverse direction. This is created just as before by walking the reg->mem arrays to populate it.

  Differential Revision: https://reviews.llvm.org/D48527

  llvm-svn: 335501
* [X86] Block commuting operand 1 of FMA*_Int instructions in findThreeSrcCommutedOpIndices. Remove uncommutable returns from getThreeSrcCommuteCase/getFMA3OpcodeToCommuteOperands. | Craig Topper | 2018-06-25 | 1 | -29/+4
  We should be blocking the operand while we are in the routine that tries to find commutable operand indices. Doing it later means we might have missed out on another valid set of operands we could have commuted.

  The intrinsic case was the only case that could really prevent commuting in getFMA3OpcodeToCommuteOperands. All the other cases in getThreeSrcCommuteCase were not reachable conditions as they were protected by findThreeSrcCommutedOpIndices.

  With that abort case pushed earlier, we can remove all the abort checks and replace with asserts.

  llvm-svn: 335446
* [X86] Use setcc ISD opcode for AVX512 integer comparisons all the way to isel | Craig Topper | 2018-06-20 | 1 | -0/+4
  I don't believe there is any real reason to have separate X86-specific opcodes for vector compares. Setcc has the same behavior; it just uses a different encoding for the condition code.

  I had to change the CondCodeAction for SETLT and SETLE to prevent some transforms from changing SETGT lowering.

  Differential Revision: https://reviews.llvm.org/D43608

  llvm-svn: 335173
* [MachineOutliner] NFC: Remove insertOutlinerPrologue, rename insertOutlinerEpilogue | Jessica Paquette | 2018-06-19 | 1 | -4/+1
  insertOutlinerPrologue was not used by any target, and prologue-esque code was beginning to appear in insertOutlinerEpilogue. Refactor that into one function, buildOutlinedFrame.

  This just removes insertOutlinerPrologue and renames insertOutlinerEpilogue.

  llvm-svn: 335076
* Change TII isCopyInstr way of returning arguments (NFC) | Petar Jovanovic | 2018-06-06 | 1 | -2/+2
  Make TII isCopyInstr() return MachineOperands through pointer to pointer instead of via reference.

  Patch by Nikola Prica.

  Differential Revision: https://reviews.llvm.org/D47364

  llvm-svn: 334105
* [MachineOutliner] NFC - Move intermediate data structures to MachineOutliner.h | Jessica Paquette | 2018-06-04 | 1 | -8/+6
  This is setting up to fix bug 37573 cleanly.

  This moves data structures that are technically both used in some way by the target and the general-purpose outlining algorithm into MachineOutliner.h. In particular, the `Candidate` class is of importance.

  Before, the outliner passed the locations of `Candidates` to the target, which would then make some decisions about the prospective outlined function. This change allows us to just pass `Candidates` along to the target.

  This will allow the target to discard `Candidates` that would be considered unsafe before cost calculation. Thus, we will be able to remove the unsafe candidates described in the bug without resorting to torching the entire prospective function.

  Also, as a side-effect, it makes the outliner a bit cleaner.

  https://bugs.llvm.org/show_bug.cgi?id=37573

  llvm-svn: 333952
* [X86][MIPS][ARM] New machine instruction property 'isMoveReg' | Petar Jovanovic | 2018-05-23 | 1 | -0/+2
  This property is needed in order to follow the movement of values between registers.

  This property is used in TII to implement a method that returns true if a simple copy-like instruction is recognized, along with its source and destination machine operands.

  Patch by Nikola Prica.

  Differential Revision: https://reviews.llvm.org/D45204

  llvm-svn: 333093
* Remove \brief commands from doxygen comments. | Adrian Prantl | 2018-05-01 | 1 | -5/+5
  We've been running doxygen with the autobrief option for a couple of years now. This makes the \brief markers in our comments redundant. Since they are a visual distraction and we don't want to encourage more \brief markers in new code either, this patch removes them all.

  Patch produced by:

    for i in $(git grep -l '\\brief'); do perl -pi -e 's/\\brief //g' $i & done

  Differential Revision: https://reviews.llvm.org/D46290

  llvm-svn: 331272
* [X86] Account for partial stack slot spills (PR30821) | Warren Ristow | 2018-04-24 | 1 | -0/+6
  Previously, _any_ store or load instruction was considered to be operating on a spill if it had a frameindex as an operand, and thus was fair game for optimisations such as "StackSlotColoring". This usually works, except on architectures where spills can be partially restored, for example on X86 where a spilt vector can have a single component loaded (zeroing the rest of the target register). This can be mis-interpreted and the zero extension unsoundly eliminated, see pr30821.

  To avoid this, this commit optionally provides the callers of isLoadFromStackSlot and isStoreToStackSlot with the number of bytes spilt/loaded by the given instruction. Optimisations can then determine that a full spill followed by a partial load (or vice versa), for example, cannot necessarily be commuted.

  Patch by Jeremy Morse!

  Differential Revision: https://reviews.llvm.org/D44782

  llvm-svn: 330778
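  The extra information is exposed as an additional out-parameter giving the access size; roughly sketched below (parameter names are assumptions):

    // As isLoadFromStackSlot/isStoreToStackSlot, but additionally reports
    // how many bytes the instruction actually reads or writes, so callers
    // can tell a full spill apart from a partial reload.
    virtual unsigned isLoadFromStackSlot(const MachineInstr &MI, int &FrameIndex,
                                         unsigned &MemBytes) const;
    virtual unsigned isStoreToStackSlot(const MachineInstr &MI, int &FrameIndex,
                                        unsigned &MemBytes) const;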
* [MachineOutliner] Add `useMachineOutliner` target hook | Jessica Paquette | 2018-04-04 | 1 | -0/+3
  The MachineOutliner has a bunch of target hooks that will call llvm_unreachable if the target doesn't implement them. Therefore, if you enable the outliner on such a target, it'll just crash. It'd be much better if it'd just *not* run the outliner at all in this case.

  This commit adds a hook to TargetInstrInfo that returns false by default. Targets that implement the hook make it return true. The outliner checks the return value of this hook to decide whether or not to continue.

  llvm-svn: 329220
* [x86] Expose more of the condition conversion routines in the public API for X86's instruction information. | Chandler Carruth | 2018-04-01 | 1 | -0/+6
  I've now got a second patch under review that needs these same APIs. This bit is nicely orthogonal and obvious, so landing it. NFC.

  llvm-svn: 328944