path: root/llvm/lib/CodeGen
Commit message | Author | Age | Files | Lines
* [AArch64] -fpatchable-function-entry=N,0: place patch label after BTI | Fangrui Song | 2020-02-03 | 1 | -1/+7
  Summary:
  For -fpatchable-function-entry=N,0 -mbranch-protection=bti, after 9a24488cb67a90f889529987275c5e411ce01dda, we place the NOP sled after the initial BTI.

  ```
  .Lfunc_begin0:
    bti c
    nop
    nop

  .section __patchable_function_entries,"awo",@progbits,f,unique,0
  .p2align 3
  .xword .Lfunc_begin0
  ```

  This patch adds a label after the initial BTI and changes the __patchable_function_entries entry to reference the label:

  ```
  .Lfunc_begin0:
    bti c
  .Lpatch0:
    nop
    nop

  .section __patchable_function_entries,"awo",@progbits,f,unique,0
  .p2align 3
  .xword .Lpatch0
  ```

  This placement is compatible with the resolution in https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92424 .

  A local linkage function whose address is not taken does not need a BTI. Placing the patch label after BTI has the advantage that code does not need to differentiate whether the function has an initial BTI.

  Reviewers: mrutland, nickdesaulniers, nsz, ostannard
  Subscribers: kristof.beyls, hiraditya, llvm-commits
  Tags: #llvm
  Differential Revision: https://reviews.llvm.org/D73680
  (cherry picked from commit 06b8e32d4fd3f634f793e3bc0bc4fdb885e7a3ac)
* Revert "Reland: [DWARF] Allow cross-CU references of subprogram definitions" | Vedant Kumar | 2020-01-29 | 4 | -28/+12
  ... as well as:
  Revert "[DWARF] Defer creating declaration DIEs until we prepare call site info"

  This reverts commit fa4701e1979553c2df61698ac1ac212627630442.
  This reverts commit 79daafc90308787b52a5d3a7586e82acd5e374b3.

  There have been reports of this assert getting hit:
  CalleeDIE && "Could not find DIE for call site entry origin

  (cherry picked from commit 802bec896171997a7b73dde3857712e0eedeabc1)
* [GlobalMerge] Preserve symbol visibility when merging globals | Michael Spang | 2020-01-29 | 1 | -0/+2
  Symbols created for merged external global variables have default visibility. This can break programs when compiling with -Oz -fvisibility=hidden as symbols that should be hidden will be exported at link time.

  Differential Revision: https://reviews.llvm.org/D73235
  (cherry picked from commit a2fb2c0ddca14c133f24d08af4a78b6a3d612ec6)
* Reland "[StackColoring] Remap PseudoSourceValue frame indices via MachineFunction::getPSVManager()" | Fangrui Song | 2020-01-27 | 1 | -7/+9
  Reland 7a8b0b1595e7dc878b48cf9bbaa652087a6895db, with a fix that checks `!E.value().empty()` to avoid inserting a zero into SlotRemap.

  Debugged by rnk@ in https://bugs.chromium.org/p/chromium/issues/detail?id=1045650#c33

  Reviewed By: rnk
  Differential Revision: https://reviews.llvm.org/D73510
  (cherry picked from commit 68051c122440b556e88a946bce12bae58fcfccb4)
  (cherry picked from commit c7c5da6df30141c563e1f5b8ddeabeecdd29e55e)
* [PatchableFunction] Allow empty entry MachineBasicBlock | Fangrui Song | 2020-01-24 | 1 | -3/+8
  Reviewed By: nickdesaulniers
  Differential Revision: https://reviews.llvm.org/D73301
  (cherry picked from commit 50a3ff30e1587235d1830fec9694c1239302ab9f)
* Add function attribute "patchable-function-prefix" to support -fpatchable-function-entry=N,M where M>0 | Fangrui Song | 2020-01-24 | 1 | -4/+29
  Similar to the function attribute `prefix` (prefix data), "patchable-function-prefix" inserts data (M NOPs) before the function entry label.

  -fpatchable-function-entry=2,1 (1 NOP before entry, 1 NOP after entry) will look like:

  ```
    .type foo,@function
  .Ltmp0: # @foo
    nop
  foo:
  .Lfunc_begin0:
    # optional `bti c` (AArch64 Branch Target Identification) or
    # `endbr64` (Intel Indirect Branch Tracking)
    nop

    .section __patchable_function_entries,"awo",@progbits,get,unique,0
    .p2align 3
    .quad .Ltmp0
  ```

  -fpatchable-function-entry=N,0 + -mbranch-protection=bti/-fcf-protection=branch has two reasonable placements (https://gcc.gnu.org/ml/gcc-patches/2020-01/msg01185.html):

  ```
  (a)         (b)

  func:       func:
  .Ltmp0:     bti c
    bti c     .Ltmp0:
    nop       nop
  ```

  (a) needs no additional code. If the consensus is to go for (b), we will need more code in AArch64BranchTargets.cpp / X86IndirectBranchTracking.cpp .

  Differential Revision: https://reviews.llvm.org/D73070
  (cherry picked from commit 22467e259507f5ead2a87d989251b4c951a587e4)
* [AsmPrinter] Don't emit __patchable_function_entries entry if "patchable-function-entry"="0" | Fangrui Song | 2020-01-24 | 1 | -1/+5
  Add and improve tests.

  (cherry picked from commit d232c215669cb57f5eb4ead40a4a336220dbc429)
* [CodeGen] Move fentry-insert, xray-instrumentation and patchable-function before addPreEmitPass() | Fangrui Song | 2020-01-24 | 1 | -6/+6
  The intention is to move patchable-function before aarch64-branch-targets (configured in AArch64PassConfig::addPreEmitPass) so that we emit BTI before NOPs (see https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92424).

  This also allows addPreEmitPass() passes to know the precise instruction sizes if they want.

  Tried x86-64 Debug/Release builds of ccls with -fxray-instrument -fxray-instruction-threshold=1. No output difference between this commit and the previous commit.

  (cherry picked from commit 9a24488cb67a90f889529987275c5e411ce01dda)
* [PGO][PGSO] Update BFI in CodeGenPrepare::optimizeSelectInst. | Hiroshi Yamauchi | 2020-01-23 | 1 | -0/+1
  Summary:
  Without the BFI update, some hot blocks are incorrectly treated as cold code. This fixes an FDO perf regression in the TSVC benchmark from D71288.

  Reviewers: davidxl
  Subscribers: hiraditya, llvm-commits
  Tags: #llvm
  Differential Revision: https://reviews.llvm.org/D73146
  (cherry picked from commit ddbc728828c70728473b47c9f7427aa9514f3d17)
* [StackColoring] Remap FixedStackPseudoSourceValue frame index referenced by MachineMemOperand | Fangrui Song | 2020-01-21 | 1 | -0/+19
  StackColoring::remapInstructions() remaps MachineOperand frame index (e.g. %stack.1 -> %stack.0) but does not remap FixedStackPseudoSourceValue frame index (e.g. store 4 into %stack.1.ap2.i.i) referenced by MachineMemOperand. This can cause an assertion failure when LiveDebugValues references a dead stack object.

  It is difficult to craft a test case. -g, va_copy and stack-coloring are required. I can only reproduce it on ppc32.

  (cherry picked from commit eaab1bf21e1d6803fd217fe6052537fc33b06837)
  (cherry picked from commit 854f7be20a0cb1a95671a16d6cc8200107ee25f4)
  (cherry picked from commit 7a8b0b1595e7dc878b48cf9bbaa652087a6895db)
* RegisterClassInfo::computePSetLimit - assert that we actually find a register. | Simon Pilgrim | 2020-01-15 | 1 | -0/+1
  Fixes "pointer is null" clang static analyzer warning.
* [Scheduler] Adjust interface of CreateTargetMIHazardRecognizer to use ScheduleDAGMI. NFC | David Green | 2020-01-15 | 1 | -3/+3
  All the callers of this function will be ScheduleDAGMI from the MachineScheduler. This allows us to use the extra info available in ScheduleDAGMI without resorting to awkward casts.
* [codegen,amdgpu] Enhance MIR DIE and re-arrange it for AMDGPU. | Michael Liao | 2020-01-14 | 1 | -0/+9
  Summary:
  - `dead-mi-elimination` assumes MIR in the SSA form and cannot be arranged after phi elimination or DeSSA. It's enhanced to handle the dead register definition by skipping use check on it. Once a register def is `dead`, all its uses, if any, should be `undef`.
  - Re-arrange the DIE in RA phase for AMDGPU by placing it directly after `detect-dead-lanes`.
  - Many relevant tests are refined due to different register assignment.

  Reviewers: rampitec, qcolombet, sunfish
  Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, llvm-commits
  Tags: #llvm
  Differential Revision: https://reviews.llvm.org/D72709
* [DAGCombine] Replace `getIntPtrConstant()` with `getVectorIdxTy()`. | Michael Liao | 2020-01-14 | 1 | -1/+2
  - Prefer `getVectorIdxTy()` as the index operand type for `EXTRACT_SUBVECTOR` as targets expect different types by overloading `getVectorIdxTy()`.
* [LegalizeTypes] Remove untested code from ExpandIntOp_UINT_TO_FP | Craig Topper | 2020-01-14 | 1 | -70/+2
  This code is untested in tree because the "APFloat::semanticsPrecision(sem) >= SrcVT.getSizeInBits() - 1" check is false for most combinations for int and fp types except maybe i32 and f64. For that you would need i32 to be an illegal type, but f64 to be legal and have custom handling for legalizing the split sint_to_fp.

  The precision check itself was added in 2010 to fix a double rounding issue in the algorithm that would occur if the sint_to_fp was not able to do the conversion without rounding.

  Differential Revision: https://reviews.llvm.org/D72728
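The precision check quoted in the entry above can be illustrated outside LLVM. The following is only a sketch: `signedIntFitsExactly` is a hypothetical helper, not an LLVM API, and it assumes the host `float` and `double` are IEEE binary32 and binary64 (24 and 53 digits of precision).

```cpp
#include <limits>

// Hypothetical helper, not part of LLVM. Mirrors the shape of the check
//   precision(FP) >= IntBits - 1
// A signed IntBits-bit integer needs at most IntBits - 1 significant bits
// (the most negative value is a power of two, so it is exact as well).
// Written as "precision + 1 >= IntBits" to avoid unsigned underflow at 0.
template <typename FP>
constexpr bool signedIntFitsExactly(unsigned IntBits) {
  return static_cast<unsigned>(std::numeric_limits<FP>::digits) + 1 >= IntBits;
}
```

Under those assumptions the check holds for a 32-bit integer converted to `double` but fails for 64-bit to `double` and for 32-bit to `float`, which matches the message's remark that it is false for most combinations.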
* [MachineScheduler] Reduce reordering due to mem op clustering | Jay Foad | 2020-01-14 | 1 | -0/+2
  Summary:
  Mem op clustering adds a weak edge in the DAG between two loads or stores that should be clustered, but the direction of this edge is pretty arbitrary (it depends on the sort order of MemOpInfo, which represents the operands of a load or store). This often means that two loads or stores will get reordered even if they would naturally have been scheduled together anyway, which leads to test case churn and goes against the scheduler's "do no harm" philosophy.

  The fix makes sure that the direction of the edge always matches the original code order of the instructions.

  Reviewers: atrick, MatzeB, arsenm, rampitec, t.p.northover
  Subscribers: jvesely, wdng, nhaehnle, kristof.beyls, hiraditya, javed.absar, arphaman, llvm-commits
  Tags: #llvm
  Differential Revision: https://reviews.llvm.org/D72706
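The idea of orienting the cluster edge by original code order can be sketched as follows. This is not the scheduler's code: `MemOp`, `ClusterEdge` and `makeClusterEdge` are hypothetical names, and `NodeNum` is assumed to grow with the original instruction order.

```cpp
#include <utility>

// Hypothetical model of one load/store as seen by the clustering mutation.
struct MemOp {
  unsigned NodeNum; // assumed: position in original code order
  long Offset;      // assumed: the key the ops were sorted by
};

struct ClusterEdge {
  unsigned Pred;
  unsigned Succ;
};

// A and B arrive in sort order (by Offset), which may disagree with code
// order. Swap so the weak edge always runs from the earlier instruction to
// the later one.
ClusterEdge makeClusterEdge(MemOp A, MemOp B) {
  if (A.NodeNum > B.NodeNum)
    std::swap(A, B);
  return ClusterEdge{A.NodeNum, B.NodeNum};
}
```

With this orientation the edge never asks the scheduler to place a later instruction ahead of an earlier one just because its offset sorted first.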
* [AIX][XCOFF] Supporting the ReadOnlyWithRel SectionKind | diggerlin | 2020-01-14 | 1 | -1/+4
  SUMMARY:
  This patch places a global variable in a csect whose SectionKind is "ReadOnlyWithRel" into the Data section.

  Reviewers: hubert.reinterpretcast, jasonliu, Xiangling_L
  Subscribers: wuzish, nemanjai, hiraditya
  Differential Revision: https://reviews.llvm.org/D72461
* [FPEnv] Fix chain handling regression after 04a8696 | Ulrich Weigand | 2020-01-14 | 2 | -34/+31
  Code in getRoot made the assumption that every node in PendingLoads must always itself have a dependency on the current DAG root node.

  After the changes in 04a8696, it turns out that this assumption no longer holds true, causing wrong codegen in some cases (e.g. stores after constrained FP intrinsics might get deleted).

  To fix this, we now need to make sure that the TokenFactor created by getRoot always includes the previous root, if there is no implicit dependency already present.

  The original getControlRoot code already has exactly this check, so this patch simply reuses that code now for getRoot as well. This fixes the regression.

  NFC if no constrained FP intrinsic is present.
* Make helper functions static or move them into anonymous namespaces. NFC. | Benjamin Kramer | 2020-01-14 | 1 | -2/+2
* Fix "'MIParser::getIRValue(unsigned int)' defined but not used" warning. NFCI. | Simon Pilgrim | 2020-01-14 | 1 | -6/+0
* [SelectionDAG] ComputeKnownBits - merge getValidMinimumShiftAmountConstant() and generic ISD::SHL handling. | Simon Pilgrim | 2020-01-14 | 1 | -11/+12
  As mentioned by @nikic on rGef5debac4302, we can merge the guaranteed bottom zero bits from the shifted value, and then, if a min shift amount is known, zero out the bottom bits as well.
* [SelectionDAG] ComputeKnownBits - merge getValidMinimumShiftAmountConstant() and generic ISD::SRL handling. | Simon Pilgrim | 2020-01-14 | 1 | -10/+12
  As mentioned by @nikic on rGef5debac4302 (although that was just about SHL), we can merge the guaranteed top zero bits from the shifted value, and then, if a min shift amount is known, zero out the top bits as well.

  SHL tests / handling will be added in a follow up patch.
* [GlobalISel] Change representation of shuffle masks in MachineOperand. | Eli Friedman | 2020-01-13 | 7 | -47/+24
  We're planning to remove the shufflemask operand from ShuffleVectorInst (D72467); fix GlobalISel so it doesn't depend on that Constant.

  The change to prelegalizercombiner-shuffle-vector.mir happens because the input contains a literal "-1" in the mask (so the parser/verifier weren't really handling it properly). We now treat it as equivalent to "undef" in all contexts.

  Differential Revision: https://reviews.llvm.org/D72663
* Revert "[DWARF5][DebugInfo]: Added support for DebugInfo generation for auto return type for C++ member functions." | Amy Huang | 2020-01-13 | 1 | -8/+0
  This reverts commit c958639098a8702b831952b1a1a677ae19190a55, which causes a crash. See https://reviews.llvm.org/D70524 for details.
* [LegalizeIntegerTypes][X86] Add support for expanding input of STRICT_SINT_TO_FP/STRICT_UINT_TO_FP into a libcall. | Craig Topper | 2020-01-13 | 1 | -6/+30
  Needed to support i128->fp128 on 32-bit X86.

  Add full set of strict sint_to_fp/uint_to_fp conversion tests for fp128.
* Rework be15dfa88fb1 such that it works with GlobalISel which doesn't use EVT | Daniel Sanders | 2020-01-13 | 1 | -3/+11
  Summary:
  be15dfa88fb1 broke GlobalISel's usage of getSetCCInverse() which currently appears to be limited to our out-of-tree backend. GlobalISel doesn't use EVTs and isn't able to derive them from the information it has as it doesn't distinguish between integer and floating point types (that distinction is made by operations rather than values).

  Bring back the bool version of getSetCCInverse() in a way that doesn't break the intent of be15dfa88fb1 but also allows GlobalISel to continue using it.

  Reviewers: spatel, bogner, arichardson
  Reviewed By: arichardson
  Subscribers: rovka, hiraditya, Petar.Avramovic, llvm-commits
  Tags: #llvm
  Differential Revision: https://reviews.llvm.org/D72309
* [llvm][MIRVRegNamerUtils] Adding hashing on FrameIndex MachineOperands. | Puyan Lotfi | 2020-01-13 | 1 | -1/+2
  With this patch, multiple instructions that differ only in their FrameIndex MachineOperand values no longer collide. For instance:

    %1:_(p0) = G_FRAME_INDEX %stack.0
    %2:_(p0) = G_FRAME_INDEX %stack.1

  Prior to this patch these instructions would collide.

  Differential Revision: https://reviews.llvm.org/D71583
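Why ignoring the frame index makes the two G_FRAME_INDEX instructions collide can be shown with a toy hash. Everything here is hypothetical: `Operand`, `OperandKind`, `mix` and `hashInstruction` are illustrative names, the mixing step is an FNV-1a style multiply-xor chosen for determinism, and skipping register operands is a simplification rather than a description of what MIRVRegNamerUtils does.

```cpp
#include <cstdint>
#include <vector>

enum class OperandKind { Register, Immediate, FrameIndex };

// Hypothetical, simplified operand model.
struct Operand {
  OperandKind Kind;
  int64_t Value;
};

// FNV-1a style step; for a fixed H, different V always give different results
// because xor and multiplication by an odd constant are both invertible.
uint64_t mix(uint64_t H, uint64_t V) { return (H ^ V) * 1099511628211ULL; }

// HashFrameIndex = false models the behaviour before the patch (frame index
// operands contribute nothing); true models the behaviour after it.
uint64_t hashInstruction(unsigned Opcode, const std::vector<Operand> &Ops,
                         bool HashFrameIndex) {
  uint64_t H = mix(14695981039346656037ULL, Opcode);
  for (const Operand &Op : Ops) {
    switch (Op.Kind) {
    case OperandKind::Immediate:
      H = mix(H, static_cast<uint64_t>(Op.Value));
      break;
    case OperandKind::FrameIndex:
      if (HashFrameIndex)
        H = mix(H, static_cast<uint64_t>(Op.Value));
      break;
    case OperandKind::Register:
      break; // simplification: registers are skipped in this sketch
    }
  }
  return H;
}
```

With `HashFrameIndex` off, an instruction referencing `%stack.0` and one referencing `%stack.1` hash identically; with it on, the two hashes differ.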
* [SelectionDAG] ComputeNumSignBits add getValidMaximumShiftAmountConstant() for ISD::SHL support | Simon Pilgrim | 2020-01-13 | 1 | -0/+31
  Allows us to handle non-uniform SHL shifts to determine the minimum number of sign bits remaining (based off the maximum shift amount value)
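The bound described above follows from simple arithmetic: a left shift by S discards S of the operand's leading sign bits, so the largest possible shift amount gives the smallest guaranteed count. A minimal sketch, with `minSignBitsAfterShl` as a hypothetical name and plain integers standing in for the DAG queries:

```cpp
// OperandSignBits: number of known sign bits of the shifted value (>= 1).
// MaxShiftAmount:  largest shift amount across all vector lanes.
// Returns a lower bound on the sign bits of the result; every value has at
// least one sign bit, so 1 is the fallback when the bound is not useful.
unsigned minSignBitsAfterShl(unsigned OperandSignBits, unsigned MaxShiftAmount) {
  if (MaxShiftAmount < OperandSignBits)
    return OperandSignBits - MaxShiftAmount;
  return 1;
}
```

For example, an operand with 10 known sign bits shifted left by at most 3 keeps at least 7 of them.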
* [LegalizeTypes] Add SoftenFloatResult support for STRICT_SINT_TO_FP/STRICT_UINT_TO_FP | Andrew Wei | 2020-01-14 | 1 | -8/+16
  Some targets with soft-float, such as arm/riscv, crash during compilation when the -fno-unsafe-math-optimization option is used. This patch adds the missing strict FP support to SoftenFloatRes_XINT_TO_FP.

  Differential Revision: https://reviews.llvm.org/D72277
* [SelectionDAG] ComputeNumSignBits add getValidMinimumShiftAmountConstant() ISD::SRA support | Simon Pilgrim | 2020-01-13 | 1 | -1/+4
  Allows us to handle more non-uniform SRA sign bits cases
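For an arithmetic right shift the reasoning runs the other way: each shifted position adds a copy of the sign bit, so the smallest possible shift amount gives the guaranteed gain. A sketch under the same caveats as any illustration here (`minSignBitsAfterSra` is a hypothetical name, plain integers replace the DAG queries):

```cpp
#include <algorithm>

// BitWidth:        width of the value in bits.
// OperandSignBits: number of known sign bits of the shifted value.
// MinShiftAmount:  smallest shift amount across all vector lanes.
// The result gains at least MinShiftAmount sign bits, capped at BitWidth.
unsigned minSignBitsAfterSra(unsigned BitWidth, unsigned OperandSignBits,
                             unsigned MinShiftAmount) {
  return std::min(BitWidth, OperandSignBits + MinShiftAmount);
}
```

A 32-bit value with 5 known sign bits, shifted right arithmetically by at least 3, therefore has at least 8.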
* [Scheduler] Remove superfluous casts. NFC | David Green | 2020-01-13 | 1 | -4/+2
* [SelectionDAG] ComputeNumSignBits - Use getValidShiftAmountConstant for shift opcodes | Simon Pilgrim | 2020-01-13 | 1 | -15/+8
  getValidShiftAmountConstant handles out of bounds shift amounts for us, allowing us to remove the local handling.
* [SelectionDAG] ComputeKnownBits - Add DemandedElts support to getValidShiftAmountConstant/getValidMinimumShiftAmountConstant() | Simon Pilgrim | 2020-01-13 | 1 | -8/+14
* [FPEnv] Fix chain handling for fpexcept.strict nodes | Ulrich Weigand | 2020-01-13 | 2 | -14/+81
  We need to ensure that fpexcept.strict nodes are not optimized away even if the result is unused. To do that, we need to chain them into the block's terminator nodes, like already done for PendingExcepts.

  This patch adds two new lists of pending chains, PendingConstrainedFP and PendingConstrainedFPStrict to hold constrained FP intrinsic nodes without and with fpexcept.strict markers. This allows not only to solve the above problem, but also to relax chains a bit further by no longer flushing all FP nodes before a store or other memory access. (They are still flushed before nodes with other side effects.)

  Reviewed By: craig.topper
  Differential Revision: https://reviews.llvm.org/D72341
* [SelectionDAG] ComputeKnownBits add getValidMinimumShiftAmountConstant() ISD::SHL support | Simon Pilgrim | 2020-01-13 | 1 | -0/+3
  As mentioned on D72573
* [SelectionDAG] ComputeKnownBits - minimum leading/trailing zero bits in LSHR/SHL (PR44526) | Simon Pilgrim | 2020-01-13 | 1 | -0/+11
  As detailed in https://blog.regehr.org/archives/1709 we don't make use of the known leading/trailing zeros for shifted values in cases where we don't know the shift amount value.

  This patch adds support to SelectionDAG::ComputeKnownBits to use KnownBits::countMinTrailingZeros and countMinLeadingZeros to set the minimum guaranteed leading/trailing known zero bits.

  Differential Revision: https://reviews.llvm.org/D72573
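The observation behind this entry is that a left shift can only add trailing zeros and a logical right shift can only add leading zeros, so the operand's minimum counts survive even when the shift amount is unknown. The sketch below uses a hypothetical 32-bit `KnownZero` struct as a stand-in for `llvm::KnownBits`; the helper names echo the LLVM ones but are reimplemented here, and shift amounts are assumed to be smaller than the bit width.

```cpp
#include <cstdint>

// Hypothetical stand-in for llvm::KnownBits on 32-bit values:
// bit i of Zero is set when bit i of the value is known to be 0.
struct KnownZero {
  uint32_t Zero;
};

unsigned countMinTrailingZeros(KnownZero K) {
  unsigned N = 0;
  while (N < 32 && ((K.Zero >> N) & 1u))
    ++N;
  return N;
}

unsigned countMinLeadingZeros(KnownZero K) {
  unsigned N = 0;
  while (N < 32 && ((K.Zero >> (31 - N)) & 1u))
    ++N;
  return N;
}

// Masks with the low / high N bits set, N in [0, 32]. The special cases
// avoid shifting a 32-bit value by 32, which is undefined in C++.
uint32_t lowBits(unsigned N) {
  return N >= 32 ? ~uint32_t(0) : ((uint32_t(1) << N) - 1u);
}
uint32_t highBits(unsigned N) { return N == 0 ? 0u : ~lowBits(32 - N); }

// x << s (s unknown): at least as many trailing zeros as x.
KnownZero shlUnknownAmount(KnownZero Op) {
  return KnownZero{lowBits(countMinTrailingZeros(Op))};
}

// x >> s, logical (s unknown): at least as many leading zeros as x.
KnownZero lshrUnknownAmount(KnownZero Op) {
  return KnownZero{highBits(countMinLeadingZeros(Op))};
}
```

Note that only the contiguous run of known zeros at the relevant end is kept; isolated known-zero bits elsewhere in the operand say nothing about the result once the shift amount is unknown.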
* [DWARF5][DebugInfo]: Added support for DebugInfo generation for auto return type for C++ member functions. | Awanish Pandey | 2020-01-13 | 1 | -0/+8
  Summary:
  This patch provides support for the auto return type for C++ member functions. Before this patch, the return type of the member function was deduced and stored in the DIE.
  This patch includes the llvm side implementation of this feature.

  Patch by: Awanish Pandey <Awanish.Pandey@amd.com>

  Reviewers: dblaikie, aprantl, shafik, alok, SouraVX, jini.susan.george
  Reviewed by: dblaikie
  Differential Revision: https://reviews.llvm.org/D70524
* __patchable_function_entries: don't use linkage field 'unique' with -no-integrated-as | Fangrui Song | 2020-01-12 | 1 | -18/+21
  .section name, "flags"G, @type, GroupName[, linkage]

  As of binutils 2.33, linkage cannot be 'unique'. For integrated assembler, we use both 'o' flag and 'unique' linkage to support --gc-sections and COMDAT with lld.

  https://sourceware.org/ml/binutils/2019-11/msg00266.html
* [NFC] Refactor memory ops cluster method | Qiu Chaofan | 2020-01-12 | 1 | -14/+7
  The current implementation of BaseMemOpsClusterMutation is a little bit obscure. This patch directly uses a map from store chain ID to a set of memory instrs to make it simpler, so that future improvements are easier to read, update and review.

  Reviewed By: evandro
  Differential Revision: https://reviews.llvm.org/D72070
* [LegalizeVectorOps] Parallelize the lo/hi part of STRICT_UINT_TO_FLOAT legalization. | Craig Topper | 2020-01-11 | 1 | -3/+6
  The lo and hi computation are independent. Give them the same input chain and TokenFactor the results together.
* [TargetLowering][X86] Connect the chain from STRICT_FSETCC in TargetLowering::expandFP_TO_UINT and X86TargetLowering::FP_TO_INTHelper. | Craig Topper | 2020-01-11 | 1 | -3/+5
* [LegalizeVectorOps] Expand vector MERGE_VALUES immediately. | Craig Topper | 2020-01-11 | 1 | -0/+11
  Custom legalization can produce MERGE_VALUES to return multiple results. We can expand them immediately instead of leaving them around for DAG combine to clean up.
* [LegalizeVectorOps] Remove some of the simpler Expand methods. Pass Results vector to a couple. NFCI | Craig Topper | 2020-01-11 | 1 | -125/+77
  Some of the simplest handlers just call TLI and if that fails, they fall back to unrolling. For those just inline the TLI call and share the unrolling call with the default case of Expand.

  Pass the Results vector to ExpandFSUB and ExpandBITREVERSE so that it's obvious they sometimes don't return results and want to defer to LegalizeDAG.
* [LegalizeVectorOps] Only pass SDNode* instead of SDValue to all of the Expand* and Promote* methods. | Craig Topper | 2020-01-11 | 1 | -251/+251
  All the Expand* and Promote* functions assume they are being called with result 0 anyway. Just hardcode result 0 into them.
* moveOperands - assert Src/Dst MachineOperands are non-null. | Simon Pilgrim | 2020-01-11 | 1 | -1/+1
  Fixes static-analyzer warnings.
* [TargetLowering][ARM][Mips][WebAssembly] Remove the ordered FP compare from RuntimeLibcalls.def and all associated usages | Craig Topper | 2020-01-10 | 2 | -10/+3
  Summary:
  This always just used the same libcall as unordered, but the comparison predicate was different. This change appears to have been made when targets were given the ability to override the predicates. Before that they were hardcoded into the type legalizer. At that time we never inverted predicates and we handled ugt/ult/uge/ule compares by emitting an unordered check ORed with an ogt/olt/oge/ole check. So only ordered needed an inverted predicate. Later ugt/ult/uge/ule were optimized to only call a single libcall and invert the compare.

  This patch removes the ordered entries and just uses the inverting logic that is now present. This removes some odd things in both the Mips and WebAssembly code.

  Reviewers: efriedma, ABataev, uweigand, cameron.mcinally, kpn
  Reviewed By: efriedma
  Subscribers: dschuff, sdardis, sbc100, arichardson, jgravelle-google, kristof.beyls, hiraditya, aheejin, sunfish, atanasyan, Petar.Avramovic, llvm-commits
  Tags: #llvm
  Differential Revision: https://reviews.llvm.org/D72536
* Let targets adjust operand latency of bundles | Stanislav Mekhanoshin | 2020-01-10 | 1 | -1/+6
  This reverts the AMDGPU DAG mutation implemented in D72487 and gives a more general way of adjusting BUNDLE operand latency.

  It also replaces FixBundleLatencyMutation with adjustSchedDependency callback in the AMDGPU, fixing not only successor latencies but predecessors' as well.

  Differential Revision: https://reviews.llvm.org/D72535
* [TargetLowering] Use SelectionDAG::getSetCC and remove a repeated call to getSetCCResultType in softenSetCCOperands. NFCI | Craig Topper | 2020-01-10 | 1 | -8/+4
* [TargetLowering][ARM][X86] Change softenSetCCOperands handling of ONE to avoid spurious exceptions for QNANs with strict FP quiet compares | Craig Topper | 2020-01-10 | 1 | -10/+9
  ONE is currently softened to OGT | OLT. But the libcalls for OGT and OLT will trigger an exception for QNAN. At least for X86 with libgcc. UEQ on the other hand uses UO | OEQ. The UO and OEQ libcalls will not trigger an exception for QNAN. This patch changes ONE to use the inverse of the UEQ lowering. So we now produce O & UNE.

  Technically the existing behavior was correct for a signalling ONE, but since I don't know how to generate one of those from clang that seemed like something we can deal with later as we would need to fix other predicates as well. Also removing spurious exceptions seemed better than missing an exception.

  There are also problems with quiet OGT/OLT/OLE/OGE, but those are harder to fix.

  Differential Revision: https://reviews.llvm.org/D72477
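That the two formulations of ONE agree on their boolean result can be checked with ordinary C++ comparisons. This sketch says nothing about which libcalls raise floating-point exceptions, which is the actual motivation of the patch; the function names are hypothetical and `double` stands in for the softened type.

```cpp
#include <cmath>

// O: neither operand is a NaN.
bool ordered(double A, double B) { return !std::isnan(A) && !std::isnan(B); }

// UNE: unordered or not equal. C++ `!=` is already true when either
// operand is a NaN.
bool unorderedOrNotEqual(double A, double B) { return A != B; }

// Shape of the old lowering: ONE = OGT | OLT.
bool oneViaOgtOrOlt(double A, double B) {
  return std::isgreater(A, B) || std::isless(A, B);
}

// Shape of the new lowering: ONE = O & UNE (the inverse of UEQ = UO | OEQ).
bool oneViaOrderedAndUne(double A, double B) {
  return ordered(A, B) && unorderedOrNotEqual(A, B);
}
```

Both functions return false when either input is a NaN or when the inputs are equal, and true otherwise.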
* [LegalizeVectorOps] Improve handling of multi-result operations. | Craig Topper | 2020-01-10 | 1 | -173/+271
  This system wasn't very well designed for multi-result nodes. As a consequence they weren't consistently registered in the LegalizedNodes map leading to nodes being revisited for different results.

  I've removed the "Result" variable from the main LegalizeOp method and used a SDNode* instead. The result number from the incoming Op SDValue is only used for deciding which result to return to the caller. When LegalizeOp is called it should always register a legalized result for all of its results. Future calls for any other result should be pulled from the LegalizedNodes map.

  Legal nodes will now register all of their results in the map instead of just the one we were called for.

  The Expand and Promote handling now uses a vector of results similar to LegalizeDAG. Each of the new results is then re-legalized and logged in the LegalizedNodes map for all of the Results for the node being legalized. None of the handlers register their own results now. And none call ReplaceAllUsesOfValueWith now.

  Custom handling now always passes result number 0 to LowerOperation. This matches what LegalizeDAG does. Since the introduction of STRICT nodes, I've encountered several issues with X86's custom handling being called with an SDValue pointing at the chain and our custom handlers using that to get a VT instead of result 0. This should prevent us from having any more of those issues. On return we will update the LegalizedNodes map for all results so we shouldn't call the custom handler again for each result number.

  I want to push SDNode* further into the Expand and Promote handlers, but I've left that for a follow-up to keep this patch size down. I've created a dummy SDValue(Node, 0) to keep the handlers working.

  Differential Revision: https://reviews.llvm.org/D72224
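The caching scheme described above, where one legalization call registers every result of a node so later queries for other results are map lookups, can be sketched with plain containers. This is an illustration only: `Legalizer`, `Value` and the "add 1000 to the node id" legalization are hypothetical stand-ins, not the LegalizeVectorOps code.

```cpp
#include <map>
#include <utility>

// A value is (node id, result number), loosely mirroring SDValue.
using Value = std::pair<int, unsigned>;

class Legalizer {
  std::map<Value, Value> LegalizedNodes;
  int WorkCount = 0; // how many times a node was actually legalized

public:
  // Precondition (assumed): Op.second < NumResults.
  Value legalizeOp(Value Op, unsigned NumResults) {
    auto It = LegalizedNodes.find(Op);
    if (It != LegalizedNodes.end())
      return It->second; // some earlier call already handled this node

    ++WorkCount;
    // Hypothetical legalization: node N becomes node N + 1000. Register ALL
    // results, not just the one requested, so the node is never revisited.
    for (unsigned R = 0; R != NumResults; ++R)
      LegalizedNodes[Value(Op.first, R)] = Value(Op.first + 1000, R);
    return LegalizedNodes[Op];
  }

  int workCount() const { return WorkCount; }
};
```

Asking first for result 1 and then for result 0 of the same node performs the legalization once; the second call is served from the map.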