summaryrefslogtreecommitdiffstats
path: root/llvm/lib/CodeGen
Commit message (Collapse)AuthorAgeFilesLines
* Recommit "[AArch64][SVE] Implement intrinsics for non-temporal loads & stores"Kerry McLaughlin2019-12-131-1/+4
| | | | | | | | | | | | | | Updated pred_load patterns added to AArch64SVEInstrInfo.td by this patch to use reg + imm non-temporal loads to fix previous test failures. Original commit message: Adds the following intrinsics: - llvm.aarch64.sve.ldnt1 - llvm.aarch64.sve.stnt1 This patch creates masked loads and stores with the MONonTemporal flag set when used with the intrinsics above.
* [LiveDebugValues] Omit entry values for DBG_VALUEs with pre-existing expressionsDavid Stenberg2019-12-131-2/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This is a quickfix for PR44275. An assertion that checks that the DIExpression is valid failed due to attempting to create an entry value for an indirect parameter. This started appearing after D69028, as the indirect parameter started being represented using an DW_OP_deref, rather than with the DBG_VALUE's second operand, meaning that the isIndirectDebugValue() check in LiveDebugValues did not exclude such parameters. A DIExpression that has an entry value operation can currently not have any other operation, leading to the failed isValid() check. This patch simply makes us stop considering emitting entry values for such parameters. To support such cases I think we at least need to do the following changes: * In DIExpression::isValid(): Remove the limitation that a DW_OP_LLVM_entry_value operation can be the only operation in a DIExpression. * In LiveDebugValues::emitEntryValues(): Create an entry value of size 1, so that it only wraps the register operand, and not the whole pre-existing expression (the DW_OP_deref). * In LiveDebugValues::removeEntryValue(): Check that the new debug value has the same debug expression as the original, rather than checking that the debug expression is empty. * In DwarfExpression::addMachineRegExpression(): Modify the logic so that a DW_OP_reg* expression is emitted for the entry value. That is how GCC emits entry values for indirect parameters. That will currently not happen to due the DW_OP_deref causing the !HasComplexExpression to fail. The LocationKind needs to be changed also, rather than always emitting a DW_OP_stack_value for entry values. There are probably more things I have missed, but that could hopefully be a good starting point for emitting such entry values. Reviewers: djtodoro, aprantl, jmorse, vsk Reviewed By: aprantl, vsk Subscribers: hiraditya, llvm-commits Tags: #debug-info, #llvm Differential Revision: https://reviews.llvm.org/D71416
* [LegalizeTypes] Remove unnecessary if before calling ReplaceValueWith on the ↵Craig Topper2019-12-131-2/+1
| | | | | | | | | | chain in SoftenFloatRes_LOAD. I believe this is a leftover from when fp128 was softened to fp128 on X86-64. In that case type legalization must have been able to create a load that was the same as N which would make this replacement fail or assert. Since we no longer do that, this check should be unneeded.
* Temporarily revert "NFC: DebugInfo: Refactor RangeSpanList to be a struct, ↵Eric Christopher2019-12-124-11/+19
| | | | | | | | like DebugLocStream::List" as it was causing bot and build failures. This reverts commit 8e04896288d22ed8bef7ac367923374f96b753d6.
* NFC: DebugInfo: Refactor RangeSpanList to be a struct, like DebugLocStream::ListDavid Blaikie2019-12-124-19/+11
| | | | | Move these data structures closer together so their emission code can eventually share more of its implementation.
* NFC: DebugInfo: Refactor debug_loc/loclist emission into a common functionDavid Blaikie2019-12-122-21/+15
| | | | | | | | | (except for v4 loclists, which are sufficiently different to not fit well in this generic implementation) In subsequent patches I intend to refactor the DebugLoc and ranges data structures to be more similar so I can common more of the implementation here.
* hwasan: add tag_offset DWARF attribute to optimized debug infoEvgenii Stepanov2019-12-125-1/+32
| | | | | | | | | | | | | | | Summary: Support alloca-referencing dbg.value in hwasan instrumentation. Update AsmPrinter to emit DW_AT_LLVM_tag_offset when location is in loclist format. Reviewers: pcc Subscribers: srhines, aprantl, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70753
* Revert "[DAGCombiner] fold shift-trunc-shift to shift-mask-trunc"Sanjay Patel2019-12-121-12/+0
| | | | | This reverts commit 8963332c3327daa652ba3e26d35f9109b6991985. There was a logic bug typo in this code, but it wasn't visible in the asm for the tests.
* [DAGCombiner] fold shift-trunc-shift to shift-mask-truncSanjay Patel2019-12-121-0/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | This fold is done in IR by instcombine, and we have a special form of it already here in DAGCombiner, but we want the more general transform too: https://rise4fun.com/Alive/3jZm Name: general Pre: (C1 + zext(C2) < 64) %s = lshr i64 %x, C1 %t = trunc i64 %s to i16 %r = lshr i16 %t, C2 => %s2 = lshr i64 %x, C1 + zext(C2) %a = and i64 %s2, zext((1 << (16 - C2)) - 1) %r = trunc %a to i16 Name: special Pre: C1 == 48 %s = lshr i64 %x, C1 %t = trunc i64 %s to i16 %r = lshr i16 %t, C2 => %s2 = lshr i64 %x, C1 + zext(C2) %r = trunc %s2 to i16 ...because D58017 exposes a regression without this fold.
* [DAGCombiner] improve readabilitySanjay Patel2019-12-121-11/+11
| | | | | | | | | | | This is not quite NFC because I changed the SDLoc to use the more standard 'N' (the starting node for the fold). This transform is a special-case of a more general fold that we do in IR, but it seems like the general fold is needed here too to avoid a potential regression seen in D58017. https://rise4fun.com/Alive/3jZm
* [DebugInfo] Prevent invalid fragments at ISel from dropping debug infostozer2019-12-121-1/+7
| | | | | | | | | | | | | During SelectionDAG, if a value which is associated with a DBG_VALUE needs to be split across multiple registers, the DBG_VALUE will be split into a set of fragment expressions to recreate the original value. If one or more of these fragments cannot be created, they would previously be silently dropped, causing the old debug value to live past its expiry date. This patch fixes this issue by keeping invalid fragments while setting their value as Undef. Differential revision: https://reviews.llvm.org/D70248
* Temporarily Revert "[DataLayout] Fix occurrences that size and range of ↵Nicola Zaghen2019-12-121-2/+2
| | | | | | | | | pointers are assumed to be the same." This reverts commit 5f6208778ff92567c57d7c1e2e740c284d7e69a5. This caused failures in Transforms/PhaseOrdering/scev-custom-dl.ll const: Assertion `getBitWidth() == CR.getBitWidth() && "ConstantRange types don't agree!"' failed.
* [DataLayout] Fix occurrences that size and range of pointers are assumed to ↵Nicola Zaghen2019-12-121-2/+2
| | | | | | | | | | | | be the same. GEP index size can be specified in the DataLayout, introduced in D42123. However, there were still places in which getIndexSizeInBits was used interchangeably with getPointerSizeInBits. This notably caused issues with Instcombine's visitPtrToInt; but the unit tests was incorrect, so this remained undiscovered. Differential Revision: https://reviews.llvm.org/D68328 Patch by Joseph Faulls!
* [NFC][llvm][MIRVRegNamerUtils] Moving methods around. Making some private.Puyan Lotfi2019-12-122-13/+9
| | | | | Making all externally unused methods private in MIRVRegNamerUtils.h. Moving or deleting a couple other methods around.
* [llvm][MIRVRegNamerUtils] Adding hashing on memoperands.Puyan Lotfi2019-12-111-0/+11
| | | | | | | | No more hash collisions for memoperands. Now the MIRCanonicalization pass shouldn't hit hash collisions when dealing with nearly identical memory accessing instructions when their memoperands are in fact different. Differential Revision: https://reviews.llvm.org/D71328
* [IR] Split out target specific intrinsic enums into separate headersReid Kleckner2019-12-115-1/+8
| | | | | | | | | | | | | | | | | | | | This has two main effects: - Optimizes debug info size by saving 221.86 MB of obj file size in a Windows optimized+debug build of 'all'. This is 3.03% of 7,332.7MB of object file size. - Incremental step towards decoupling target intrinsics. The enums are still compact, so adding and removing a single target-specific intrinsic will trigger a rebuild of all of LLVM. Assigning distinct target id spaces is potential future work. Part of PR34259 Reviewers: efriedma, echristo, MaskRay Reviewed By: echristo, MaskRay Differential Revision: https://reviews.llvm.org/D71320
* Revert "[DWARF] Allow cross-CU references of subprogram definitions"Vedant Kumar2019-12-114-26/+7
| | | | | | | | | This reverts commit 30038da15b18ac4e34b9ea7a648382ae481e4770. It causes the stage2 thinLTO bot to fail with: Assertion failed: (CU.getDIE(CalleeSP) && "Expected declaration subprogram DIE for callee") rdar://57840415
* Revert "[SDAG] remove use restriction in isNegatibleForFree() when called ↵Sanjay Patel2019-12-111-19/+16
| | | | | | | from getNegatedExpression()" This reverts commit d1f0bdf2d2df9bdf11ee2ddfff3df50e53f2f042. The patch can cause infinite loops in DAGCombiner.
* [LegalizeTypes] In SoftenFloatRes_FP_EXTEND, move the check for input ↵Craig Topper2019-12-111-9/+8
| | | | | | | | | already being promoted above the check for fp16 converting to something other than fp32. The fp16 to larger than fp32 inserts an extend that need to re-legalized if fp16 is promoted. But if we check for fp16 promotion first, then we can avoid emiting the fp_extend all together.
* [SDAG] remove use restriction in isNegatibleForFree() when called from ↵Sanjay Patel2019-12-111-16/+19
| | | | | | | | | | | | | | | | | | | | | | | | | | getNegatedExpression() This is an alternate fix for the bug discussed in D70595. This also includes minimal tests for other in-tree targets to show the problem more generally. We check the number of uses as a predicate for whether some value is free to negate, but that use count can change as we rewrite the expression in getNegatedExpression(). So something that was marked free to negate during the cost evaluation phase becomes not free to negate during the rewrite phase (or the inverse - something that was not free becomes free). This can lead to a crash/assert because we expect that everything in an expression that is negatible to be handled in the corresponding code within getNegatedExpression(). This patch skips the use check during the rewrite phase. So we determine that some expression isNegatibleForFree (identically to without this patch), but during the rewrite, don't rely on use counts to decide how to create the optimal expression. Differential Revision: https://reviews.llvm.org/D70975
* Revert "[AArch64][SVE] Implement intrinsics for non-temporal loads & stores"Kerry McLaughlin2019-12-111-4/+1
| | | | | | | | This reverts commit 3f5bf35f868d1e33cd02a5825d33ed4675be8cb1 as it was causing build failures in llvm-clang-x86_64-expensive-checks: http://lab.llvm.org:8011/builders/llvm-clang-x86_64-expensive-checks-debian/builds/392 http://lab.llvm.org:8011/builders/llvm-clang-x86_64-expensive-checks-ubuntu/builds/1045
* [AArch64][SVE] Implement intrinsics for non-temporal loads & storesKerry McLaughlin2019-12-111-1/+4
| | | | | | | | | | | | | | | | | | | | Summary: Adds the following intrinsics: - llvm.aarch64.sve.ldnt1 - llvm.aarch64.sve.stnt1 This patch creates masked loads and stores with the MONonTemporal flag set when used with the intrinsics above. Reviewers: sdesmalen, paulwalker-arm, dancgr, mgudim, efriedma, rengolin Reviewed By: efriedma Subscribers: tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, cfe-commits, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71000
* [ARM][LowOverheadLoops] Remove dead loop update instructions.Sjoerd Meijer2019-12-112-1/+37
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | After creating a low-overhead loop, the loop update instruction was still lingering around hurting performance. This removes dead loop update instructions, which in our case are mostly SUBS instructions. To support this, some helper functions were added to MachineLoopUtils and ReachingDefAnalysis to analyse live-ins of loop exit blocks and find uses before a particular loop instruction, respectively. This is a first version that removes a SUBS instruction when there are no other uses inside and outside the loop block, but there are some more interesting cases in test/CodeGen/Thumb2/LowOverheadLoops/mve-tail-data-types.ll which shows that there is room for improvement. For example, we can't handle this case yet: .. dlstp.32 lr, r2 .LBB0_1: mov r3, r2 subs r2, #4 vldrh.u32 q2, [r1], #8 vmov q1, q0 vmla.u32 q0, q2, r0 letp lr, .LBB0_1 @ %bb.2: vctp.32 r3 .. which is a lot more tricky because r2 is not only used by the subs, but also by the mov to r3, which is used outside the low-overhead loop by the vctp instruction, and that requires a bit of a different approach, and I will follow up on this. Differential Revision: https://reviews.llvm.org/D71007
* [ARM][TypePromotion] Enable by defaultSam Parker2019-12-111-3/+21
| | | | | | | | | Enable the TypePromotion pass my default (again). This patch was originally committed in 393dacacf7e7. This patch was reverted in a38396939c54. Differential Revision: https://reviews.llvm.org/D70998
* [PowerPC] [CodeGen] Use MachineBranchProbabilityInfo in EarlyIfPredicator to ↵shkzhang2019-12-111-3/+7
| | | | | | | | | | | | | | | | | | | avoid the potential bug Summary: In the function `EarlyIfPredicator::shouldConvertIf()`, we call `TII->isProfitableToIfCvt()` with `BranchProbability::getUnknown()`, it may cause the potential assertion error for those hook which use `BranchProbability` in `isProfitableToIfCvt()`, for example `SystemZ`. `SystemZ` use `Probability < BranchProbability(1, 8))` in the function `SystemZInstrInfo::isProfitableToIfCvt()`, if we call this function with `BranchProbability::getUnknown()`, it will cause assertion error. This patch is to fix the potential bug. Reviewed By: ThomasRaoux Differential Revision: https://reviews.llvm.org/D71273
* [LiveRegUnits] Add phys_regs_and_masks iterator range (NFC).Florian Hahn2019-12-112-42/+31
| | | | | | | | | | | This iterator range just includes physical registers and register masks, which are interesting when dealing with register liveness. Reviewers: evandro, t.p.northover, paquette, MatzeB, arsenm Reviewed By: paquette Differential Revision: https://reviews.llvm.org/D70562
* [LegalizeTypes] Remove manual worklist management from SoftenFloatRes_FP_EXTEND.Craig Topper2019-12-103-9/+2
| | | | | | I think this is no longer needed. The system should take care of legalizing any new nodes that are added. I think this might have been needed prior to r371709 or r307053.
* Revert "[DebugInfo] Refactored macro related generation, added a test case ↵Nico Weber2019-12-102-9/+19
| | | | | | | | | | | | | | | | for macinfo.dwo emission." This reverts commit 307f60a1a3ff04313a75e2fc11bc14df4fc2ffb8. DebugInfo/X86/debug-macinfo-split-dwarf.ll fails on Windows: Command Output (stdout): -- $ ":" "RUN: at line 1" $ "c:\src\llvm-project\out\gn\bin\llc.exe" "-mtriple=x86_64-pc-windows-gnu" "-O0" "-split-dwarf-file=foo.dwo" "-filetype=obj" Assertion failed: Section && "Cannot switch to a null section!", file ../../llvm/lib/MC/MCStreamer.cpp, line 1103 Stack dump: 0. Program arguments: c:\src\llvm-project\out\gn\bin\llc.exe -mtriple=x86_64-pc-windows-gnu -O0 -split-dwarf-file=foo.dwo -filetype=obj
* [llvm][MIRVRegNamerUtil] Adding hashing against MachineInstr flags.Puyan Lotfi2019-12-101-1/+1
| | | | | | | | Now, flags will result in differing hashes for a given MI. In effect, if you have two instructions with everything identical except for their flags then you should get two different hashes and fewer collisions. Differential Revision: https://reviews.llvm.org/D70479
* [FPEnv][X86] Constrained FCmp intrinsics enabling on X86Wang, Pengfei2019-12-111-20/+41
| | | | | | | | | | | | Summary: This is a follow up of D69281, it enables the X86 backend support for the FP comparision. Reviewers: uweigand, kpn, craig.topper, RKSimon, cameron.mcinally, andrew.w.kaylor Subscribers: hiraditya, llvm-commits, annita.zhang, LuoYuanke, LiuChen3 Tags: #llvm Differential Revision: https://reviews.llvm.org/D70582
* DebugInfo: Clarify some more reasons v4 loc.dwo can't share much ↵David Blaikie2019-12-101-0/+2
| | | | implementation with loclists.dwo
* [DWARF] Allow cross-CU references of subprogram definitionsVedant Kumar2019-12-104-7/+26
| | | | | | | | | | | | | | | | | | | | | | | | | This allows a call site tag in CU A to reference a callee DIE in CU B without resorting to creating an incomplete duplicate DIE for the callee inside of CU A. We already allow cross-CU references of subprogram declarations, so it doesn't seem like definitions ought to be special. This improves entry value evaluation and tail call frame synthesis in the LTO setting. During LTO, it's common for cross-module inlining to produce a call in some CU A where the callee resides in a different CU, and there is no declaration subprogram for the callee anywhere. In this case llvm would (unnecessarily, I think) emit an empty DW_TAG_subprogram in order to fill in the call site tag. That empty 'definition' defeats entry value evaluation etc., because the debugger can't figure out what it means. As a follow-up, maybe we could add a DWARF verifier check that a DW_TAG_subprogram at least has a DW_AT_name attribute. rdar://46577651 Differential Revision: https://reviews.llvm.org/D70350
* [DebugInfo] Refactored macro related generation, added a test case for ↵Sourabh Singh Tomar2019-12-112-19/+9
| | | | | | | | | | macinfo.dwo emission. Reviewers: dblaikie, aprantl, jini.susan.george Tags: #debug-info #llvm Differential Revision: https://reviews.llvm.org/D71008
* Recommit "[DWARF5] Start emitting DW_AT_dwo_name when -gdwarf-5 is specified."Sourabh Singh Tomar2019-12-111-5/+11
| | | | | | | | Reviewers: dblaikie, aprantl, probinson Tags: #debug-info #llvm Differential Revision: https://reviews.llvm.org/D71185
* Revert "[DWARF5] Start emitting DW_AT_dwo_name when -gdwarf-5 is specified."Sourabh Singh Tomar2019-12-111-11/+5
| | | | | This reverts commit 6ef01588f4d75ef43da4ed2a37ba7a8b8daab259. Missing Differetial revision.
* [DWARF5] Start emitting DW_AT_dwo_name when -gdwarf-5 is specified.Sourabh Singh Tomar2019-12-111-5/+11
|
* Revert 30e8f80fd5a4 "[DebugInfo] Don't create multiple DBG_VALUEs when sinking"Hans Wennborg2019-12-101-74/+10
| | | | | | | | | | | | | | | | | | | | This caused non-determinism in the compiler, see command on the Phabricator code review. > This patch addresses a performance problem reported in PR43855, and > present in the reapplication in in 001574938e5. It turns out that > MachineSink will (often) move instructions to the first block that > post-dominates the current block, and then try to sink further. This > means if we have a lot of conditionals, we can needlessly create large > numbers of DBG_VALUEs, one in each block the sunk instruction passes > through. > > To fix this, rather than immediately sinking DBG_VALUEs, record them in > a pass structure. When sinking is complete and instructions won't be > sunk any further, new DBG_VALUEs are added, avoiding lots of > intermediate DBG_VALUE $noregs being created. > > Differential revision: https://reviews.llvm.org/D70676
* [TypePromotion] Query target register widthSam Parker2019-12-101-2/+13
| | | | | | | | | | | | TargetLoweringInfo may report that an integer should be promoted, but it maybe provide a size that isn't natively supported by the target register file... So check this before trying to perform a promotion. This is to fix some chromium issues: https://bugs.chromium.org/p/chromium/issues/detail?id=1031978 https://bugs.chromium.org/p/chromium/issues/detail?id=1031979 Differential Revision: https://reviews.llvm.org/D71200
* [AArch64] Fix issues with large arrays on stackKiran Chandramohan2019-12-102-7/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This patch fixes a few issues when large arrays are allocated on the stack. Currently, clang has inconsistent behaviour, for debug builds there is an assertion failure when the array size on stack is around 2GB but there is no assertion when the stack is around 8GB. For release builds there is no assertion, the compilation succeeds but generates incorrect code. The incorrect code generated is due to using int/unsigned int instead of their 64-bit counterparts. This patch, 1) Removes the assertion in frame legality check. 2) Converts int/unsigned int in some places to the 64-bit variants. This helps in generating correct code and removes the inconsistent behaviour. 3) Adds a test which runs without optimisations. Reviewers: sdesmalen, efriedma, fhahn, aemerson Reviewed By: efriedma Subscribers: eli.friedman, fpetrogalli, kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70496
* [LegalizeTypes] Bugfixes for big-endian targets when handling BITCASTsMikael Holmen2019-12-102-7/+33
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This fixes PR44135. The special case when we promote a bitcast from a vector to an int needs special handling when we are on a big-endian target. Prior to this fix, for the added vec_to_int we see the following in the SelectionDAG printouts Type-legalized selection DAG: %bb.1 'foo:bb.1' SelectionDAG has 9 nodes: t0: ch = EntryToken t2: v8i16,ch = CopyFromReg t0, Register:v8i16 %0 t17: v4i32 = bitcast t2 t23: i32 = extract_vector_elt t17, Constant:i32<3> t8: ch,glue = CopyToReg t0, Register:i32 $r0, t23 t9: ch = ARMISD::RET_FLAG t8, Register:i32 $r0, t8:1 and I think here the extract_vector_elt is wrong and extracts the value from the wrong index. The program program should return the 32 bits made up of the elements at index 4 and 5 in the vec6 array, but with t23: i32 = extract_vector_elt t17, Constant:i32<3> as far as I can tell, we will extract values that originally didn't even exist in the vec6 vectore. If we would instead extract the element at index 2 we would get the wanted values. With this fix we insert a right shift after the bitcast in DAGTypeLegalizer::PromoteIntRes_BITCAST which then gives us Type-legalized selection DAG: %bb.1 'vec_to_int:bb.1' SelectionDAG has 9 nodes: t0: ch = EntryToken t2: v8i16,ch = CopyFromReg t0, Register:v8i16 %0 t23: v4i32 = bitcast t2 t27: i32 = extract_vector_elt t23, Constant:i32<2> t8: ch,glue = CopyToReg t0, Register:i32 $r0, t27 t9: ch = ARMISD::RET_FLAG t8, Register:i32 $r0, t8:1 So now we get t27: i32 = extract_vector_elt t23, Constant:i32<2> which is what we want. Similarly, the new int_to_vec testcase exposes a bug where we cast the other direction. Then we instead need to add a left shift before the bitcast on big-endian targets for the bits in the input integer to end up at the exptected place in the vector. Reviewers: bogner, spatel, craig.topper, t.p.northover, dmgreen, efriedma, SjoerdMeijer, samparker Reviewed By: efriedma Subscribers: eli.friedman, bjope, kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70942
* [NFCi][llvm][MIRVRegNamerUtils] Making some code cleanup and stylistic changes.Puyan Lotfi2019-12-092-74/+44
| | | | | | | | | | | | | | | | | | Making some changes to MIRVRegNamerUtils.cpp to use some more modern c++ features as well as some changes to generally make the code more concise and more understandable. I make this an NFCi because in one case I drop the whole "if (!MO->isDef()) MO->setIsKill(false);" thing that was added in the original implementation, generally because I don't think this is really semantically sound. I also changed up the implementation of VRegRenamer::createVirtualRegisterWithLowerName somewhat because I am now lower-casing the name unconditionally because I confirmed that that was in fact aditya_nandakumar@apple.com's intent. In all other cases, behavior should not be changed. Differential Revision: https://reviews.llvm.org/D71182
* [MC] Delete MCCodePadderFangrui Song2019-12-091-24/+1
| | | | | | | | | | | | | | | | | | | | | D34393 added MCCodePadder as an infrastructure for padding code with NOP instructions. It lacked tests and was not being worked on since then. Intel has now worked on an assembler patch to mitigate performance loss after applying microcode update for the Jump Conditional Code Erratum. https://www.intel.com/content/www/us/en/support/articles/000055650/processors.html This new patch shares similarity with MCCodePadder, but has a concrete use case in mind and is being actively developed. The infrastructure it introduces can potentially be used for general performance improvement via alignment. Delete the unused MCCodePadder so that people can develop the new feature from a clean state. Reviewed By: jyknight, skan Differential Revision: https://reviews.llvm.org/D71106
* [NFC][MacroFusion] Adding the assertion if someone want to fuse more than 2 ↵QingShan Zhang2019-12-101-0/+8
| | | | | | | | | | instructions As discussed in https://reviews.llvm.org/D69998, we miss to create some dependency edges if chained more than 2 instructions. Adding an assertion here if someone want to chain more than 2 instructions. Differential Revision: https://reviews.llvm.org/D71180
* [PGO][PGSO] Instrument the code gen / target passes.Hiroshi Yamauchi2019-12-0911-57/+184
| | | | | | | | | | | | | | | | | | | | Summary: Split off of D67120. Add the profile guided size optimization instrumentation / queries in the code gen or target passes. This doesn't enable the size optimizations in those passes yet as they are currently disabled in shouldOptimizeForSize (for non-IR pass queries). A second try after reverted D71072. Reviewers: davidxl Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71149
* [ModuloSchedule] Fix data types in ModuloScheduleExpander::isLoopCarriedThomas Raoux2019-12-091-2/+2
| | | | | | | | | The cycle values in modulo scheduling results can be negative. The result of ModuloSchedule::getCycle() must be received as an int type. Patch by Masaki Arai! Differential Revision: https://reviews.llvm.org/D71122
* [DebugInfo] Nerf placeDbgValues, with prejudiceJeremy Morse2019-12-091-27/+31
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | CodeGenPrepare::placeDebugValues moves variable location intrinsics to be immediately after the Value they refer to. This makes tracking of locations very easy; but it changes the order in which assignments appear to the debugger, from the source programs order to the order in which the optimised program computes values. This then leads to PR43986 and PR38754, where variable locations that were in a conditional block are made unconditional, which is highly misleading. This patch adjusts placeDbgValues to only re-order variable location intrinsics if they use a Value before it is defined, significantly reducing the damage that it does. This is still not 100% safe, but the rest of CodeGenPrepare needs polishing to correctly update debug info when optimisations are performed to fully fix this. This will probably break downstream debuginfo tests -- if the instruction-stream position of variable location changes isn't the focus of the test, an easy fix should be to manually apply placeDbgValues' behaviour to the failing tests, moving dbg.value intrinsics next to SSA variable definitions thus: %foo = inst1 %bar = ... %baz = ... void call @llvm.dbg.value(metadata i32 %foo, ... to %foo = inst1 void call @llvm.dbg.value(metadata i32 %foo, ... %bar = ... %baz = ... This should return your test to exercising whatever it was testing before. Differential Revision: https://reviews.llvm.org/D58453
* [DebugInfo] Make describeLoadedValue() reg awareDavid Stenberg2019-12-092-38/+54
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Currently the describeLoadedValue() hook is assumed to describe the value of the instruction's first explicit define. The hook will not be called for instructions with more than one explicit define. This commit adds a register parameter to the describeLoadedValue() hook, and invokes the hook for all registers in the worklist. This will allow us to for example describe instructions which produce more than two parameters' values; e.g. Hexagon's various combine instructions. This also fixes situations in our downstream target where we may pass smaller parameters in the high part of a register. If such a parameter's value is produced by a larger copy instruction, we can't describe the call site value using the super-register, and we instead need to know which sub-register that should be used. This also allows us to handle cases like this: $ebx = [...] $rdi = MOVSX64rr32 $ebx $esi = MOV32rr $edi CALL64pcrel32 @call The hook will first be invoked for the MOV32rr instruction, which will say that @call's second parameter (passed in $esi) is described by $edi. As $edi is not preserved it will be added to the worklist. When we get to the MOVSX64rr32 instruction, we need to describe two values; the sign-extended value of $ebx -> $rdi for the first parameter, and $ebx -> $edi for the second parameter, which is now possible. This commit modifies the dbgcall-site-lea-interpretation.mir test case. In the test case, the values of some 32-bit parameters were produced with LEA64r. Perhaps we can in general cases handle such by emitting expressions that AND out the lower 32-bits, but I have not been able to land in a case where a LEA64r is used for a 32-bit parameter instead of LEA64_32 from C code. I have not found a case where it would be useful to describe parameters using implicit defines, so in this patch the hook is still only invoked for explicit defines of forwarding registers. Reviewers: djtodoro, NikolaPrica, aprantl, vsk Reviewed By: djtodoro, vsk Subscribers: ormris, hiraditya, llvm-commits Tags: #debug-info, #llvm Differential Revision: https://reviews.llvm.org/D70431
* Revert "[DebugInfo] Make describeLoadedValue() reg aware"David Stenberg2019-12-092-54/+38
| | | | | This reverts commit 3cd93a4efcdeabeb20cb7bec9fbddcb540d337a1. I'll recommit with a well-formatted arcanist commit message.
* [DebugInfo] Make describeLoadedValue() reg awareDavid Stenberg2019-12-092-38/+54
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently the describeLoadedValue() hook is assumed to describe the value of the instruction's first explicit define. The hook will not be called for instructions with more than one explicit define. This commit adds a register parameter to the describeLoadedValue() hook, and invokes the hook for all registers in the worklist. This will allow us to for example describe instructions which produce more than two parameters' values; e.g. Hexagon's various combine instructions. This also fixes a case in our downstream target where we may pass smaller parameters in the high part of a register. If such a parameter's value is produced by a larger copy instruction, we can't describe the call site value using the super-register, and we instead need to know which sub-register that should be used. This also allows us to handle cases like this: $ebx = [...] $rdi = MOVSX64rr32 $ebx $esi = MOV32rr $edi CALL64pcrel32 @call The hook will first be invoked for the MOV32rr instruction, which will say that @call's second parameter (passed in $esi) is described by $edi. As $edi is not preserved it will be added to the worklist. When we get to the MOVSX64rr32 instruction, we need to describe two values; the sign-extended value of $ebx -> $rdi for the first parameter, and $ebx -> $edi for the second parameter, which is now possible. This commit modifies the dbgcall-site-lea-interpretation.mir test case. In the test case, the values of some 32-bit parameters were produced with LEA64r. Perhaps we can in general cases handle such by emitting expressions that AND out the lower 32-bits, but I have not been able to land in a case where a LEA64r is used for a 32-bit parameter instead of LEA64_32 from C code. I have not found a case where it would be useful to describe parameters using implicit defines, so in this patch the hook is still only invoked for explicit defines of forwarding registers.
* Revert 393dacacf7e7 "[ARM] Enable TypePromotion by default"Hans Wennborg2019-12-091-21/+3
| | | | | | | | | | | | This caused "Too many bits for uint64_t" asserts when building Chromium. See https://crbug.com/1031978#c2 for a reproducer. I'll follow up on the llvm-commits thread with a creduced version. > ARMCodeGenPrepare has already been generalized and renamed to > TypePromotion. We've had it enabled and tested downstream for a > while, so enable it by default. > > Differential Revision: https://reviews.llvm.org/D70998
OpenPOWER on IntegriCloud