summaryrefslogtreecommitdiffstats
path: root/llvm/lib
Commit message (Collapse)AuthorAgeFilesLines
...
* [CodeGen] [SelectionDAG] More efficient code for X % C == 0 (UREM case) (try 2)Roman Lebedev2019-06-271-0/+107
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: I'm submitting a new revision since i don't understand how to reclaim/reopen/take over the existing one, D50222. There is no such action in "Add Action" menu... Original patch D50222 by @hermord (Dmytro Shynkevych) This implements an optimization described in Hacker's Delight 10-17: when `C` is constant, the result of `X % C == 0` can be computed more cheaply without actually calculating the remainder. The motivation is discussed here: https://bugs.llvm.org/show_bug.cgi?id=35479. Original patch author: @hermord (Dmytro Shynkevych)! Notes: - In principle, it's possible to also handle the `X % C1 == C2` case, as discussed on bugzilla. This seems to require an extra branch on overflow, so I refrained from implementing this for now. - An explicit check for when the `REM` can be reduced to just its LHS is included: the `X % C` == 0 optimization breaks `test1` in `test/CodeGen/X86/jump_sign.ll` otherwise. I hadn't managed to find a better way to not generate worse output in this case. - The `test/CodeGen/X86/jump_sign.ll` regresses, and is being fixed by a followup patch D63390. Reviewers: RKSimon, craig.topper, spatel, hermord, xbolva00 Reviewed By: RKSimon, xbolva00 Subscribers: xbolva00, javed.absar, llvm-commits, hermord Tags: #llvm Differential Revision: https://reviews.llvm.org/D63391 llvm-svn: 364563
* [X86] getTargetVShiftByConstNode - reduce variable scope. NFCI.Simon Pilgrim2019-06-271-8/+7
| | | | | | Fixes cppcheck warning. llvm-svn: 364561
* [ARM] Fix formatting issue in ARMISelLowering.cppSam Tebbs2019-06-271-1/+2
| | | | | | | Fix a formatting error in ARMISelLowering.cpp::Expand64BitShift. My test commit after receiving write access. llvm-svn: 364560
* Recommit [PowerPC] Update P9 vector costs for insert/extract elementRoland Froese2019-06-271-0/+29
| | | | | | Recommit patch D60160 after regression fix patch D63463. llvm-svn: 364557
* [Attr] Add "willreturn" function attributeJohannes Doerfert2019-06-278-0/+13
| | | | | | | | | | | | | | | | | | | | | | This patch introduces a new function attribute, willreturn, to indicate that a call of this function will either exhibit undefined behavior or comes back and continues execution at a point in the existing call stack that includes the current invocation. This attribute guarantees that the function does not have any endless loops, endless recursion, or terminating functions like abort or exit. Patch by Hideto Ueno (@uenoku) Reviewers: jdoerfert Subscribers: mehdi_amini, hiraditya, steven_wu, dexonsmith, lebedev.ri, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D62801 llvm-svn: 364555
* [LiveDebugValues] Emit the debug entry valuesDjordje Todorovic2019-06-271-20/+139
| | | | | | | | | | | | | | | | Emit replacements for clobbered parameters location if the parameter has unmodified value throughout the funciton. This is basic scenario where we can use the debug entry values. ([12/13] Introduce the debug entry values.) Co-authored-by: Ananth Sowda <asowda@cisco.com> Co-authored-by: Nikola Prica <nikola.prica@rt-rk.com> Co-authored-by: Ivan Baev <ibaev@cisco.com> Differential Revision: https://reviews.llvm.org/D58042 llvm-svn: 364553
* Bitcode: derive all types used from records instead of Values.Tim Northover2019-06-273-144/+311
| | | | | | | | | | | | | | | There is existing bitcode that we need to support where the structured nature of pointer types is used to derive the result type of some operation. For example a GEP's operation and result will be based on its input Type. When pointers become opaque, the BitcodeReader will still have access to this information because it's explicitly told how to construct the more complex types used, but this information will not be attached to any Value that gets looked up. This changes BitcodeReader so that in all places which use type information in this manner, it's derived from a side-table rather than from the Value in question. llvm-svn: 364550
* [LiveRangeEdit] Fix build failure caused by the rL364536Djordje Todorovic2019-06-271-1/+1
| | | | llvm-svn: 364549
* [TargetLowering] SimplifyDemandedVectorElts - add shift/rotate support.Simon Pilgrim2019-06-271-0/+18
| | | | llvm-svn: 364548
* [PowerPC][HTM] Fix disassembling buffer overflow for tabortdc and othersJinsong Ji2019-06-275-33/+28
| | | | | | | | | | | | | | | | | This was reported in https://bugs.llvm.org/show_bug.cgi?id=41751 llvm-mc aborted when disassembling tabortdc. This patch try to clean up TM related DAGs. * Fixes the problem by remove explicit output of cr0, and put it as implicit def. * Update int_ppc_tbegin pattern to accommodate the implicit def of cr0. * Update the TCHECK operand and int_ppc_tcheck accordingly. * Add some builtin test and disassembly tests. * Remove unused CRRC0/crrc0 Differential Revision: https://reviews.llvm.org/D61935 llvm-svn: 364544
* Revert r363658 "[SVE][IR] Scalable Vector IR Type with pr42210 fix"Hans Wennborg2019-06-279-68/+13
| | | | | | | | | | | | | | | | | | | | | | | | We saw a 70% ThinLTO link time increase in Chromium for Android, see crbug.com/978817. Sounds like more of PR42210. > Recommit of D32530 with a few small changes: > - Stopped recursively walking through aggregates in > the verifier, so that we don't impose too much > overhead on large modules under LTO (see PR42210). > - Changed tests to match; the errors are slightly > different since they only report the array or > struct that actually contains a scalable vector, > rather than all aggregates which contain one in > a nested member. > - Corrected an older comment > > Reviewers: thakis, rengolin, sdesmalen > > Reviewed By: sdesmalen > > Differential Revision: https://reviews.llvm.org/D63321 llvm-svn: 364543
* [DWARF] Handle the DW_OP_entry_value operandDjordje Todorovic2019-06-278-8/+91
| | | | | | | | | | | | | | | Add the IR and the AsmPrinter parts for handling of the DW_OP_entry_values DWARF operation. ([11/13] Introduce the debug entry values.) Co-authored-by: Ananth Sowda <asowda@cisco.com> Co-authored-by: Nikola Prica <nikola.prica@rt-rk.com> Co-authored-by: Ivan Baev <ibaev@cisco.com> Differential Revision: https://reviews.llvm.org/D60866 llvm-svn: 364542
* [TargetLowering] SimplifyDemandedBits - use DemandedElts to better identify ↵Simon Pilgrim2019-06-271-11/+21
| | | | | | partial splat shift amounts llvm-svn: 364541
* [mips] Mark pseudo select instructions by the `hasNoSchedulingInfo` tagSimon Atanasyan2019-06-271-2/+2
| | | | llvm-svn: 364540
* [mips] Add new items to the list of features unsupported by P5600Simon Atanasyan2019-06-271-3/+3
| | | | llvm-svn: 364539
* [Backend] Keep call site info valid through the backendDjordje Todorovic2019-06-278-6/+48
| | | | | | | | | | | | | | | | | | | | Handle call instruction replacements and deletions in order to preserve valid state of the call site info of the MachineFunction. NOTE: If the call site info is enabled for a new target, the assertion from the MachineFunction::DeleteMachineInstr() should help to locate places where the updateCallSiteInfo() should be called in order to preserve valid state of the call site info. ([10/13] Introduce the debug entry values.) Co-authored-by: Ananth Sowda <asowda@cisco.com> Co-authored-by: Nikola Prica <nikola.prica@rt-rk.com> Co-authored-by: Ivan Baev <ibaev@cisco.com> Differential Revision: https://reviews.llvm.org/D61062 llvm-svn: 364536
* [ARM] Fix bogus assertions in copyPhysReg v8.1-M cases.Simon Tatham2019-06-271-4/+4
| | | | | | | | | | | | | | | | The code to generate register move instructions in and out of VPR and FPSCR_NZCV had assertions checking that the other register involved was a GPR _pair_, instead of a single GPR as it should have been. Reviewers: miyuki, ostannard Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63865 llvm-svn: 364534
* [ARM] Fix handling of zero offsets in LOB instructions.Simon Tatham2019-06-272-16/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The BF and WLS/WLSTP instructions have various branch-offset fields occupying different positions and lengths in the instruction encoding, and all of them were decoded at disassembly time by the function DecodeBFLabelOffset() which returned SoftFail if the offset was zero. In fact, it's perfectly fine and not even a SoftFail for most of those offset fields to be zero. The only one that can't be zero is the 4-bit field labelled `boff` in the architecture spec, occupying bits {26-23} of the BF instruction family. If that one is zero, the encoding overlaps other instructions (WLS, DLS, LETP, VCTP), so it ought to be a full Fail. Fixed by adding an extra template parameter to DecodeBFLabelOffset which controls whether a zero offset is accepted or rejected. Adjusted existing tests (only in error messages for bad disassemblies); added extra tests to demonstrate zero offsets being accepted in all the right places, and a few demonstrating rejection of zero `boff`. Reviewers: DavidSpickett, ostannard Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63864 llvm-svn: 364533
* [ARM] Make coprocessor number restrictions consistent.Simon Tatham2019-06-273-10/+25
| | | | | | | | | | | | | | | | | | | | | | Different versions of the Arm architecture disallow the use of generic coprocessor instructions like MCR and CDP on different sets of coprocessors. This commit centralises the check of the coprocessor number so that it's consistent between assembly and disassembly, and also updates it for the new restrictions in Arm v8.1-M. New tests added that check all the coprocessor numbers; old tests updated, where they used a number that's now become illegal in the context in question. Reviewers: DavidSpickett, ostannard Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63863 llvm-svn: 364532
* [ARM] Tighten restrictions on use of SP in v8.1-M CSEL.Simon Tatham2019-06-273-8/+26
| | | | | | | | | | | | | | | | | | | | | | | | | | | In the `CSEL Rd,Rm,Rn` instruction family (also including CSINC, CSINV and CSNEG), the architecture lists it as CONSTRAINED UNPREDICTABLE (i.e. SoftFail) to use SP in the Rd or Rm slot, but outright illegal to use it in the Rn slot, not least because some encodings of that form are used by MVE instructions such as UQRSHLL. MC was treating all three slots the same, as SoftFail. So the only reason UQRSHLL was disassembled correctly at all was because the MVE decode table is separate from the Thumb2 one and takes priority; if you turned off MVE, then encodings such as `[0x5f,0xea,0x0d,0x83]` would disassemble as spurious CSELs. Fixed by inventing another version of the `GPRwithZR` register class, which disallows SP completely instead of just SoftFailing it. Reviewers: DavidSpickett, ostannard Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63862 llvm-svn: 364531
* [X86] getFauxShuffle - add DemandedElts as a filterSimon Pilgrim2019-06-271-8/+17
| | | | | | This is currently benign but will be used in the future based on the elements referenced by the parent shuffle(s). llvm-svn: 364530
* [mips] Add GPR_64 predicate to some mov[zn] instructionsSimon Atanasyan2019-06-271-8/+10
| | | | llvm-svn: 364527
* [mips] Fix indentation and split long lines. NFCSimon Atanasyan2019-06-271-5/+5
| | | | llvm-svn: 364526
* [mips] Reformat MSA instruction definitions. NFCSimon Atanasyan2019-06-271-46/+32
| | | | llvm-svn: 364525
* IR: compare type attributes deeply when looking into functions.Tim Northover2019-06-271-0/+13
| | | | | | | | | FunctionComparator attempts to produce a stable comparison of two Function instances by looking at all available properties. Since ByVal attributes now contain a Type pointer, they are not trivially ordered and FunctionComparator should use its own Type comparison logic to sort them. llvm-svn: 364523
* [Attributor] Deducing existing nounwind attribute.Stefan Stipanovic2019-06-271-10/+85
| | | | | | | | | | | | Adding nounwind deduction in new attributor framework. Reviewers: jdoerfert, uenoku Subscribers: hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D63379 llvm-svn: 364521
* [X86][AVX] SimplifyDemandedVectorElts - combine PERMPD(x) -> EXTRACTF128(X) Simon Pilgrim2019-06-271-0/+16
| | | | | | If we only use the bottom lane, see if we can simplify this to extract_subvector - which is always at least as quick as PERMPD/PERMQ. llvm-svn: 364518
* [yaml2obj] - Allow overriding e_shentsize, e_shoff, e_shnum and e_shstrndx ↵George Rimar2019-06-271-0/+5
| | | | | | | | | | | | fields in the YAML. This allows setting different values for e_shentsize, e_shoff, e_shnum and e_shstrndx fields and is useful for producing broken inputs for various test cases. Differential revision: https://reviews.llvm.org/D63771 llvm-svn: 364517
* [ISEL][X86] Tracking of registers that forward call argumentsDjordje Todorovic2019-06-272-3/+17
| | | | | | | | | | | | | | | | | While lowering calls, collect info about registers that forward arguments into following function frame. We store such info into the MachineFunction of the call. This is used very late when dumping DWARF info about call site parameters. ([9/13] Introduce the debug entry values.) Co-authored-by: Ananth Sowda <asowda@cisco.com> Co-authored-by: Nikola Prica <nikola.prica@rt-rk.com> Co-authored-by: Ivan Baev <ibaev@cisco.com> Differential Revision: https://reviews.llvm.org/D60715 llvm-svn: 364516
* [DebugInfo] Avoid register coalesing unsoundly changing DBG_VALUE locationsJeremy Morse2019-06-271-2/+85
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Once MIR code leaves SSA form and the liveness of a vreg is considered, DBG_VALUE insts are able to refer to non-live vregs, because their debug-uses do not contribute to liveness. This non-liveness becomes problematic for optimizations like register coalescing, as they can't ``see'' the debug uses in the liveness analyses. As a result registers get coalesced regardless of debug uses, and that can lead to invalid variable locations containing unexpected values. In the added test case, the first vreg operand of ADD32rr is merged with various copies of the vreg (great for performance), but a DBG_VALUE of the unmodified operand is blindly updated to the modified operand. This changes what value the variable will appear to have in a debugger. Fix this by changing any DBG_VALUE whose operand will be resurrected by register coalescing to be a $noreg DBG_VALUE, i.e. give the variable no location. This is an overapproximation as some coalesced locations are safe (others are not) -- an extra domination analysis would be required to work out which, and it would be better if we just don't generate non-live DBG_VALUEs. This fixes PR40010. Differential Revision: https://reviews.llvm.org/D56151 llvm-svn: 364515
* [GlobalISel] Remove [un]packRegs from IRTranslatorDiana Picus2019-06-271-29/+4
| | | | | | | | | | | Remove the last use of packRegs from IRTranslator and delete pack/unpackRegs. This introduces a fallback to DAGISel for intrinsics with aggregate arguments, since we don't have a testcase for them so it's hard to tell how we'd want to handle them. Discussed in https://reviews.llvm.org/D63551 llvm-svn: 364514
* [AArch64 GlobalISel] Cleanup CallLowering. NFCIDiana Picus2019-06-272-49/+12
| | | | | | | | | Now that lowerCall and lowerFormalArgs have been refactored, we can simplify splitToValueTypes. Differential Revision: https://reviews.llvm.org/D63552 llvm-svn: 364513
* [GlobalISel] Accept multiple vregs for lowerCall's argsDiana Picus2019-06-275-17/+13
| | | | | | | | | | | | | | | | | | | | | | | | Change the interface of CallLowering::lowerCall to accept several virtual registers for each argument, instead of just one. This is a follow-up to D46018. CallLowering::lowerReturn was similarly refactored in D49660 and lowerFormalArguments in D63549. With this change, we no longer pack the virtual registers generated for aggregates into one big lump before delegating to the target. Therefore, the target can decide itself whether it wants to handle them as separate pieces or use one big register. ARM and AArch64 have been updated to use the passed in virtual registers directly, which means we no longer need to generate so many merge/extract instructions. NFCI for AMDGPU, Mips and X86. Differential Revision: https://reviews.llvm.org/D63551 llvm-svn: 364512
* [GlobalISel] Accept multiple vregs for lowerCall's resultDiana Picus2019-06-277-48/+21
| | | | | | | | | | | | | | | | | | | | | | | | Change the interface of CallLowering::lowerCall to accept several virtual registers for the call result, instead of just one. This is a follow-up to D46018. CallLowering::lowerReturn was similarly refactored in D49660 and lowerFormalArguments in D63549. With this change, we no longer pack the virtual registers generated for aggregates into one big lump before delegating to the target. Therefore, the target can decide itself whether it wants to handle them as separate pieces or use one big register. ARM and AArch64 have been updated to use the passed in virtual registers directly, which means we no longer need to generate so many merge/extract instructions. NFCI for AMDGPU, Mips and X86. Differential Revision: https://reviews.llvm.org/D63550 llvm-svn: 364511
* [GlobalISel] Accept multiple vregs in lowerFormalArgsDiana Picus2019-06-2712-77/+144
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Change the interface of CallLowering::lowerFormalArguments to accept several virtual registers for each formal argument, instead of just one. This is a follow-up to D46018. CallLowering::lowerReturn was similarly refactored in D49660. lowerCall will be refactored in the same way in follow-up patches. With this change, we forward the virtual registers generated for aggregates to CallLowering. Therefore, the target can decide itself whether it wants to handle them as separate pieces or use one big register. We also copy the pack/unpackRegs helpers to CallLowering to facilitate this. ARM and AArch64 have been updated to use the passed in virtual registers directly, which means we no longer need to generate so many merge/extract instructions. AArch64 seems to have had a bug when lowering e.g. [1 x i8*], which was put into a s64 instead of a p0. Added a test-case which illustrates the problem more clearly (it crashes without this patch) and fixed the existing test-case to expect p0. AMDGPU has been updated to unpack into the virtual registers for kernels. I think the other code paths fall back for aggregates, so this should be NFC. Mips doesn't support aggregates yet, so it's also NFC. x86 seems to have code for dealing with aggregates, but I couldn't find the tests for it, so I just added a fallback to DAGISel if we get more than one virtual register for an argument. Differential Revision: https://reviews.llvm.org/D63549 llvm-svn: 364510
* [GlobalISel] Allow multiple VRegs in ArgInfo. NFCDiana Picus2019-06-276-29/+54
| | | | | | | | | | | Allow CallLowering::ArgInfo to contain more than one virtual register. This is useful when passes split aggregates into several virtual registers, but need to also provide information about the original type to the call lowering. Used in follow-up patches. Differential Revision: https://reviews.llvm.org/D63548 llvm-svn: 364509
* [AMDGPU] Fix +DumpCode to print an entry label for the first functionJay Foad2019-06-271-12/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: The +DumpCode attribute is a horrible hack in AMDGPU to embed the disassembly of the generated code into the elf file. It is used by LLPC to implement an extension that allows the application to read back the disassembly of the code. It tries to print an entry label at the start of every function, but that didn't work for the first function in the module because DumpCodeInstEmitter wasn't initialised until EmitFunctionBodyStart which is too late. Change-Id: I790d73ddf4f51fd02ab32529380c7cb7c607c4ee Reviewers: arsenm, tpr, kzhuravl Reviewed By: arsenm Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D63712 llvm-svn: 364508
* Silence gcc warning after r364458Mikael Holmen2019-06-271-1/+1
| | | | | | | | | | | | Without the fix gcc 7.4.0 complains with ../lib/Target/X86/X86ISelLowering.cpp: In function 'bool getFauxShuffleMask(llvm::SDValue, llvm::SmallVectorImpl<int>&, llvm::SmallVectorImpl<llvm::SDValue>&, llvm::SelectionDAG&)': ../lib/Target/X86/X86ISelLowering.cpp:6690:36: error: enumeral and non-enumeral type in conditional expression [-Werror=extra] int Idx = (ZeroMask[j] ? SM_SentinelZero : (i + j + Ofs)); ~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ cc1plus: all warnings being treated as errors llvm-svn: 364507
* [MachineFunction] Base support for call site info trackingDjordje Todorovic2019-06-273-0/+88
| | | | | | | | | | | | | | Add an attribute into the MachineFunction that tracks call site info. ([8/13] Introduce the debug entry values.) Co-authored-by: Ananth Sowda <asowda@cisco.com> Co-authored-by: Nikola Prica <nikola.prica@rt-rk.com> Co-authored-by: Ivan Baev <ibaev@cisco.com> Differential Revision: https://reviews.llvm.org/D61061 llvm-svn: 364506
* Fix -Wunused-variable warnings after r364464Hans Wennborg2019-06-271-2/+2
| | | | | | | | | | | | | | | | | /work/llvm.monorepo/llvm/lib/Bitcode/Reader/BitcodeReader.cpp: In function ‘llvm::Expected<std::basic_string<char> > readIdentificationBlock(llvm::BitstreamCursor&)’: /work/llvm.monorepo/llvm/lib/Bitcode/Reader/BitcodeReader.cpp:205:22: warning: unused variable ‘BitCode’ [-Wunused-variable] switch (unsigned BitCode = MaybeBitCode.get()) { ^ /work/llvm.monorepo/llvm/lib/Bitcode/Reader/BitcodeReader.cpp: In member function ‘llvm::Error {anonymous}::ModuleSummaryIndexBitcodeReader::parseModule()’: /work/llvm.monorepo/llvm/lib/Bitcode/Reader/BitcodeReader.cpp:5367:26: warning: unused variable ‘BitCode’ [-Wunused-variable] switch (unsigned BitCode = MaybeBitCode.get()) { ^ llvm-svn: 364505
* [IR] Add DISuprogram and DIE for a func declDjordje Todorovic2019-06-272-2/+10
| | | | | | | | | | | | | | | A unique DISubprogram may be attached to a function declaration used for call site debug info. ([6/13] Introduce the debug entry values.) Co-authored-by: Ananth Sowda <asowda@cisco.com> Co-authored-by: Nikola Prica <nikola.prica@rt-rk.com> Co-authored-by: Ivan Baev <ibaev@cisco.com> Differential Revision: https://reviews.llvm.org/D60713 llvm-svn: 364500
* [X86] Remove (vzext_movl (scalar_to_vector (load))) matching code from ↵Craig Topper2019-06-271-17/+0
| | | | | | | | selectScalarSSELoad. I think this will be turning into vzext_load during DAG combine. llvm-svn: 364499
* [X86] Teach selectScalarSSELoad to not narrow volatile loads.Craig Topper2019-06-271-5/+7
| | | | llvm-svn: 364498
* [NFC][PowerPC] Improve the for loop in Early ReturnKang Zhang2019-06-271-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | Summary: In `PPCEarlyReturn.cpp` ``` 183 for (MachineFunction::iterator I = MF.begin(); I != MF.end();) { 184 MachineBasicBlock &B = *I++; 185 if (processBlock(B)) 186 Changed = true; 187 } ``` Above code can be improved to: ``` 184 for (MachineFunction::iterator I = MF.begin(), E = MF.end(); I != E;) { 185 MachineBasicBlock &B = *I++; 186 Changed |= processBlock(B); 187 } ``` Reviewed By: hfinkel Differential Revision: https://reviews.llvm.org/D63800 llvm-svn: 364496
* [ARM] Don't reserve R12 on Thumb1 as an emergency spill slot.Eli Friedman2019-06-267-120/+217
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The current implementation of ThumbRegisterInfo::saveScavengerRegister is bad for two reasons: one, it's buggy, and two, it blocks using R12 for other optimizations. So this patch gets rid of it, and adds the necessary support for using an ordinary emergency spill slot on Thumb1. (Specifically, I think saveScavengerRegister was broken by r305625, and nobody noticed for two years because the codepath is almost never used. The new code will also probably not be used much, but it now has better tests, and if we fail to emit a necessary emergency spill slot we get a reasonable error message instead of a miscompile.) A rough outline of the changes in the patch: 1. Gets rid of ThumbRegisterInfo::saveScavengerRegister. 2. Modifies ARMFrameLowering::determineCalleeSaves to allocate an emergency spill slot for Thumb1. 3. Implements useFPForScavengingIndex, so the emergency spill slot isn't placed at a negative offset from FP on Thumb1. 4. Modifies the heuristics for allocating an emergency spill slot to support Thumb1. This includes fixing ExtraCSSpill so we don't try to use "lr" as a substitute for allocating an emergency spill slot. 5. Allocates a base pointer in more cases, so the emergency spill slot is always accessible. 6. Modifies ARMFrameLowering::ResolveFrameIndexReference to compute the right offset in the new cases where we're forcing a base pointer. 7. Ensures we never generate a load or store with an offset outside of its frame object. This makes the heuristics more straightforward. 8. Changes Thumb1 prologue and epilogue emission so it never uses register scavenging. Some of the changes to the emergency spill slot heuristics in determineCalleeSaves affect ARM/Thumb2; hopefully, they should allow the compiler to avoid allocating an emergency spill slot in cases where it isn't necessary. The rest of the changes should only affect Thumb1. Differential Revision: https://reviews.llvm.org/D63677 llvm-svn: 364490
* [SCCP] Fix non-deterministic uselists of return values (DenseMap -> MapVector)Gerolf Hoflehner2019-06-261-6/+7
| | | | llvm-svn: 364482
* [SLP] Look-ahead operand reordering heuristic.Vasileios Porpodas2019-06-261-46/+236
| | | | | | | | | | | | | | | | Summary: This patch introduces a new heuristic for guiding operand reordering. The new "look-ahead" heuristic can look beyond the immediate predecessors. This helps break ties when the immediate predecessors have identical opcodes (see lit test for an example). Reviewers: RKSimon, ABataev, dtemirbulatov, Ayal, hfinkel, rnk Reviewed By: RKSimon, dtemirbulatov Subscribers: rnk, rcorcs, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60897 llvm-svn: 364478
* AMDGPU: Assert SPAdj is 0Matt Arsenault2019-06-261-0/+2
| | | | llvm-svn: 364473
* PEI: Add default handling of spills to registersMatt Arsenault2019-06-261-9/+22
| | | | llvm-svn: 364472
* [AMDGPU] Fix Livereg computation during epilogue insertionMatt Arsenault2019-06-261-1/+2
| | | | | | | | | | | | The LivePhysRegs calculated in order to find a scratch register in the epilogue code wrongly uses 'LiveIns'. Instead, it should use the 'Liveout' sets. For the liveness, also considering the operands of the terminator (return) instruction which is the insertion point for the scratch-exec-copy instruction. Patch by Christudasan Devadasan llvm-svn: 364470
OpenPOWER on IntegriCloud