Commit message | Author | Age | Files | Lines
* [Attributor] Deducing existing nounwind attribute. | Stefan Stipanovic | 2019-06-27 | 3 | -10/+201
Adding nounwind deduction in new attributor framework.
Reviewers: jdoerfert, uenoku
Subscribers: hiraditya, llvm-commits
Differential Revision: https://reviews.llvm.org/D63379
llvm-svn: 364521
* [X86][SSE] Regenerate v48 shuffle test on a variety of targets | Simon Pilgrim | 2019-06-27 | 1 | -16/+88
llvm-svn: 364520
* [clangd] Address limitations in SelectionTree: | Sam McCall | 2019-06-27 | 2 | -51/+164
Summary:
- nodes can have special-cased hit ranges including "holes" (FunctionTypeLoc in void foo())
- token conflicts between siblings (int a,b;) are resolved in favor of left sibling
- parent/child overlap is handled statefully rather than explicitly by comparing parent/child ranges (this lets us share a mechanism with sibling conflicts)
Reviewers: kadircet
Subscribers: ilya-biryukov, MaskRay, jkorous, mgrang, arphaman, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D63760
llvm-svn: 364519
* [X86][AVX] SimplifyDemandedVectorElts - combine PERMPD(x) -> EXTRACTF128(X) | Simon Pilgrim | 2019-06-27 | 5 | -14/+31
If we only use the bottom lane, see if we can simplify this to extract_subvector - which is always at least as quick as PERMPD/PERMQ.
llvm-svn: 364518
* [yaml2obj] - Allow overriding e_shentsize, e_shoff, e_shnum and e_shstrndx fields in the YAML. | George Rimar | 2019-06-27 | 5 | -5/+84
This allows setting different values for e_shentsize, e_shoff, e_shnum and e_shstrndx fields and is useful for producing broken inputs for various test cases.
Differential revision: https://reviews.llvm.org/D63771
llvm-svn: 364517
* [ISEL][X86] Tracking of registers that forward call arguments | Djordje Todorovic | 2019-06-27 | 4 | -3/+76
While lowering calls, collect info about registers that forward arguments into the following function frame. We store such info in the MachineFunction of the call. This is used very late when dumping DWARF info about call site parameters.
([9/13] Introduce the debug entry values.)
Co-authored-by: Ananth Sowda <asowda@cisco.com>
Co-authored-by: Nikola Prica <nikola.prica@rt-rk.com>
Co-authored-by: Ivan Baev <ibaev@cisco.com>
Differential Revision: https://reviews.llvm.org/D60715
llvm-svn: 364516
* [DebugInfo] Avoid register coalescing unsoundly changing DBG_VALUE locations | Jeremy Morse | 2019-06-27 | 2 | -2/+328
Once MIR code leaves SSA form and the liveness of a vreg is considered, DBG_VALUE insts are able to refer to non-live vregs, because their debug-uses do not contribute to liveness. This non-liveness becomes problematic for optimizations like register coalescing, as they can't "see" the debug uses in the liveness analyses. As a result registers get coalesced regardless of debug uses, and that can lead to invalid variable locations containing unexpected values.
In the added test case, the first vreg operand of ADD32rr is merged with various copies of the vreg (great for performance), but a DBG_VALUE of the unmodified operand is blindly updated to the modified operand. This changes what value the variable will appear to have in a debugger.
Fix this by changing any DBG_VALUE whose operand will be resurrected by register coalescing to be a $noreg DBG_VALUE, i.e. give the variable no location. This is an overapproximation as some coalesced locations are safe (others are not) -- an extra domination analysis would be required to work out which, and it would be better if we just don't generate non-live DBG_VALUEs.
This fixes PR40010.
Differential Revision: https://reviews.llvm.org/D56151
llvm-svn: 364515
* [GlobalISel] Remove [un]packRegs from IRTranslator | Diana Picus | 2019-06-27 | 2 | -37/+4
Remove the last use of packRegs from IRTranslator and delete pack/unpackRegs. This introduces a fallback to DAGISel for intrinsics with aggregate arguments, since we don't have a testcase for them so it's hard to tell how we'd want to handle them.
Discussed in https://reviews.llvm.org/D63551
llvm-svn: 364514
* [AArch64 GlobalISel] Cleanup CallLowering. NFCI | Diana Picus | 2019-06-27 | 2 | -49/+12
Now that lowerCall and lowerFormalArgs have been refactored, we can simplify splitToValueTypes.
Differential Revision: https://reviews.llvm.org/D63552
llvm-svn: 364513
* [GlobalISel] Accept multiple vregs for lowerCall's args | Diana Picus | 2019-06-27 | 10 | -100/+29
Change the interface of CallLowering::lowerCall to accept several virtual registers for each argument, instead of just one. This is a follow-up to D46018.
CallLowering::lowerReturn was similarly refactored in D49660 and lowerFormalArguments in D63549.
With this change, we no longer pack the virtual registers generated for aggregates into one big lump before delegating to the target. Therefore, the target can decide itself whether it wants to handle them as separate pieces or use one big register.
ARM and AArch64 have been updated to use the passed-in virtual registers directly, which means we no longer need to generate so many merge/extract instructions.
NFCI for AMDGPU, Mips and X86.
Differential Revision: https://reviews.llvm.org/D63551
llvm-svn: 364512
* [GlobalISel] Accept multiple vregs for lowerCall's result | Diana Picus | 2019-06-27 | 10 | -87/+41
Change the interface of CallLowering::lowerCall to accept several virtual registers for the call result, instead of just one. This is a follow-up to D46018.
CallLowering::lowerReturn was similarly refactored in D49660 and lowerFormalArguments in D63549.
With this change, we no longer pack the virtual registers generated for aggregates into one big lump before delegating to the target. Therefore, the target can decide itself whether it wants to handle them as separate pieces or use one big register.
ARM and AArch64 have been updated to use the passed-in virtual registers directly, which means we no longer need to generate so many merge/extract instructions.
NFCI for AMDGPU, Mips and X86.
Differential Revision: https://reviews.llvm.org/D63550
llvm-svn: 364511
* [GlobalISel] Accept multiple vregs in lowerFormalArgs | Diana Picus | 2019-06-27 | 17 | -155/+214
Change the interface of CallLowering::lowerFormalArguments to accept several virtual registers for each formal argument, instead of just one. This is a follow-up to D46018.
CallLowering::lowerReturn was similarly refactored in D49660. lowerCall will be refactored in the same way in follow-up patches.
With this change, we forward the virtual registers generated for aggregates to CallLowering. Therefore, the target can decide itself whether it wants to handle them as separate pieces or use one big register. We also copy the pack/unpackRegs helpers to CallLowering to facilitate this.
ARM and AArch64 have been updated to use the passed-in virtual registers directly, which means we no longer need to generate so many merge/extract instructions.
AArch64 seems to have had a bug when lowering e.g. [1 x i8*], which was put into a s64 instead of a p0. Added a test-case which illustrates the problem more clearly (it crashes without this patch) and fixed the existing test-case to expect p0.
AMDGPU has been updated to unpack into the virtual registers for kernels. I think the other code paths fall back for aggregates, so this should be NFC.
Mips doesn't support aggregates yet, so it's also NFC.
x86 seems to have code for dealing with aggregates, but I couldn't find the tests for it, so I just added a fallback to DAGISel if we get more than one virtual register for an argument.
Differential Revision: https://reviews.llvm.org/D63549
llvm-svn: 364510
* [GlobalISel] Allow multiple VRegs in ArgInfo. NFC | Diana Picus | 2019-06-27 | 7 | -34/+62
Allow CallLowering::ArgInfo to contain more than one virtual register. This is useful when passes split aggregates into several virtual registers, but need to also provide information about the original type to the call lowering. Used in follow-up patches.
Differential Revision: https://reviews.llvm.org/D63548
llvm-svn: 364509
* [AMDGPU] Fix +DumpCode to print an entry label for the first function | Jay Foad | 2019-06-27 | 2 | -12/+14
Summary: The +DumpCode attribute is a horrible hack in AMDGPU to embed the disassembly of the generated code into the elf file. It is used by LLPC to implement an extension that allows the application to read back the disassembly of the code. It tries to print an entry label at the start of every function, but that didn't work for the first function in the module because DumpCodeInstEmitter wasn't initialised until EmitFunctionBodyStart which is too late.
Change-Id: I790d73ddf4f51fd02ab32529380c7cb7c607c4ee
Reviewers: arsenm, tpr, kzhuravl
Reviewed By: arsenm
Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D63712
llvm-svn: 364508
* Silence gcc warning after r364458 | Mikael Holmen | 2019-06-27 | 1 | -1/+1
Without the fix gcc 7.4.0 complains with
../lib/Target/X86/X86ISelLowering.cpp: In function 'bool getFauxShuffleMask(llvm::SDValue, llvm::SmallVectorImpl<int>&, llvm::SmallVectorImpl<llvm::SDValue>&, llvm::SelectionDAG&)':
../lib/Target/X86/X86ISelLowering.cpp:6690:36: error: enumeral and non-enumeral type in conditional expression [-Werror=extra]
int Idx = (ZeroMask[j] ? SM_SentinelZero : (i + j + Ofs));
~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
cc1plus: all warnings being treated as errors
llvm-svn: 364507
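The log does not show the one-line change itself. As a minimal sketch of one common way to silence this diagnostic, the enumerator can be cast so that both arms of the conditional are plain `int`. The enum below is a hypothetical stand-in for the real sentinel, its value is an assumption, and `pickIndex` is an illustrative helper rather than code from the commit.

```cpp
#include <cassert>

// Hypothetical stand-in for the X86 shuffle-mask sentinel; the value is an
// assumption made for this sketch.
enum { SM_SentinelZero = -2 };

// Mirrors the shape of the flagged expression. Without the cast, one arm of
// the conditional is an enumerator and the other an int, which is what GCC's
// -Wextra diagnostic complains about.
int pickIndex(bool IsZero, int i, int j, int Ofs) {
  return IsZero ? static_cast<int>(SM_SentinelZero) : (i + j + Ofs);
}
```

The cast does not change the computed value; it only makes the operand types of `?:` agree.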
* [MachineFunction] Base support for call site info tracking | Djordje Todorovic | 2019-06-27 | 10 | -0/+281
Add an attribute into the MachineFunction that tracks call site info.
([8/13] Introduce the debug entry values.)
Co-authored-by: Ananth Sowda <asowda@cisco.com>
Co-authored-by: Nikola Prica <nikola.prica@rt-rk.com>
Co-authored-by: Ivan Baev <ibaev@cisco.com>
Differential Revision: https://reviews.llvm.org/D61061
llvm-svn: 364506
* Fix -Wunused-variable warnings after r364464 | Hans Wennborg | 2019-06-27 | 1 | -2/+2
/work/llvm.monorepo/llvm/lib/Bitcode/Reader/BitcodeReader.cpp: In function ‘llvm::Expected<std::basic_string<char> > readIdentificationBlock(llvm::BitstreamCursor&)’:
/work/llvm.monorepo/llvm/lib/Bitcode/Reader/BitcodeReader.cpp:205:22: warning: unused variable ‘BitCode’ [-Wunused-variable]
switch (unsigned BitCode = MaybeBitCode.get()) {
^
/work/llvm.monorepo/llvm/lib/Bitcode/Reader/BitcodeReader.cpp: In member function ‘llvm::Error {anonymous}::ModuleSummaryIndexBitcodeReader::parseModule()’:
/work/llvm.monorepo/llvm/lib/Bitcode/Reader/BitcodeReader.cpp:5367:26: warning: unused variable ‘BitCode’ [-Wunused-variable]
switch (unsigned BitCode = MaybeBitCode.get()) {
^
llvm-svn: 364505
* Fix GCC 4 build after r364464 | Hans Wennborg | 2019-06-27 | 1 | -1/+1
It was failing with:
In file included from /b/s/w/ir/cache/builder/src/third_party/llvm/llvm/lib/Bitcode/Reader/BitstreamReader.cpp:9:0:
/b/s/w/ir/cache/builder/src/third_party/llvm/llvm/include/llvm/Bitcode/BitstreamReader.h: In member function 'llvm::Expected<long unsigned int> llvm::SimpleBitstreamCursor::ReadVBR64(unsigned int)':
/b/s/w/ir/cache/builder/src/third_party/llvm/llvm/include/llvm/Bitcode/BitstreamReader.h:262:14: error: could not convert 'MaybeRead' from 'llvm::Expected<unsigned int>' to 'llvm::Expected<long unsigned int>'
return MaybeRead;
^
/b/s/w/ir/cache/builder/src/third_party/llvm/llvm/include/llvm/Bitcode/BitstreamReader.h:279:16: error: could not convert 'MaybeRead' from 'llvm::Expected<unsigned int>' to 'llvm::Expected<long unsigned int>'
return MaybeRead;
^
llvm-svn: 364504
* [lldb] [Plugins/SysV-x86_64] NetBSD is also using SysV ABI | Michal Gorny | 2019-06-27 | 1 | -0/+1
Reenable SysV x86_64 ABI usage on NetBSD that was accidentally removed in r364216. This fixes numerous test failures with messages similar to the following:
error: Can't run the expression locally: Interpreter doesn't handle one of the expression's opcodes
llvm-svn: 364503
* [clang] Add DISubprogram and DIE for a func decl | Djordje Todorovic | 2019-06-27 | 5 | -6/+73
Attach a unique DISubprogram to a function declaration that will be used for call site debug info.
([7/13] Introduce the debug entry values.)
Co-authored-by: Ananth Sowda <asowda@cisco.com>
Co-authored-by: Nikola Prica <nikola.prica@rt-rk.com>
Co-authored-by: Ivan Baev <ibaev@cisco.com>
Differential Revision: https://reviews.llvm.org/D60714
llvm-svn: 364502
* gn build: Follow-up to r364491 "[GN] Update build files" | Nico Weber | 2019-06-27 | 5 | -4/+19
- Merge r364427 (GSYM lib) more: It was missing the new unit test (as pointed out by llvm/utils/gn/build/sync_source_lists_from_cmake.py), and it had some superfluous deps not present in the cmake build.
- Merge r364474 (clang DependencyScanning lib) more: The deps didn't quite match cmake.
llvm-svn: 364501
* [IR] Add DISubprogram and DIE for a func decl | Djordje Todorovic | 2019-06-27 | 4 | -8/+23
A unique DISubprogram may be attached to a function declaration used for call site debug info.
([6/13] Introduce the debug entry values.)
Co-authored-by: Ananth Sowda <asowda@cisco.com>
Co-authored-by: Nikola Prica <nikola.prica@rt-rk.com>
Co-authored-by: Ivan Baev <ibaev@cisco.com>
Differential Revision: https://reviews.llvm.org/D60713
llvm-svn: 364500
* [X86] Remove (vzext_movl (scalar_to_vector (load))) matching code from selectScalarSSELoad. | Craig Topper | 2019-06-27 | 1 | -17/+0
I think this will be turning into vzext_load during DAG combine.
llvm-svn: 364499
* [X86] Teach selectScalarSSELoad to not narrow volatile loads. | Craig Topper | 2019-06-27 | 2 | -5/+41
llvm-svn: 364498
* [InstCombine][NFCI] Fix test comments. | Huihui Zhang | 2019-06-27 | 2 | -4/+4
For fold
(X & (signbit l>> Y)) ==/!= 0 -> (X << Y) >=/< 0
(X & (signbit << Y)) ==/!= 0 -> (X l>> Y) >=/< 0
Test cases of X being constant are positive tests not negative.
Prep work for D62818.
llvm-svn: 364497
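To make the first fold in the entry above concrete, here is a small self-contained sketch that checks it exhaustively on 8-bit values. This is an illustration only, not code from the commit or from InstCombine; the helper names are hypothetical, and `l>>` is modelled as an unsigned shift.

```cpp
#include <cassert>
#include <cstdint>

// Left side of the first fold on 8-bit values: (X & (signbit l>> Y)) == 0.
bool maskedBitIsZero(uint8_t X, unsigned Y) {
  uint8_t Mask = static_cast<uint8_t>(0x80u >> Y); // signbit l>> Y
  return (X & Mask) == 0;
}

// Right side of the first fold: (X << Y) >= 0 as a signed comparison,
// which is the same as the sign bit of the shifted value being clear.
bool shiftedIsNonNegative(uint8_t X, unsigned Y) {
  uint8_t Shifted = static_cast<uint8_t>(X << Y);
  return (Shifted & 0x80u) == 0;
}

// Compare both sides for every X and every in-range shift amount Y.
bool foldHoldsForAllInputs() {
  for (unsigned X = 0; X < 256; ++X)
    for (unsigned Y = 0; Y < 8; ++Y)
      if (maskedBitIsZero(static_cast<uint8_t>(X), Y) !=
          shiftedIsNonNegative(static_cast<uint8_t>(X), Y))
        return false;
  return true;
}
```

Both sides test the same bit of X: masking with the sign bit shifted right by Y selects bit 7-Y, and shifting X left by Y moves that bit into the sign position.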
* [NFC][PowerPC] Improve the for loop in Early Return | Kang Zhang | 2019-06-27 | 1 | -4/+4
Summary: In `PPCEarlyReturn.cpp`
```
for (MachineFunction::iterator I = MF.begin(); I != MF.end();) {
  MachineBasicBlock &B = *I++;
  if (processBlock(B))
    Changed = true;
}
```
Above code can be improved to:
```
for (MachineFunction::iterator I = MF.begin(), E = MF.end(); I != E;) {
  MachineBasicBlock &B = *I++;
  Changed |= processBlock(B);
}
```
Reviewed By: hfinkel
Differential Revision: https://reviews.llvm.org/D63800
llvm-svn: 364496
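The loop shape from the entry above can be tried outside LLVM with a standard container. The sketch below is an assumption-laden stand-in: `std::list<int>` replaces `MachineFunction`, and `processBlock` is a hypothetical helper that reports a change for odd values.

```cpp
#include <cassert>
#include <list>

// Hypothetical stand-in for processBlock: rounds odd values up to the next
// even value and reports whether it changed anything.
bool processBlock(int &Block) {
  if (Block % 2 != 0) {
    ++Block;
    return true;
  }
  return false;
}

// Same shape as the improved loop: the end iterator is computed once, the
// iterator is advanced before the element is processed, and |= accumulates
// the result without skipping any call.
bool runOnBlocks(std::list<int> &Blocks) {
  bool Changed = false;
  for (std::list<int>::iterator I = Blocks.begin(), E = Blocks.end(); I != E;) {
    int &B = *I++;
    Changed |= processBlock(B);
  }
  return Changed;
}
```

Note that `Changed |= processBlock(B)` always calls `processBlock`, whereas `Changed = Changed || processBlock(B)` would short-circuit and stop processing blocks after the first change.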
* [NFC] Return early for types with size zero | Vitaly Buka | 2019-06-27 | 1 | -3/+4
llvm-svn: 364495
* [Reproducers] Fix flakiness and off-by-one during replay. | Jonas Devlieghere | 2019-06-27 | 1 | -6/+25
This fixes two replay issues that caused the tests to behave erratically:
1. It fixes an off-by-one error, where all replies were shifted by 1 because of a `+` packet that should've been ignored.
2. It fixes another off-by-one error, where an asynchronous ^C was offsetting all subsequent packets. The reason is that we 'synchronize' requests and replies. In reality, a stop reply is only sent when the process halts. During replay however, we instantly report the stop, as the reply to packets like continue (vCont).
Both packets should be ignored, and indeed, checking the gdb-remote log, no unexpected packets are received anymore.
Additionally, be more pedantic when it comes to unexpected packets and return a failure from the replay server. This way we should be able to catch these things faster in the future.
llvm-svn: 364494
* [GN] Fix check-llvm | Vitaly Buka | 2019-06-27 | 1 | -0/+14
llvm-svn: 364493
* [NFC] Remove unneeded local variables | Vitaly Buka | 2019-06-27 | 1 | -9/+5
llvm-svn: 364492
* [GN] Update build files | Vitaly Buka | 2019-06-27 | 7 | -0/+25
llvm-svn: 364491
* [ARM] Don't reserve R12 on Thumb1 as an emergency spill slot. | Eli Friedman | 2019-06-26 | 13 | -211/+626
The current implementation of ThumbRegisterInfo::saveScavengerRegister is bad for two reasons: one, it's buggy, and two, it blocks using R12 for other optimizations. So this patch gets rid of it, and adds the necessary support for using an ordinary emergency spill slot on Thumb1.
(Specifically, I think saveScavengerRegister was broken by r305625, and nobody noticed for two years because the codepath is almost never used. The new code will also probably not be used much, but it now has better tests, and if we fail to emit a necessary emergency spill slot we get a reasonable error message instead of a miscompile.)
A rough outline of the changes in the patch:
1. Gets rid of ThumbRegisterInfo::saveScavengerRegister.
2. Modifies ARMFrameLowering::determineCalleeSaves to allocate an emergency spill slot for Thumb1.
3. Implements useFPForScavengingIndex, so the emergency spill slot isn't placed at a negative offset from FP on Thumb1.
4. Modifies the heuristics for allocating an emergency spill slot to support Thumb1. This includes fixing ExtraCSSpill so we don't try to use "lr" as a substitute for allocating an emergency spill slot.
5. Allocates a base pointer in more cases, so the emergency spill slot is always accessible.
6. Modifies ARMFrameLowering::ResolveFrameIndexReference to compute the right offset in the new cases where we're forcing a base pointer.
7. Ensures we never generate a load or store with an offset outside of its frame object. This makes the heuristics more straightforward.
8. Changes Thumb1 prologue and epilogue emission so it never uses register scavenging.
Some of the changes to the emergency spill slot heuristics in determineCalleeSaves affect ARM/Thumb2; hopefully, they should allow the compiler to avoid allocating an emergency spill slot in cases where it isn't necessary. The rest of the changes should only affect Thumb1.
Differential Revision: https://reviews.llvm.org/D63677
llvm-svn: 364490
* [ObjC] Improve error message for a malformed objc-type-name | Erik Pilkington | 2019-06-26 | 3 | -2/+5
If the type didn't exist, we used to emit a really bad error:
t.m:3:12: error: expected ')'
-(nullable NoSuchType)foo3;
^
rdar://50925632
llvm-svn: 364489
* Fix Bitcode/invalid.test | JF Bastien | 2019-06-26 | 1 | -6/+6
On the armv8 bot the failure is slightly different in the number it prints. Don't check the numbers. This was caused by r364464.
llvm-svn: 364488
* [GWP-ASan] D63736 broke ARMv7/v8 sanitizer bots. | Mitch Phillips | 2019-06-26 | 1 | -1/+1
Remove ARM32/ARM64 support for GWP-ASan due to a strange SEGV when running scudo's preinit.c test. Disabling to make the bots go green while investigating.
llvm-svn: 364486
* [cmake] Allow config.guess to be run with MSYS on Windows | Pengxuan Zheng | 2019-06-26 | 1 | -1/+1
Summary: With r363420, config.guess can no longer be run with MSYS on Windows and this patch should be able to fix this particular case.
Reviewers: compnerd
Reviewed By: compnerd
Subscribers: mgorny, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D63834
llvm-svn: 364485
* Support nested target.xml register definition files, lack of reg group markers. | Jason Molenda | 2019-06-26 | 3 | -60/+326
The qemu x86_64 target returns a target.xml register definition file which includes other xml files and they include others, etc. Also, the registers are not put in register groups like lldb wants to see.
This patch
(1) puts registers that aren't in a register group in a "general" register group,
(2) changes ProcessGDBRemote::GetGDBServerRegisterInfo to be a method that starts the parsing, asking a recursive function to fetch and parse target.xml,
(3) adds ProcessGDBRemote::GetGDBServerRegisterInfoXMLAndProcess which can recursively call itself to read and parse included xml files,
(4) in addition to expecting the top-level <target> element (which only happens in the top level xml file), also accepts an xml file that consists of a <feature> node - read the register definitions and includes from that <feature> element.
<rdar://problem/49537922>
Differential revision: https://reviews.llvm.org/D63802
llvm-svn: 364484
* [SCCP] Fix non-deterministic uselists of return values (DenseMap -> MapVector) | Gerolf Hoflehner | 2019-06-26 | 1 | -6/+7
llvm-svn: 364482
* Use the // integer divide operator in these target definition files, like Davide's change to x86_64_target_definition.py. | Jason Molenda | 2019-06-26 | 2 | -2/+2
llvm-svn: 364481
* Fix formatting after r364479 | Aaron Puchert | 2019-06-26 | 1 | -4/+2
The reflowing obscures the functional changes, so here is a separate commit.
llvm-svn: 364480
* [Clang] Remove unused -split-dwarf and obsolete -enable-split-dwarf | Aaron Puchert | 2019-06-26 | 11 | -71/+17
Summary: The changes in D59673 made the choice redundant, since we can achieve single-file split DWARF just by not setting an output file name. Like llc we can also derive whether to enable Split DWARF from whether -split-dwarf-file is set, so we don't need the flag at all anymore.
The test CodeGen/split-debug-filename.c distinguished between having set or not set -enable-split-dwarf with -split-dwarf-file, but we can probably just always emit the metadata into the IR.
The flag -split-dwarf wasn't used at all anymore.
Reviewers: dblaikie, echristo
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D63167
llvm-svn: 364479
* [SLP] Look-ahead operand reordering heuristic. | Vasileios Porpodas | 2019-06-26 | 2 | -93/+323
Summary: This patch introduces a new heuristic for guiding operand reordering. The new "look-ahead" heuristic can look beyond the immediate predecessors. This helps break ties when the immediate predecessors have identical opcodes (see lit test for an example).
Reviewers: RKSimon, ABataev, dtemirbulatov, Ayal, hfinkel, rnk
Reviewed By: RKSimon, dtemirbulatov
Subscribers: rnk, rcorcs, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D60897
llvm-svn: 364478
* [InstCombine] change 'tmp' variable names; NFC | Sanjay Patel | 2019-06-26 | 1 | -86/+86
I don't think there was anything going wrong here, but the auto-generating CHECK line script is known to have problems with 'TMP' because it uses that to match nameless values.
This is a retry of rL364452.
llvm-svn: 364477
* Revert r363191 "[MS] Pretend constexpr variable template specializations are inline" | Reid Kleckner | 2019-06-26 | 2 | -30/+4
The next Visual Studio update will fix this issue, and it doesn't make sense to implement this non-conforming behavior going forward.
llvm-svn: 364476
* [clang-scan-deps] Introduce the DependencyScanning library with the thread worker code and better error handling | Alex Lorenz | 2019-06-26 | 10 | -116/+286
This commit extracts out the code that will power the fast scanning worker into a new file in a new DependencyScanning library. The error and output handling is improved so that the clients can gather errors/results from the worker directly.
Differential Revision: https://reviews.llvm.org/D63681
llvm-svn: 364474
* AMDGPU: Assert SPAdj is 0 | Matt Arsenault | 2019-06-26 | 1 | -0/+2
llvm-svn: 364473
* PEI: Add default handling of spills to registers | Matt Arsenault | 2019-06-26 | 1 | -9/+22
llvm-svn: 364472
* [UpdateTestChecks][NFC] Remove entries with same prefix | Jinsong Ji | 2019-06-26 | 1 | -2/+0
Matching is 'lossy', triples with same prefix can be dropped.
Differential Revision: https://reviews.llvm.org/D63732
llvm-svn: 364471
* [AMDGPU] Fix Livereg computation during epilogue insertion | Matt Arsenault | 2019-06-26 | 2 | -1/+3
The LivePhysRegs calculated in order to find a scratch register in the epilogue code wrongly uses 'LiveIns'. Instead, it should use the 'Liveout' sets. For the liveness, also consider the operands of the terminator (return) instruction, which is the insertion point for the scratch-exec-copy instruction.
Patch by Christudasan Devadasan
llvm-svn: 364470
* [X86] Rework the logic in LowerBuildVectorv16i8 to make better use of any_extend and break false dependencies. Other improvements | Craig Topper | 2019-06-26 | 5 | -52/+49
This patch rewrites the loop iteration to only visit every other element starting with element 0. And we work on the "even" element and "next" element at the same time. The "First" logic has been moved to the bottom of the loop and doesn't run on every element. I believe it could create dangling nodes previously since we didn't check if we were going to use SCALAR_TO_VECTOR for the first insertion. I got rid of the "First" variable and just do a null check on V which should be equivalent. We also no longer use undef as the starting V for vectors with no zeroes to avoid false dependencies. This matches v8i16.
I've changed all the extends and OR operations to use MVT::i32 since that's what they'll be promoted to anyway. I've tried to use zero_extend only when necessary and use any_extend otherwise. This resulted in some improvements in tests where we are now able to promote aligned (i32 (extload i8)) to a 32-bit load.
Differential Revision: https://reviews.llvm.org/D63702
llvm-svn: 364469