Commit message | Author | Age | Files | Lines
* RegPressure: Fix crash on blocks with only dbg_valueMatt Arsenault2019-03-272-1/+142
| | | | | | | | If there were only dbg_values in the block, recede would hit the beginning of the block and try to use the dbg_value as a real instruction. llvm-svn: 357105
* Fix and speedup __libcpp_locale_guard on WindowsThomas Anderson2019-03-271-19/+33
| | | | | | | | | | | | | | The old implementation assumed the POSIX `setlocale()` API where the old locale is returned. On Windows, the _new_ locale is returned. This meant that `__libcpp_locale_guard` wasn't resetting the locale on destruction. The new implementation fixes the above issue and takes advantage of `setlocale(LC_ALL)` to reduce the number of calls, and also avoids setting the locale at all if it's not necessary. Differential Revision: https://reviews.llvm.org/D59572 llvm-svn: 357104
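The save-and-restore pattern described above can be sketched as a small RAII guard. This is a minimal illustration, not the libc++ implementation: the class name `ScopedLocale` is hypothetical, and the sketch assumes a non-null locale name. It queries the current locale explicitly with `setlocale(LC_ALL, nullptr)` rather than relying on the value returned by the setting call, and it skips the switch when the requested locale is already active.

```cpp
#include <clocale>
#include <string>

// Hypothetical RAII guard, for illustration only (not __libcpp_locale_guard).
// Assumes Wanted is a non-null locale name.
class ScopedLocale {
public:
  explicit ScopedLocale(const char *Wanted) {
    // A null second argument only queries the current locale. Copy the
    // result, because later setlocale calls may overwrite the buffer.
    if (const char *Current = std::setlocale(LC_ALL, nullptr))
      Saved = Current;
    // Avoid touching the locale at all if it is already the requested one.
    if (Saved != Wanted)
      Changed = std::setlocale(LC_ALL, Wanted) != nullptr;
  }

  ~ScopedLocale() {
    // Restore from the explicit snapshot, not from a setlocale return value.
    if (Changed)
      std::setlocale(LC_ALL, Saved.c_str());
  }

  ScopedLocale(const ScopedLocale &) = delete;
  ScopedLocale &operator=(const ScopedLocale &) = delete;

private:
  std::string Saved;
  bool Changed = false;
};
```

Because the snapshot is taken with a query call, the guard does not depend on what the setting call returns on a given platform.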
* [InstCombine] Use uadd.sat and usub.sat for canonicalizationNikita Popov2019-03-273-154/+93
| | | | | | | | | | | | | Start using the uadd.sat and usub.sat intrinsics for the existing canonicalizations. These intrinsics should optimize better than expanded IR, have better handling in the X86 backend and should be no worse than expanded IR in other backends, as far as we know. rL357012 already introduced use of uadd.sat for the add+umin pattern. Differential Revision: https://reviews.llvm.org/D58872 llvm-svn: 357103
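For readers unfamiliar with the intrinsics, the semantics of unsigned saturating add and subtract can be modeled in plain C++. This is only a sketch of the arithmetic for 32-bit values; the function names `uaddSat` and `usubSat` are illustrative and are not LLVM APIs.

```cpp
#include <cstdint>
#include <limits>

// Unsigned saturating add: clamp to the maximum value instead of wrapping.
uint32_t uaddSat(uint32_t A, uint32_t B) {
  uint32_t Sum = A + B; // unsigned arithmetic wraps modulo 2^32
  // If the sum wrapped, it is smaller than either operand.
  return Sum < A ? std::numeric_limits<uint32_t>::max() : Sum;
}

// Unsigned saturating subtract: clamp to zero instead of wrapping.
uint32_t usubSat(uint32_t A, uint32_t B) {
  return A > B ? A - B : 0;
}
```

The compare-and-select shape of these functions is the kind of expanded form that a single saturating intrinsic can stand in for.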
* [clangd] Support utf-8 offsets (rather than utf-16) as a protocol extensionSam McCall2019-03-2711-27/+228
| | | | | | | | | | | | | | | | Summary: Still some pieces to go here: unit tests for new SourceCode functionality and a command-line flag to force utf-8 mode. But wanted to get early feedback. Reviewers: hokein Subscribers: ilya-biryukov, ioeric, MaskRay, jkorous, arphaman, kadircet, jdoerfert, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D58275 llvm-svn: 357102
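The difference between the two offset conventions can be shown with a small conversion routine. This is a hedged sketch, not clangd's SourceCode code: the function name `utf16Offset` is hypothetical, and it assumes the input is valid UTF-8 and that the byte offset falls on a character boundary.

```cpp
#include <cstddef>
#include <string>

// Converts a UTF-8 byte offset into the equivalent number of UTF-16 code
// units. Assumes valid UTF-8 and an offset on a character boundary.
std::size_t utf16Offset(const std::string &Utf8, std::size_t ByteOffset) {
  std::size_t Units = 0;
  for (std::size_t I = 0; I < ByteOffset && I < Utf8.size(); ++I) {
    unsigned char C = static_cast<unsigned char>(Utf8[I]);
    if ((C & 0xC0) == 0x80)
      continue; // continuation byte, already counted with its lead byte
    // A 4-byte sequence encodes a code point above U+FFFF, which takes a
    // surrogate pair (two code units) in UTF-16. Everything else takes one.
    Units += (C >= 0xF0) ? 2 : 1;
  }
  return Units;
}
```

For pure ASCII text the two offsets coincide; they diverge as soon as a line contains multi-byte characters.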
* [GlobalISel] Fix legalizer artifact combiner from crashing with invalid dead ↵Amara Emerson2019-03-277-26/+113
| | | | | | | | | | | | | | | | | | | | instructions. The artifact combiners push instructions which have been marked for deletion onto a list for the legalizer to deal with on return. However, for trunc(ext) combines the combiner routine recursively calls itself. When it does this the dead instructions list may not be empty, and the other combiners don't expect to be dealing with essentially invalid MIR (multiple vreg defs etc). This change fixes it by ensuring that the dead instructions are processed on entry into tryCombineInstruction. As a result, this fix exposed a few places in tests where G_TRUNC instructions were not being deleted even though they were dead. Differential Revision: https://reviews.llvm.org/D59892 llvm-svn: 357101
* [ASTImporter] Fix IsStructuralMatch specialization for EnumDecl to prevent ↵Shafik Yaghmour2019-03-271-0/+6
| | | | | | | | | | | re-importing an EnumDecl while trying to complete it Summary: We may try and re-import an EnumDecl while trying to complete it in IsStructuralMatch(...) specialization for EnumDecl. This change mirrors a similar fix for the specialization for RecordDecl. Differential Revision: https://reviews.llvm.org/D59845 llvm-svn: 357100
* [X86MacroFusion][NFC] Add a bulldozer test.Clement Courbet2019-03-271-1/+23
| | | | llvm-svn: 357099
* Reapply "AMDGPU: Scavenge register instead of findUnusedReg"Matt Arsenault2019-03-272-1/+45
| | | | | | | | | | | | | This reapplies r356149, using the correct overload of findUnusedReg which passes the current iterator. This worked most of the time, because the scavenger iterator was moved at the end of the frame index loop in PEI. This would fail if the spill was the first instruction. This was further hidden by the fact that the scavenger wasn't passed in for normal frame index elimination. llvm-svn: 357098
* AMDGPU: Add testcase I meant to merge into r357093Matt Arsenault2019-03-271-0/+37
| | | | llvm-svn: 357097
* [X86] Add post-isel pseudos for rotate by immediate using SHLD/SHRDCraig Topper2019-03-274-26/+52
| | | | | | | | | | | | | | Haswell CPUs have special support for SHLD/SHRD with the same register for both sources. Such an instruction will go to the rotate/shift unit on port 0 or 6. This gives it 1 cycle latency and 0.5 cycle reciprocal throughput. When the register is not the same, it becomes a 3 cycle operation on port 1. Sandybridge and Ivybridge always have 1 cycle latency and 0.5 cycle reciprocal throughput for any SHLD. When the FastSHLDRotate feature flag is set, we try to use SHLD for rotate by immediate unless BMI2 is enabled. But MachineCopyPropagation can look through a copy and change one of the sources to be different. This will break the hardware optimization. This patch adds a pseudo instruction to hide the second source input until after register allocation and MachineCopyPropagation. I'm not sure if this is the best way to do this or if there's some other way we can make this work. Fixes PR41055 Differential Revision: https://reviews.llvm.org/D59391 llvm-svn: 357096
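The reason SHLD with identical sources acts as a rotate can be checked with a small model of the instruction. This is a sketch of the 32-bit arithmetic only; `shld32` and `rotl32` are illustrative names, and the shift count is assumed to be in the range 1 to 31 so that no shift by 32 occurs.

```cpp
#include <cstdint>

// Models 32-bit SHLD: shift Dst left by Count and fill the vacated low bits
// with the top Count bits of Src. Count is assumed to be in [1, 31].
uint32_t shld32(uint32_t Dst, uint32_t Src, unsigned Count) {
  return (Dst << Count) | (Src >> (32 - Count));
}

// With the same value in both sources, the bits shifted out of the top come
// back in at the bottom, which is exactly a rotate left.
uint32_t rotl32(uint32_t X, unsigned Count) {
  return shld32(X, X, Count);
}
```

If the two sources stop being the same register, the result is still correct as long as they hold the same value, but per the message above the hardware fast path no longer applies.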
* [PeepholeOpt] Don't stop simplifying copies on sequence of subregsQuentin Colombet2019-03-272-6/+35
| | | | | | | | | | | | | | | This patch removes an overly conservative check that would prevent simplifying copies when the value we were tracking would go through several subregister indices. Indeed, the intent of this check was to not track values whenever we have to compose subregisters, but actually what the check was doing was bailing anytime we see a second subreg, even if that second subreg would actually be the new source of truth (as opposed to a part of that subreg). Differential Revision: https://reviews.llvm.org/D59891 llvm-svn: 357095
* [AArch64][SVE] Asm: error on unexpected SVE vector register type suffixSander de Smalen2019-03-274-3/+37
| | | | | | | | | | | | | | | | | | | | | | This patch fixes an assembler bug that allowed SVE vector registers to contain a type suffix when not expected. The SVE unpredicated movprfx instruction is the only instruction affected. The following are examples of what was previously valid: movprfx z0.b, z0.b movprfx z0.b, z0.s movprfx z0, z0.s These instructions are now erroneous. Patch by Cullen Rhodes (c-rhodes) Reviewed By: sdesmalen Differential Revision: https://reviews.llvm.org/D59636 llvm-svn: 357094
* AMDGPU: Enable the scavenger for large framesMatt Arsenault2019-03-272-19/+24
| | | | | | | Another test is needed for the case where the scavenge fails, but there's another issue with that which needs an additional fix. llvm-svn: 357093
* Fix occurrences of _LIBCPP_ASSERT in span testsCasey Carter2019-03-273-8/+8
| | | | llvm-svn: 357092
* AMDGPU: Add additional MIR tests for exec mask optimizationsMatt Arsenault2019-03-274-10/+737
| | | | | | | | | | Also includes one example of how this transform is unsound. This isn't verifying the copies are used in the control flow intrinsic patterns. Also add an option to disable the exec mask opt pass. Since this pass is unsound, it may be useful to turn it off until it is fixed. llvm-svn: 357091
* AMDGPU: Skip debug_instr when collapsing end_cfMatt Arsenault2019-03-272-3/+118
| | | | | | | Based on how these are inserted, I doubt this was causing a problem in practice. llvm-svn: 357090
* AMDGPU: Fix missing scc implicit def on s_andn2_b64_termMatt Arsenault2019-03-275-39/+34
| | | | | | | Introduce new helper class to copy properties directly from the base instruction. llvm-svn: 357089
* New methods to check for under-/overflow in the SMT APIMikhail R. Gadelha2019-03-272-0/+112
| | | | | | | | | | | | | | | | Summary: Added methods to check for under-/overflow in additions, subtractions, signed divisions/modulus, negations, and multiplications. Reviewers: ddcc, gou4shi1 Reviewed By: ddcc, gou4shi1 Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D59796 llvm-svn: 357088
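The conditions such checks express can be written out for fixed-width integers. The sketch below is a plain C++ model for 32-bit signed values, not the SMT API added by this commit; all function names are hypothetical. Each predicate is arranged so that it never performs the overflowing operation itself.

```cpp
#include <cstdint>
#include <limits>

// Illustrative 32-bit signed predicates (hypothetical names, not the SMT API).
constexpr int32_t kMax = std::numeric_limits<int32_t>::max();
constexpr int32_t kMin = std::numeric_limits<int32_t>::min();

// A + B exceeds the maximum only when B is positive and A is too close to it.
bool addOverflows(int32_t A, int32_t B) { return B > 0 && A > kMax - B; }

// A + B drops below the minimum only when B is negative.
bool addUnderflows(int32_t A, int32_t B) { return B < 0 && A < kMin - B; }

// Negation overflows only for the minimum value, which has no positive twin.
bool negOverflows(int32_t A) { return A == kMin; }

// Signed division overflows only for kMin / -1.
bool sdivOverflows(int32_t A, int32_t B) { return A == kMin && B == -1; }
```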
* PEI: Delay checking requiresFrameIndexReplacementScavengingMatt Arsenault2019-03-271-4/+10
| | | | | | | | | | Currently this is called before the frame size is set on the function. For AMDGPU, the scavenger is used for large frames where part of the offset needs to be materialized in a register, so estimating the frame size is useful for knowing whether the scavenger is useful. llvm-svn: 357087
* [Platform] Remove Kalimba PlatformJonas Devlieghere2019-03-2711-380/+10
| | | | | | | | | This patch removes the Kalimba platform. For more information please refer to the corresponding thread on the mailing list. http://lists.llvm.org/pipermail/lldb-dev/2019-March/014921.html llvm-svn: 357086
* [MCA] Fix -Wparentheses warning breaking the -Werror build.Andrea Di Biagio2019-03-271-1/+2
| | | | | | Warning was introduced at r357074. llvm-svn: 357085
* AMDGPU: Don't hardcode num defs for MUBUF instructionsMatt Arsenault2019-03-271-2/+2
| | | | | | | This shouldn't change anything since the no-ret atomics are selected later. llvm-svn: 357084
* MIR: Freeze reserved regs after parsing everythingMatt Arsenault2019-03-276-10/+57
| | | | | | | | | | | | The AMDGPU implementation of getReservedRegs depends on MachineFunctionInfo fields that are parsed from the YAML section. This was reserving the wrong register since it was setting the reserved regs before parsing the correct one. Some tests were relying on the default reserved set for the assumed default calling convention. llvm-svn: 357083
* [clangd] Bump vscode-clangd v0.0.12.Haojian Wu2019-03-271-1/+1
| | | | | | | | | CHANGELOG: - add an explicit command to activate the extension. - support .cu files (the extension is not activated for .cu files by default, you need to manually activate the extension). llvm-svn: 357082
* AMDGPU: wave_barrier is not isBarrierMatt Arsenault2019-03-272-1/+12
| | | | | | | This is not a control flow instruction, so should not be marked as isBarrier. This fixes a verifier error if followed by unreachable. llvm-svn: 357081
* Rename some variables in the std-module testsPavel Labath2019-03-272-2/+2
| | | | | | | They cause failures on some systems due to an unrelated bug (pr35043). This works around that. llvm-svn: 357080
* [libc++] Add proper XFAILs for shared_mutex testsLouis Dionne2019-03-2715-2/+17
| | | | | | | | | | Dylib support for shared_mutex was added in macOS 10.12, so the tests should be XFAILed accordingly instead of being completely disabled whenever availability is enabled. rdar://problem/48769104 llvm-svn: 357079
* [clangd] Fix the inconsistent code indent in vscode extension, NFC.Haojian Wu2019-03-271-21/+21
| | | | llvm-svn: 357078
* [BPF] use std::map to ensure consistent outputYonghong Song2019-03-272-42/+69
| | | | | | | | | | | | | | | | | | | | | | | | | The .BTF.ext FuncInfoTable and LineInfoTable contain information organized per ELF section. Current definition of FuncInfoTable/LineInfoTable is: std::unordered_map<uint32_t, std::vector<BTFFuncInfo>> FuncInfoTable std::unordered_map<uint32_t, std::vector<BTFLineInfo>> LineInfoTable where the key is the section name offset in the string table. The unordered_map may cause the order of section output to differ across platforms. The same applies to the unordered_map definition of std::unordered_map<std::string, std::unique_ptr<BTFKindDataSec>> DataSecEntries where BTF_KIND_DATASEC entries may have different ordering on different platforms. This patch fixes the issue by using std::map. Test static-var-derived-type.ll is modified to generate two DataSec's which will ensure the ordering is the same for all supported platforms. Signed-off-by: Yonghong Song <yhs@fb.com> llvm-svn: 357077
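The ordering guarantee that motivates the switch can be demonstrated in isolation. In this minimal sketch the helper `sectionOrder` and the string payload are hypothetical stand-ins for the BTF tables. `std::map` iterates in ascending key order regardless of insertion order, whereas the iteration order of `std::unordered_map` is unspecified and may differ between standard library implementations.

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Returns the keys in the order a table emitter would visit them.
// With std::map this is always ascending key order.
std::vector<uint32_t>
sectionOrder(const std::map<uint32_t, std::vector<std::string>> &Table) {
  std::vector<uint32_t> Keys;
  for (const auto &Entry : Table)
    Keys.push_back(Entry.first);
  return Keys;
}
```

Inserting keys 30, 10, 20 into the map and calling `sectionOrder` yields 10, 20, 30 on every platform, which is what makes the emitted output reproducible.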
* [X86MacroFusion][NFC] Improve macrofusion testing.Clement Courbet2019-03-271-11/+135
| | | | | | | Add negative tests. Add arithmetic/inc/cmp/and macrofusion tests. llvm-svn: 357076
* [clangd] Add activate command to the vscode extension.Haojian Wu2019-03-272-1/+10
| | | | | | | | | | | | | | | | Summary: This would help minimize the annoying part of not activating the extension for .cu files. Reviewers: ilya-biryukov Subscribers: ioeric, MaskRay, jkorous, arphaman, kadircet, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D59817 llvm-svn: 357075
* [MCA][Pipeline] Don't visit stages in reverse order when calling method ↵Andrea Di Biagio2019-03-271-3/+3
| | | | | | | | | | cycleEnd(). NFCI There is no reason why stages should be visited in reverse order. This patch allows the definition of stages that push instructions forward from their cycleEnd() routine. llvm-svn: 357074
* AMDGPU: Fix areLoadsFromSameBasePtr for DS atomicsMatt Arsenault2019-03-272-4/+28
| | | | | | The offset operand index is different for atomics. llvm-svn: 357073
* [LLD] Restore tests that use "-" as outputAndrew Ng2019-03-275-38/+19
| | | | | | | | | | No longer require workarounds for output to "-" (stdout) for Windows. These workarounds were just hiding the actual problem which has been fixed in r357058. Differential Revision: https://reviews.llvm.org/D59824 llvm-svn: 357072
* gn build: Merge r357047Nico Weber2019-03-271-0/+1
| | | | llvm-svn: 357071
* [DAGCombiner] Unify Lifetime and memory Op aliasing.Nirav Dave2019-03-273-89/+133
| | | | | | | | | | | | | | | | | | | Rework BaseIndexOffset and isAlias to fully work with lifetime nodes and fold in lifetime alias analysis. This is mostly NFC. Reviewers: courbet Reviewed By: courbet Subscribers: hiraditya, jdoerfert, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D59794 llvm-svn: 357070
* [DAGCombine] Refactor GatherAllAliases. NFCI.Nirav Dave2019-03-271-65/+66
| | | | llvm-svn: 357069
* [OPENMP]Initial support for 'allocate' clause.Alexey Bataev2019-03-27197-282/+737
| | | | | | Added parsing/sema analysis of the allocate clause. llvm-svn: 357068
* Re-commit r355490 "[CodeGen] Omit range checks from jump tables when ↵Hans Wennborg2019-03-275-72/+153
| | | | | | | | | | | | | | | | | | | | | | | | | | | lowering switches with unreachable default" Original commit by Ayonam Ray. This commit adds a regression test for the issue discovered in the previous commit: that the range check for the jump table can only be omitted if the fall-through destination of the jump table is unreachable, which isn't necessarily true just because the default of the switch is unreachable. This addresses the missing optimization in PR41242. > During the lowering of a switch that would result in the generation of a > jump table, a range check is performed before indexing into the jump > table, for the switch value being outside the jump table range and a > conditional branch is inserted to jump to the default block. In case the > default block is unreachable, this conditional jump can be omitted. This > patch implements omitting this conditional branch for unreachable > defaults. > > Differential Revision: https://reviews.llvm.org/D52002 > Reviewers: Hans Wennborg, Eli Freidman, Roman Lebedev llvm-svn: 357067
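The range check being discussed can be pictured with a hand-written jump table. This is a source-level analogy only, with hypothetical names, and is not what the SelectionDAG lowering emits: the `if` guarding the table lookup plays the role of the conditional branch to the default block.

```cpp
#include <cstddef>

static int caseA() { return 10; }
static int caseB() { return 20; }
static int caseC() { return 30; }

// Hand-rolled jump table dispatch, for illustration only.
int dispatch(unsigned Value) {
  static int (*const Table[])() = {caseA, caseB, caseC};
  constexpr std::size_t NumCases = sizeof(Table) / sizeof(Table[0]);
  // Range check: values outside the table go to the default path.
  // Dropping this branch is only safe if the path it guards can never be
  // taken, which is the condition the re-commit above tightens.
  if (Value >= NumCases)
    return -1; // default path
  return Table[Value]();
}
```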
* Revert of 357063 [AMDGPU][MC] Corrected handling of tied src for atomic ↵Dmitry Preobrazhensky2019-03-272-15/+7
| | | | | | | return MUBUF opcodes Reason: the change was mistakenly committed before review llvm-svn: 357066
* The IR verifier currently supports the constrained floating point intrinsics,Kevin P. Neal2019-03-271-11/+62
| | | | | | | | | | | | | | but the implementation is hard to extend. It doesn't currently have an easy way to support intrinsics that, for example, lack a rounding mode. This will be needed for impending new constrained intrinsics. This code is split out of D55897 <https://reviews.llvm.org/D55897>, which itself was split out of D43515 <https://reviews.llvm.org/D43515>. Reviewed by: arsenm Differential Revision: http://reviews.llvm.org/D59830 llvm-svn: 357065
* [AArch64] NFC: Cleanup isAArch64FrameOffsetLegalSander de Smalen2019-03-272-202/+109
| | | | | | | | | | | | | | | | Cleanup isAArch64FrameOffsetLegal by: - Merging the large switch statement to reuse AArch64InstrInfo::getMemOpInfo(). - Using AArch64InstrInfo::getUnscaledLdSt() to determine whether an instruction has an unscaled variant. - Simplifying the logic that calculates the offset to fit the immediate. Reviewers: paquette, evandro, eli.friedman, efriedma Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D59636 llvm-svn: 357064
* [AMDGPU][MC] Corrected handling of tied src for atomic return MUBUF opcodesDmitry Preobrazhensky2019-03-272-7/+15
| | | | | | | | | | See bug 40917: https://bugs.llvm.org/show_bug.cgi?id=40917 Reviewers: artem.tamazov, arsenm Differential Revision: https://reviews.llvm.org/D59305 llvm-svn: 357063
* [X86][SSE] Add shuffle test case for PR41249Simon Pilgrim2019-03-271-0/+37
| | | | llvm-svn: 357062
* Revert the r348352 "[clang] - Simplify tools::SplitDebugName."George Rimar2019-03-275-13/+21
| | | | | | | | | This partially reverts the r348352 (https://reviews.llvm.org/D55006) because of https://bugs.llvm.org/show_bug.cgi?id=41161. I did not revert the test case file because it passes fine now. llvm-svn: 357061
* minidump: Add ability to attach (breakpad) symbol files to placeholder modulesPavel Labath2019-03-275-52/+115
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This re-commits r354263, which was reverted because it uncovered a problem with the handling of modules with empty (zero) UUIDs. This would cause us to treat two modules as identical even though they were not. This caused an assert in PlaceholderObjectFile::SetLoadAddress to fire, because we were trying to load the module twice even though it was designed to be only loaded at a specific address. (The same problem also existed with the previous implementation, but it had no asserts to warn us about this.) These issues have now been fixed in r356896. An earlier version of this patch also ran into problems on the windows bot. The issue there was that ObjectFilePECOFF vended its base address through the incorrect interface. SymbolFilePDB depended on that, which led to assertion failures when SymbolFilePDB was attempting to use the placeholder object files as a base. This has been fixed in r354258. The original commit message was: The reason this wasn't working was that ProcessMinidump was creating odd object-file-less modules, and SymbolFileBreakpad required the module to have an associated object file because it needed to get its base address. This fixes that by introducing a PlaceholderObjectFile to serve as a dummy object file. The general idea for this is taken from D55142, but I've reworked it a bit to avoid the need for the PlaceholderModule class. Now that we have an object file, our modules are sufficiently similar to regular modules that we can use the regular Module class almost out of the box -- the only thing I needed to tweak was the Module::CreateModuleFromObjectFile function to set the module's FileSpec in addition to its architecture. This wasn't needed for ObjectFileJIT (the other user of CreateModuleFromObjectFile), but it shouldn't hurt it either, and the change seems like a straightforward extension of this function.
Reviewers: clayborg, lemo, amccarth Subscribers: lldb-commits Differential Revision: https://reviews.llvm.org/D57751 llvm-svn: 357060
* [AArch64] Adds cases for LDRSHWui and LDRSHXui to getMemOpInfoSander de Smalen2019-03-271-0/+6
| | | | | | | This patch also adds cases for PRFUMi and PRFMui. This change was discussed in https://reviews.llvm.org/D59635. llvm-svn: 357059
* [Support] MemoryBlock size should reflect the requested sizeAndrew Ng2019-03-271-3/+4
| | | | | | | | | | This patch mirrors the change made to the Unix equivalent in r351916. This in turn fixes bugs related to the use of FileOutputBuffer to output to "-", i.e. stdout, on Windows. Differential Revision: https://reviews.llvm.org/D59663 llvm-svn: 357058
* Revert rL356864 : [X86][SSE41] Start shuffle combining from ↵Simon Pilgrim2019-03-2714-497/+527
| | | | | | | | | | | | ZERO_EXTEND_VECTOR_INREG (PR40685) Enable SSE41 ZERO_EXTEND_VECTOR_INREG shuffle combines - for the PMOVZX(PSHUFD(V)) -> UNPCKH(V,0) pattern we reduce the shuffles (port5-bottleneck on Intel) at the expense of creating a zero (pxor v,v) and an extra register move - which is a good trade off as these are pretty cheap and in most cases it doesn't increase register pressure. This also exposed a missed opportunity to use combine to ZERO_EXTEND_VECTOR_INREG with folded loads - even if we're in the float domain. ........ Causes PR41249 llvm-svn: 357057
* Fix a "memset clearing an object of non-trivial type" warning in DWARFFormValuePavel Labath2019-03-271-1/+1
| | | | | | | | This is diagnosed by gcc-8. The ValueType struct already has a default constructor which performs zero-initialization, so we can just call that instead of using memset. llvm-svn: 357056
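The replacement of memset by the default constructor can be sketched as follows. The struct here is a hypothetical stand-in and not the actual DWARFFormValue::ValueType definition; it only assumes, as the message states, that the default constructor zero-initializes every member.

```cpp
#include <cstdint>

// Hypothetical stand-in for a value struct whose default constructor
// zero-initializes all members.
struct Value {
  Value() : UVal(0), Data(nullptr) {}
  uint64_t UVal;
  const uint8_t *Data;
};

// Resets V by assigning a default-constructed object. This replaces
// memset(&V, 0, sizeof(V)), which gcc-8 diagnoses because Value has a
// user-provided constructor and is therefore not a trivial type.
void reset(Value &V) {
  V = Value();
}
```

Assigning a fresh object also stays correct if a member is later added whose zero state is not all-zero bytes.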