summaryrefslogtreecommitdiffstats
path: root/llvm/lib
Commit message (Collapse)AuthorAgeFilesLines
* [WebAssembly] Remove trailing whitespaces in tests (NFC)Heejin Ahn2019-03-061-1/+1
| | | | | | | | | | | | Reviewers: sbc100 Subscribers: dschuff, jgravelle-google, sunfish, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D58955 llvm-svn: 355472
* Revert "[AtomicExpand] Allow libcall expansion for non-zero address spaces" ↵Mitch Phillips2019-03-061-8/+2
| | | | | | for buildbot failures. llvm-svn: 355461
* [ARM] Sink zext/sext operands for add and sub to enable vsubl generation.Florian Hahn2019-03-062-0/+45
| | | | | | | | | | | | | | | This uses the infrastructure added in rL353152 to sink zext and sexts to sub/add users, to enable vsubl/vaddl generation when NEON is available. See https://bugs.llvm.org/show_bug.cgi?id=40025. Reviewers: SjoerdMeijer, t.p.northover, samparker, efriedma Reviewed By: samparker Differential Revision: https://reviews.llvm.org/D58063 llvm-svn: 355460
* [DWARFFormValue] Don't consider DW_FORM_data4/8 to be section offsets.Jonas Devlieghere2019-03-051-7/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | When dumping ToT clan's debug info with dwarfdump, we were seeing an error saying that that the location list overflows the debug_loc section. After reducing the testcase we figured out that we were interpreting the DW_FORM_data4 as a section offset. In DWARF3 DW_FORM_data4 and DW_FORM_data8 served also as a section offset. Until now we didn't check check for the DWARF version, because some producers (read old versions of clang) were still emitting this. The relevant code/comment was added in 2013, and I believe it's now reasonable to start checking the version. The FormValue class is a little bit of a mess because it cashes the DWARF unit and context when it extracted the value itself. Several methods of the class rely on it being present, or return an Optional for the code path that needs it. At the same time the FormValue class also used in places where there's no DWARF unit. For this patch I went with the least invasive change: checking the version from the CU when it's available. If it's not (because the form value was created from a value directly) we default to the old behavior. Differential revision: https://reviews.llvm.org/D58698 llvm-svn: 355456
* [AtomicExpand] Allow libcall expansion for non-zero address spacesPhilip Reames2019-03-051-2/+8
| | | | | | | | Be consistent about how we treat atomics in non-zero address spaces. If we get to the backend, we tend to lower them as if in address space 0. Do the same if we need to insert a libcall instead. Differential Revision: https://reviews.llvm.org/D58760 llvm-svn: 355453
* [WebAssembly] Simplify iterator navigations (NFC)Heejin Ahn2019-03-055-21/+15
| | | | | | | | | | | | | | | | | | | | Summary: - Replaces some uses of `MachineFunction::iterator(MBB)` with `MBB->getIterator()` and `MachineBasicBlock::iterator(MI)` with `MI->getIterator()`, which are simpler. - Replaces some uses of `std::prev` of `std::next` that takes a MachineFunction or MachineBasicBlock iterator with `getPrevNode` and `getNextNode`, which are also simpler. Reviewers: sbc100 Subscribers: dschuff, sunfish, jgravelle-google, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D58913 llvm-svn: 355444
* [Remarks][NFC] Rename RemarkParser to YAMLRemarkParserFrancis Visoiu Mistrih2019-03-051-19/+20
| | | | | | Rename it to reflect that it's parsing YAML remarks. llvm-svn: 355441
* [OptRemarks] Make OptRemarks more generic: rename OptRemarks to RemarksFrancis Visoiu Mistrih2019-03-056-38/+37
| | | | | | | | | | | | | | | Getting rid of the name "optimization remarks" for anything that involves handling remarks on the client side. It's safer to do this now, before we get stuck with that name in all the APIs and public interfaces we decide to export to users in the future. This renames llvm/tools/opt-remarks to llvm/tools/remarks-shlib, and now generates `libRemarks.dylib` instead of `libOptRemarks.dylib`. Differential Revision: https://reviews.llvm.org/D58535 llvm-svn: 355439
* [WebAssembly] Disable MachineBlockPlacement passHeejin Ahn2019-03-051-0/+4
| | | | | | | | | | | | | | | | | Summary: This pass hurts code size for wasm and sometimes generates irreducible control flow. Context: https://github.com/emscripten-core/emscripten/pull/8233 Reviewers: kripken, dschuff Subscribers: sunfish, sbc100, jgravelle-google, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D58953 llvm-svn: 355437
* Revert r355224 "[TableGen][SelectionDAG][X86] Add specific isel matchers for ↵Craig Topper2019-03-055-59/+59
| | | | | | | | immAllZerosV/immAllOnesV. Remove bitcasts from X86 patterns that are no longer necessary." This caused the first matcher in the isel table for many targets to Opc_Scope instead of Opc_SwitchOpcode. This leads to a significant increase in isel match failures. llvm-svn: 355433
* [Subtarget] Merge ProcSched and ProcDesc arrays in MCSubtargetInfo into a ↵Craig Topper2019-03-052-14/+10
| | | | | | | | | | | | single array. These arrays are both keyed by CPU name and go into the same tablegenerated file. Merge them so we only need to store keys once. This also removes a weird space saving quirk where we used the ProcDesc.size() to create to build an ArrayRef for ProcSched. Differential Revision: https://reviews.llvm.org/D58939 llvm-svn: 355431
* [X86] In X86DomainReassignment.cpp add enclosed registers to EnclosedEdgesGuozhi Wei2019-03-051-0/+1
| | | | | | | | | | The variable X86DomainReassignment::EnclosedEdges is used to store registers that have been enclosed in some closure, so those registers will be ignored when create new closures. But there is no registers has ever been put into this set, so a single register can be enclosed in multiple closures, it significantly increase compile time. This patch adds a register into EnclosedEdges when it is enclosed into a closure. Differential Revision: https://reviews.llvm.org/D58646 llvm-svn: 355430
* [Subtarget] Create a separate SubtargetSubtargetKV struct for ProcDesc to ↵Craig Topper2019-03-052-9/+11
| | | | | | | | | | | | remove fields from the stack tables that aren't needed for CPUs The description for CPUs was just the CPU name wrapped with "Select the " and " processor". We can just do that directly in the help printer instead of making a separate version in the binary for each CPU. Also remove the Value field that isn't needed and was always 0. Differential Revision: https://reviews.llvm.org/D58938 llvm-svn: 355429
* [Subtarget] Move SubtargetFeatureKV/SubtargetInfoKV from SubtargetFeature.h ↵Craig Topper2019-03-052-193/+160
| | | | | | | | | | | | | | | | to MCSubtargetInfo.h. Move all code that operates on ProcFeatures and ProcDesc arrays to MCSubtargetInfo. The SubtargetFeature class managed a list of features as strings. And it also had functions for setting bits in a FeatureBitset. The methods that operated on the Feature list as strings are used in other parts of the backend. But the parts that operate on FeatureBitset are very tightly coupled to MCSubtargetInfo and requires passing in the arrays that MCSubtargetInfo owns. And the same struct type is used for ProcFeatures and ProcDesc. This has led to MCSubtargetInfo having 2 arrays keyed by CPU name. One containing a mapping from a CPU name to its features. And one containing a mapping from CPU name to its scheduler model. I would like to make a single CPU array containing all CPU information and remove some unneeded fields the ProcDesc array currently has. But I don't want to make SubtargetFeatures.h have to know about the scheduler model type and have to forward declare or pull in the header file. Differential Revision: https://reviews.llvm.org/D58937 llvm-svn: 355428
* AMDGPU: Preserve undef flag when expanding SI_IFMatt Arsenault2019-03-051-2/+2
| | | | | | Fixes undefined value verifier error. llvm-svn: 355426
* [X86] Enable 8-bit SHL to convert to LEACraig Topper2019-03-052-1/+5
| | | | | | Differential Revision: https://reviews.llvm.org/D58870 llvm-svn: 355425
* [X86] Allow 8-bit INC/DEC to be converted to LEA.Craig Topper2019-03-052-6/+9
| | | | | | | | We already do this for 16/32/64 as well as 8-bit add with register/immediate. Might as well do it for 8-bit INC/DEC too. Differential Revision: https://reviews.llvm.org/D58869 llvm-svn: 355424
* [X86] Enable 8-bit OR with disjoint bits to convert to LEACraig Topper2019-03-056-10/+33
| | | | | | | | We already support 8-bits adds in convertToThreeAddress. But we can also support 8-bit OR if the bits are disjoint. We already do this for 16/32/64. Differential Revision: https://reviews.llvm.org/D58863 llvm-svn: 355423
* TableGen: Allow lists to be concatenated through '#'Javed Absar2019-03-052-4/+37
| | | | | | | | | | | | | | | | | | | Currently one can concatenate strings using hash(#), but not lists, although that would be a natural thing to do. This patch allows one to write something like: def : A<!listconcat([1,2], [3,4])>; simply as : def : A<[1,2] # [3,4]>; This was missing feature was highlighted by Nicolai at FOSDEM talk. Reviewed by: nhaehnle, hfinkel Differential Revision: https://reviews.llvm.org/D58895 llvm-svn: 355414
* [SDAG] move FP constant folding to helper function; NFCSanjay Patel2019-03-051-67/+72
| | | | llvm-svn: 355411
* Revert "[GlobalISel][AArch64] Add selection support for G_EXTRACT_VECTOR_ELT"Jessica Paquette2019-03-053-135/+18
| | | | | | | | | | | This broke test-suite::aarch64_neon_intrinsics.test Reverting while I look into it. Example failure: http://lab.llvm.org:8011/builders/clang-cmake-aarch64-quick/builds/17740 llvm-svn: 355408
* [AMDGPU] Fix DPP operand order in atomic optimizerCarl Ritson2019-03-051-1/+1
| | | | | | | | | | | | | | | | | | | Summary: Ensure order of operands in DPP atomic optimizer final WWM step is appropriate for sub instructions. Change-Id: I631d050e1c00a3b4bc7c11a90437064403c4cf30 Reviewers: sheredom, tpr Reviewed By: sheredom Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, t-tye, jfb, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D58900 llvm-svn: 355394
* [SCEV] Ensure that isHighCostExpansion takes into account what is being dividedDavid Green2019-03-051-2/+5
| | | | | | | | | | | | | A SCEV is not low-cost just because you can divide it by a power of 2. We need to also check what we are dividing to make sure it too is not a high-code expansion. This helps to not expand the exit value of certain loops, helping not to bloat the code. The change in no-iv-rewrite.ll is reverting back to what it was testing before rL194116, and looks a lot like the other tests in replace-loop-exit-folds.ll. Differential Revision: https://reviews.llvm.org/D58435 llvm-svn: 355393
* [WebAssembly] Rename a variable in LateEHPrepare (NFC)Heejin Ahn2019-03-051-2/+2
| | | | llvm-svn: 355387
* [ARM] Fix select_cc lowering for fp16Oliver Stannard2019-03-051-7/+11
| | | | | | | | | | | When lowering a select_cc node where the true and false values are of type f16, we can't use a general conditional move because the FP16 instructions do not support conditional execution. Instead, we must ensure that the condition code is one of the four supported by the VSEL instruction. Differential revision: https://reviews.llvm.org/D58813 llvm-svn: 355385
* [AMDGPU] Omit KILL instructions from hazard recognizerDavid Stuttard2019-03-051-3/+2
| | | | | | | | | | | | | | | | | | Summary: In some cases the KILL was causing a hazard to be introduced as these were scheduled into hazard slots, but don't result in an instruction. KILL shouldn't be considered for hazard recognition. Change-Id: Ib6d2a2160f8c94cd0ce611ab198c7e4f46aeffcf Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, tpr, t-tye, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D58898 llvm-svn: 355384
* [PowerPC] fix killed/dead flag after convert x-form to d-form tranformation.Chen Zheng2019-03-052-21/+159
| | | | | | Differential Revision: https://reviews.llvm.org/D58428 llvm-svn: 355378
* [AMDGPU] Implement AMDGPUMCInstrAnalysisScott Linder2019-03-052-0/+33
| | | | | | | | | Implement MCInstrAnalysis for AMDGPU, with default implementations save for `evaluateBranch`. Differential Revision: https://reviews.llvm.org/D58400 llvm-svn: 355373
* PHI nodes are not `FPMathOperator` sSanjoy Das2019-03-051-2/+1
| | | | | | | | | | | | | | Reviewers: chandlerc, arsenm Reviewed By: arsenm Subscribers: wdng, arsenm, mcrosier, jlebar, bixia, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D58887 llvm-svn: 355362
* [X86] Reduce some patterns by using FP instructions for integer types even ↵Craig Topper2019-03-051-61/+9
| | | | | | | | | | | | when AVX2 is available and execution domain fixing will do the right thing We have quite a few cases of using FP instructions for integer operations when only AVX1 is available. Then we switch to integer instructions with AVX2. In a lot of these cases execution domain fixing will take care of turning FP instructions into integer if its profitable. With this patch we just keep on using the FP instructions even with AVX2. I've only handled some cases that don't require messing with patterns that are defined in the instruction definition. Those will require more subtle multiclass work possibly involving null_frag, hasSideEffects = 0, etc. Differential Revision: https://reviews.llvm.org/D58470 llvm-svn: 355361
* [BPF] Do not generate BTF sections unnecessarilyYonghong Song2019-03-051-0/+8
| | | | | | | | | | | | If There is no types/non-empty strings, do not generate .BTF section. If there is no func_info/line_info, do not generate .BTF.ext section. Signed-off-by: Yonghong Song <yhs@fb.com> Differential Revision: https://reviews.llvm.org/D58936 llvm-svn: 355360
* [msan] Instrument x86 BMI intrinsics.Evgeniy Stepanov2019-03-041-0/+31
| | | | | | | | | | | | | | | | Summary: They simply shuffle bits. MSan needs to do the same with shadow bits, after making sure that the shuffle mask is fully initialized. Reviewers: pcc, vitalybuka Subscribers: hiraditya, #sanitizers, llvm-commits Tags: #sanitizers, #llvm Differential Revision: https://reviews.llvm.org/D58858 llvm-svn: 355348
* [NFC] Fix PGO link error in shared libs buildJordan Rupprecht2019-03-041-1/+8
| | | | llvm-svn: 355346
* [CodeGenPrepare] avoid crashing on non-canonical/degenerate codeSanjay Patel2019-03-041-0/+5
| | | | | | | | | | | | The test is reduced from an example in the post-commit thread for: rL354746 http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20190304/632396.html While we must avoid dying here, the real question should be: Why is non-canonical and/or degenerate code making it to CGP when using the new pass manager? llvm-svn: 355345
* [GlobalISel][AArch64] Add selection support for G_EXTRACT_VECTOR_ELTJessica Paquette2019-03-043-18/+135
| | | | | | | | | | | | | This adds instruction selection support for G_EXTRACT_VECTOR_ELT for cases where the index is defined by a G_CONSTANT. It also factos out the lane copy opcode selection part into its own function, `getLaneCopyOpcode`. This is used by both `selectUnmergeValues` and `selectExtractElt`. Differential Revision: https://reviews.llvm.org/D58469 llvm-svn: 355344
* [GlobalISel][AArch64] Legalize vector G_SELECTJessica Paquette2019-03-041-1/+4
| | | | | | | | Just scalarize it, and add a test showing it works. Differential Revision: https://reviews.llvm.org/D58747 llvm-svn: 355339
* [ConstantHoisting] avoid hang/crash from unreachable blocks (PR40930)Sanjay Patel2019-03-041-1/+6
| | | | | | | | | | | | | | | | | I'm not too familiar with this pass, so there might be a better solution, but this appears to fix the degenerate: PR40930 PR40931 PR40932 PR40934 ...without affecting any real-world code. As we've seen in several other passes, when we have unreachable blocks, they can contain semi-bogus IR and/or cause unexpected conditions. We would not typically expect these patterns to make it this far, but we have to guard against them anyway. llvm-svn: 355337
* [PGO] Context sensitive PGO (part 3)Rong Xu2019-03-043-40/+109
| | | | | | | | Part 3 of CSPGO changes (mostly related to PassMananger). Differential Revision: https://reviews.llvm.org/D54175 llvm-svn: 355330
* Re-commit r355104: "[AArch64][GlobalISel] Add support for 64 bit vector ↵Amara Emerson2019-03-041-36/+153
| | | | | | | | | | | | shuffle using TBL1." The code to materialize a mask from a constant pool load tried to use a 128 bit LDR to load a 64 bit constant pool entry, which was 8 byte aligned. This resulted in a link failure in the NEON tests in the test suite since the LDR address was unaligned. This change fixes that to instead emit a 64 bit LDR if the entry is 64 bit, before converting back to a 128 bit register for the TBL. llvm-svn: 355326
* [MC] Teach ELFObjectWriter that parse-time variables do not appear inNirav Dave2019-03-041-0/+4
| | | | | | symbol table. llvm-svn: 355325
* [DAGCombiner][X86][SystemZ][AArch64] Combine some cases of (bitcast ↵Craig Topper2019-03-041-6/+9
| | | | | | | | | | (build_vector constants)) between legalize types and legalize dag. This patch enables combining integer bitcasts of integer build vectors when the new scalar type is legal. I've avoided floating point because the implementation bitcasts float to int along the way and we would need to check the intermediate types for legality Differential Revision: https://reviews.llvm.org/D58884 llvm-svn: 355324
* [WebAssembly] Add support for data sections in the assembler.Wouter van Oortmerssen2019-03-043-10/+57
| | | | | | | | | | | | | | | | Summary: This is quite minimal so far, introduce them with .section, fill them with .int8 or .asciz, end with .size Reviewers: dschuff, sbc100, aheejin Subscribers: jgravelle-google, sunfish, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D58660 llvm-svn: 355321
* [AMDGPU][MC] Enable lds_direct operand for v_readfirstlane_b32, ↵Dmitry Preobrazhensky2019-03-047-47/+110
| | | | | | | | | | | | v_readlane_b32 and v_writelane_b32 See bug 40662: https://bugs.llvm.org/show_bug.cgi?id=40662 Reviewers: artem.tamazov, arsenm, rampitec Differential Revision: https://reviews.llvm.org/D58713 llvm-svn: 355312
* [MCA] Highlight kernel bottlenecks in the summary view.Andrea Di Biagio2019-03-043-4/+70
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds a new flag named -bottleneck-analysis to print out information about throughput bottlenecks. MCA knows how to identify and classify dynamic dispatch stalls. However, it doesn't know how to analyze and highlight kernel bottlenecks. The goal of this patch is to teach MCA how to correlate increases in backend pressure to backend stalls (and therefore, the loss of throughput). From a Scheduler point of view, backend pressure is a function of the scheduler buffer usage (i.e. how the number of uOps in the scheduler buffers changes over time). Backend pressure increases (or decreases) when there is a mismatch between the number of opcodes dispatched, and the number of opcodes issued in the same cycle. Since buffer resources are limited, continuous increases in backend pressure would eventually leads to dispatch stalls. So, there is a strong correlation between dispatch stalls, and how backpressure changed over time. This patch teaches how to identify situations where backend pressure increases due to: - unavailable pipeline resources. - data dependencies. Data dependencies may delay execution of instructions and therefore increase the time that uOps have to spend in the scheduler buffers. That often translates to an increase in backend pressure which may eventually lead to a bottleneck. Contention on pipeline resources may also delay execution of instructions, and lead to a temporary increase in backend pressure. Internally, the Scheduler classifies instructions based on whether register / memory operands are available or not. An instruction is marked as "ready to execute" only if data dependencies are fully resolved. Every cycle, the Scheduler attempts to execute all instructions that are ready to execute. If an instruction cannot execute because of unavailable pipeline resources, then the Scheduler internally updates a BusyResourceUnits mask with the ID of each unavailable resource. ExecuteStage is responsible for tracking changes in backend pressure. If backend pressure increases during a cycle because of contention on pipeline resources, then ExecuteStage sends a "backend pressure" event to the listeners. That event would contain information about instructions delayed by resource pressure, as well as the BusyResourceUnits mask. Note that ExecuteStage also knows how to identify situations where backpressure increased because of delays introduced by data dependencies. The SummaryView observes "backend pressure" events and prints out a "bottleneck report". Example of bottleneck report: ``` Cycles with backend pressure increase [ 99.89% ] Throughput Bottlenecks: Resource Pressure [ 0.00% ] Data Dependencies: [ 99.89% ] - Register Dependencies [ 0.00% ] - Memory Dependencies [ 99.89% ] ``` A bottleneck report is printed out only if increases in backend pressure eventually caused backend stalls. About the time complexity: Time complexity is linear in the number of instructions in the Scheduler::PendingSet. The average slowdown tends to be in the range of ~5-6%. For memory intensive kernels, the slowdown can be significant if flag -noalias=false is specified. In the worst case scenario I have observed a slowdown of ~30% when flag -noalias=false was specified. We can definitely recover part of that slowdown if we optimize class LSUnit (by doing extra bookkeeping to speedup queries). For now, this new analysis is disabled by default, and it can be enabled via flag -bottleneck-analysis. Users of MCA as a library can enable the generation of pressure events through the constructor of ExecuteStage. This patch partially addresses https://bugs.llvm.org/show_bug.cgi?id=37494 Differential Revision: https://reviews.llvm.org/D58728 llvm-svn: 355308
* [X86] Avoid codegen changes when DBG_VALUE appears between lowered selectsJeremy Morse2019-03-041-4/+15
| | | | | | | | | | | | | | | | X86TargetLowering::EmitLoweredSelect presently detects sequences of CMOV pseudo instructions without accounting for debug intrinsics. This leads to different codegen with and without option -g, if a DBG_VALUE instruction lands in the middle of several lowered selects. Work around this by skipping over debug instructions when looking for CMOV sequences, and sinking those debug insts into the EmitLoweredSelect sunk block. This might slightly shift where variables appear in the instruction sequence, but won't re-order assignments. Differential Revision: https://reviews.llvm.org/D58672 llvm-svn: 355307
* [ARM] Fix selection of VLDR.16 instruction with imm offsetOliver Stannard2019-03-041-10/+5
| | | | | | | | | | | | The isScaledConstantInRange function takes upper and lower bounds which are checked after dividing by the scale, so the bounds checks for half, single and double precision should all be the same. Previously, we had wrong bounds checks for half precision, so selected an immediate the instructions can't actually represent. Differential revision: https://reviews.llvm.org/D58822 llvm-svn: 355305
* [AArch64/ARM] Fix two compiler warnings in InstructionSelector, NFCIJonas Hahnfeld2019-03-042-5/+7
| | | | | | | | | | | 1) GCC complains that KnownValid is set but not used. 2) In ARMInstructionSelector::selectGlobal() the code is mixing "enumeral and non-enumeral type in conditional expression". Solve this by casting to unsigned which is the final type anyway. Differential Revision: https://reviews.llvm.org/D58834 llvm-svn: 355304
* [DebugInfo] Construct nested types on behalf of owner CUEugene Leviant2019-03-042-22/+31
| | | | | | Differential revision: https://reviews.llvm.org/D58786 llvm-svn: 355303
* [llvm] [Support] Revert "Reimplement getMainExecutable() using sysctl on NetBSD"Michal Gorny2019-03-041-18/+2
| | | | | | | | | This apparently does not work reliably after all (non-reentrant?) and causes test failures such as: http://lab.llvm.org:8011/builders/netbsd-amd64/builds/19254/steps/run%20unit%20tests/logs/FAIL%3A%20libc%2B%2B%3A%3Asize.pass.cpp llvm-svn: 355302
* [InstCombine] Mark debug values as unavailable after DCE.Davide Italiano2019-03-041-1/+2
| | | | | | Fixes PR40838. llvm-svn: 355301
OpenPOWER on IntegriCloud