Commit message (Author, Age, Files, Lines)
...
* [WebAssembly] Set threadmodel during LTO (Sam Clegg, 2018-07-02, 2 files, -1/+18)
| Subscribers: dschuff, mehdi_amini, inglorion, jgravelle-google, aheejin, sunfish, steven_wu, llvm-commits
| Differential Revision: https://reviews.llvm.org/D48689
| llvm-svn: 336118
* Revert "[Dominators] Add the DomTreeUpdater class" (Jakub Kuderski, 2018-07-02, 6 files, -1450/+0)
| Temporary revert because of a failing test on some buildbots.
| This reverts commit r336114.
| llvm-svn: 336117
* [WebAssembly] Convert remaining tests from elf to wasm output format (Sam Clegg, 2018-07-02, 3 files, -11/+12)
| Differential Revision: https://reviews.llvm.org/D48748
| llvm-svn: 336116
* Follow up of r335953 - [ARM][AArch64] Armv8.4-A Enablement (Sjoerd Meijer, 2018-07-02, 2 files, -2/+3)
| Imply dotprod for armv8.4-a, because it is mandatory from v8.4.
| llvm-svn: 336115
* [Dominators] Add the DomTreeUpdater class (Jakub Kuderski, 2018-07-02, 6 files, -0/+1450)
| Summary:
| This patch is the first in a series of patches related to the "RFC - A new dominator tree updater for LLVM" (http://lists.llvm.org/pipermail/llvm-dev/2018-June/123883.html).
|
| This patch introduces the DomTreeUpdater class, which provides a cleaner API to perform updates on available dominator trees (none, only DomTree, only PostDomTree, both) using different update strategies (eagerly or lazily) to simplify the updating process.
|
| -- Prior to the patch --
| - Directly calling update functions of DominatorTree updates the data structure eagerly while DeferredDominance does updates lazily.
| - DeferredDominance class cannot be used when a PostDominatorTree also needs to be updated.
| - Functions receiving DT/DDT need to branch a lot which is currently necessary.
| - Functions using both DomTree and PostDomTree need to call the update function separately on both trees.
| - People need to construct an additional DeferredDominance class to use functions only receiving DDT.
|
| -- After the patch --
|
| Patch by Chijun Sima <simachijun@gmail.com>.
|
| Reviewers: kuhar, brzycki, dmgreen, grosser, davide
| Reviewed By: kuhar, brzycki
| Subscribers: vsk, mgorny, llvm-commits
| Author: NutshellySima
| Differential Revision: https://reviews.llvm.org/D48383
| llvm-svn: 336114
* [X86][SSE] Blend any v8i16/v4i32 shift with 2 shift unique values (Simon Pilgrim, 2018-07-02, 2 files, -69/+34)
| We were only doing this for basic blends, despite shuffle lowering now being good enough to handle more complex blends.
| This means that the two v8i16 splat shifts are performed in parallel instead of serially as the general shift case.
| llvm-svn: 336113
* [X86][SSE] Add v8i16 shift test for 2 shift values that doesn't match basic blend (Simon Pilgrim, 2018-07-02, 1 file, -0/+32)
| We have special case support for 2 shift values for basic blends, but irregular shift patterns end up using the generic lowering, despite shuffle lowering being good enough to handle more complex blends.
| llvm-svn: 336112
* [ValueTracking] allow undef elements when matching vector abs (Sanjay Patel, 2018-07-02, 2 files, -36/+31)
| llvm-svn: 336111
* Disable failing test on x86_64-pc-windows-gnu, see PR38006. (Yaron Keren, 2018-07-02, 1 file, -1/+1)
| llvm-svn: 336110
* [CodeGen] Make block removal order deterministic in CodeGenPrepare (David Stenberg, 2018-07-02, 1 file, -3/+5)
| Summary:
| Replace use of a SmallPtrSet with a SmallSetVector to make the worklist iteration order deterministic. This is done as the order in which the blocks are removed may affect whether or not PHI nodes in successor blocks are removed.
|
| For example, consider the following case where %bb1 and %bb2 are removed:
|
|   bb1:
|     br i1 undef, label %bb3, label %bb4
|   bb2:
|     br i1 undef, label %bb4, label %bb3
|   bb3:
|     pv1 = phi type [ undef, %bb1 ], [ undef, %bb2 ], [ v0, %other ]
|     br label %bb4
|   bb4:
|     pv2 = phi type [ undef, %bb1 ], [ undef, %bb2 ], [ pv1, %bb3 ], [ v0, %other ]
|
| If %bb2 is removed before %bb1, the incoming values from %bb1 and %bb2 to pv1 will be removed before %bb1 is removed as a predecessor to %bb4. The pv1 node will thus be optimized out (to v0) at the time %bb1 is removed as a predecessor to %bb4, leaving the blocks as follows when the incoming value from %bb1 has been removed:
|
|   bb3:
|     ; pv1 optimized out, incoming value to pv2 is v0
|     br label %bb4
|   bb4:
|     pv2 = phi type [ v0, %bb3 ], [ v0, %other ]
|
| The pv2 PHI node will be optimized away by removePredecessor() as all incoming values are identical.
|
| In case %bb2 is removed after %bb1, pv1 will not be optimized out at the time %bb2 is removed as a predecessor to %bb4, leaving the blocks as follows when the incoming value from %bb2 to pv2 has been removed:
|
|   bb3:
|     pv1 = phi type [ undef, %bb2 ], [ v0, %other ]
|     br label %bb4
|   bb4:
|     pv2 = phi type [ pv1, %bb3 ], [ v0, %other ]
|
| The pv2 PHI node will thus not be removed in this case, ultimately leading to the following output:
|
|   bb3:
|     ; pv1 optimized out, incoming value to pv2 is v0
|     br label %bb4
|   bb4:
|     pv2 = phi type [ v0, %bb3 ], [ v0, %other ]
|
| I have not looked into changing DeleteDeadBlock() so that the redundant PHI nodes are removed.
|
| I have not added a test case, as I was not able to create a particularly small (and not messy) reproducer. This is likely due to SmallPtrSet behaving deterministically when in small mode.
|
| Reviewers: void, dexonsmith, spatel, skatkov, fhahn, bkramer, nhaehnle
| Reviewed By: fhahn
| Subscribers: mgrang, llvm-commits
| Differential Revision: https://reviews.llvm.org/D48369
| llvm-svn: 336109
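
The determinism point in r336109 can be illustrated with a small standalone sketch. It does not use LLVM's SmallPtrSet or SmallSetVector; `OrderedSet` and `visitOrder` are hypothetical names, and the class only mimics the general idea of a set that iterates in insertion order (a vector for order plus a hash set for uniqueness), assuming that is the property the patch relies on.

```cpp
#include <cstddef>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for an insertion-ordered set (not the LLVM class):
// the vector records the order, the hash set rejects duplicates.
template <typename T>
class OrderedSet {
public:
  // Returns true if the value was newly inserted.
  bool insert(const T &Value) {
    if (!Seen.insert(Value).second)
      return false;
    Order.push_back(Value);
    return true;
  }
  typename std::vector<T>::const_iterator begin() const { return Order.begin(); }
  typename std::vector<T>::const_iterator end() const { return Order.end(); }
  std::size_t size() const { return Order.size(); }

private:
  std::vector<T> Order;
  std::unordered_set<T> Seen;
};

// Builds a worklist from block ids and returns the order in which a
// removal loop would visit them: always first-insertion order.
static std::vector<int> visitOrder(const std::vector<int> &Blocks) {
  OrderedSet<int> Worklist;
  for (int B : Blocks)
    Worklist.insert(B);
  return std::vector<int>(Worklist.begin(), Worklist.end());
}
```

Iterating a pointer-keyed hash set directly would instead visit elements in an order that can depend on the pointer values, which is the kind of run-to-run variation the commit message describes.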
* Test commit access (Balazs Keri, 2018-07-02, 1 file, -1/+1)
| llvm-svn: 336108
* [X86] Fix test/MC/AsmParser/exprs-invalid.s after rL336104 (Alex Bradbury, 2018-07-02, 1 file, -1/+1)
| This was my mistake for only running test/MC/X86 and test/CodeGen/X86.
| Arguably .word should be removed from this test, as it is not supported universally.
| llvm-svn: 336107
* [ELF] - Cleanup error reporting code and cover with the test. NFC. (George Rimar, 2018-07-02, 2 files, -2/+21)
| We have the following code that is not covered by any test:
| https://github.com/llvm-mirror/lld/blob/master/ELF/Target.cpp#L95
|
| This patch:
| 1) Removes the "!IS" check, because at that point of execution (we are resolving the relocations while writing output) the vector should only contain sections of type InputSection (we already converted MergeInputSection in mergeSections() and combined EhInputSections in combineEhFrameSections()).
| 2) Covers the "!IS->getParent()" check with the test.
| llvm-svn: 336106
* [llvm-exegesis] Change how the native architecture is determined (John Brawn, 2018-07-02, 2 files, -2/+3)
| Currently the llvm-exegesis native architecture is determined by comparing the llvm native architecture with X86, so to add a new target would mean adding a new check. Change this to building up a list of the targets llvm-exegesis supports then using that, as this means that when adding a new target you just add the target to the list of supported targets.
| Differential Revision: https://reviews.llvm.org/D48778
| llvm-svn: 336105
* [X86] Use addAliasForDirective to support the .word directive (reland) (Alex Bradbury, 2018-07-02, 1 file, -25/+3)
| The X86 asm parser currently has custom parsing logic for .word. Rather than use this custom logic, we can just use addAliasForDirective to enable the reuse of AsmParser::parseDirectiveValue.
| See also similar changes to Sparc (rL333078), AArch64 (rL333077), and Hexagon (rL332607) backends.
| Differential Revision: https://reviews.llvm.org/D47004
| This is a fixed reland of rL336100. This should have been caught in pre-commit testing so apologies for the noise.
| llvm-svn: 336104
* Revert r336100 (Alex Bradbury, 2018-07-02, 1 file, -3/+25)
| This was a bad change: .word is 2 bytes on x86.
| llvm-svn: 336103
* [SLPVectorizer] Remove nullptr early-outs from Instruction::ShuffleVector getEntryCost (Simon Pilgrim, 2018-07-02, 1 file, -6/+0)
| This code is only used by alternate opcodes so the InstructionsState has already confirmed that every Value is an Instruction, plus we use cast<Instruction> which will assert on failure.
| llvm-svn: 336102
* [InstCombine] adjust shuffle tests with IR flags; NFC (Sanjay Patel, 2018-07-02, 1 file, -4/+3)
| Due to current limitations in constant analysis, we need flags on add or mul to show propagation for the potential transform suggested in these tests (no other binops currently report identity constants).
| llvm-svn: 336101
* [X86] Use addAliasForDirective to support the .word directive (Alex Bradbury, 2018-07-02, 1 file, -25/+3)
| The X86 asm parser currently has custom parsing logic for .word. Rather than use this custom logic, we can just use addAliasForDirective to enable the reuse of AsmParser::parseDirectiveValue.
| See also similar changes to Sparc (rL333078), AArch64 (rL333077), and Hexagon (rL332607) backends.
| Differential Revision: https://reviews.llvm.org/D47004
| llvm-svn: 336100
* [llvm-exegesis] Delegate the decision of cycle counter name to the target (John Brawn, 2018-07-02, 3 files, -9/+16)
| Currently the cycle counter is taken from the subtarget schedule model, which isn't any use if the subtarget doesn't have one. Delegate the decision to the target benchmark runner, as it may know better what to do in that case, with the default being the current behaviour.
| Differential Revision: https://reviews.llvm.org/D48779
| llvm-svn: 336099
* Recommit r328307: [IPSCCP] Use constant range information for comparisons of parameters. (Florian Hahn, 2018-07-02, 2 files, -124/+123)
| This version contains a fix to add values for which the state in ParamState changes to the worklist if the state in ValueState did not change. To avoid adding the same value multiple times, mergeInValue returns true if it added the value to the worklist. The value is added to the worklist depending on its state in ValueState.
|
| Original message:
| For comparisons with parameters, we can use the ParamState lattice elements which also provide constant range information. This improves the code for PR33253 further and gets us closer to using ValueLatticeElement for all values.
| Also, as we are using the range information in the solver directly, we do not need tryToReplaceWithConstantRange afterwards anymore.
|
| Reviewers: dberlin, mssimpso, davide, efriedma
| Reviewed By: mssimpso
| Differential Revision: https://reviews.llvm.org/D43762
| llvm-svn: 336098
* [ms] Fix mangling of char16_t and char32_t to be compatible with MSVC. (Nico Weber, 2018-07-02, 2 files, -10/+36)
| MSVC limits char16_t and char32_t string literal names to 32 bytes of character data, not to 32 characters. wchar_t string literal names on the other hand can get up to 64 bytes of character data.
| https://reviews.llvm.org/D48781
| llvm-svn: 336097
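
The difference between a byte budget and a character budget in r336097 can be made concrete with a little arithmetic. The sketch below is not clang's mangler; `LiteralKind`, `byteBudget`, `unitWidth` and `maxCodeUnits` are hypothetical names, the byte budgets are the ones quoted in the commit message, and the code-unit widths (2 bytes for char16_t and wchar_t, 4 bytes for char32_t) are an assumption about a Windows target.

```cpp
#include <cstddef>

// Hypothetical literal kinds, limited to the ones the commit message names.
enum class LiteralKind { Char16, Char32, Wide };

// Byte budget as quoted in the commit message: 64 for wchar_t, 32 otherwise.
constexpr std::size_t byteBudget(LiteralKind K) {
  return K == LiteralKind::Wide ? 64 : 32;
}

// Assumed code-unit width in bytes on a Windows target.
constexpr std::size_t unitWidth(LiteralKind K) {
  return K == LiteralKind::Char32 ? 4 : 2;
}

// Number of code units that fit into the byte budget.
constexpr std::size_t maxCodeUnits(LiteralKind K) {
  return byteBudget(K) / unitWidth(K);
}
```

Under these assumptions a char32_t literal name would cover far fewer characters than a fixed 32-character limit would suggest.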
* [InstCombine] add tests for shuffle-binop; NFC (Sanjay Patel, 2018-07-02, 1 file, -37/+256)
| This is another pattern mentioned in PR37806.
| llvm-svn: 336096
* [SLPVectorizer] Fix alternate opcode + shuffle cost function to correctly handle SK_Select patterns. (Simon Pilgrim, 2018-07-02, 2 files, -7/+27)
| We were always using the opcodes of the first 2 scalars for the costs of the alternate opcode + shuffle. This made sense when we used SK_Alternate and opcodes were guaranteed to be alternating, but this fails for the more general SK_Select case.
| This fix exposes an issue demonstrated by the fmul_fdiv_v4f32_const test - the SLM model has v4f32 fdiv costs which are more than twice those of the f32 scalar cost, meaning that the cost model determines that the vectorization is not performant. Unfortunately it completely ignores the fact that the fdiv by a constant will be changed into a fmul by InstCombine for a much lower cost vectorization. But at least we're seeing this now...
| llvm-svn: 336095
* [clangd] ClangdServer::codeComplete return CodeCompleteResult, not LSP struct. (Sam McCall, 2018-07-02, 9 files, -174/+207)
| Summary:
| This provides more structured information that embedders can use for rendering. ClangdLSPServer continues to call render(), so NFC.
|
| The patch is:
| - trivial changes to ClangdServer/ClangdLSPServer
| - mostly-mechanical updates to CodeCompleteTests etc for the new API
| - new direct tests of render() in CodeCompleteTests
| - tiny cleanups to CodeCompletionItem (operator<< and missing initializers)
|
| Reviewers: ioeric
| Subscribers: ilya-biryukov, MaskRay, jkorous, cfe-commits
| Differential Revision: https://reviews.llvm.org/D48821
| llvm-svn: 336094
* [ELF] - Remove dead code. NFC. (George Rimar, 2018-07-02, 1 file, -1/+0)
| It duplicated the default implementation.
| llvm-svn: 336093
* [SLPVectorizer] Only Alternate opcodes use ShuffleVector cases for getEntryCost/vectorizeTree. NFCI. (Simon Pilgrim, 2018-07-02, 1 file, -1/+5)
| Add assertions - we're already assuming this in how we use the AltOpcode and treat everything as BinaryOperators.
| llvm-svn: 336092
* [AArch64][SVE] Asm: Support for (SQ)INCP/DECP (scalar, vector) (Sander de Smalen, 2018-07-02, 13 files, -0/+763)
| Increments/decrements the result with the number of active bits from the predicate.
|
| The inc/dec variants added are:
| - incp x0, p0.h (scalar)
| - incp z0.h, p0 (vector)
|
| The unsigned saturating inc/dec variants added are:
| - uqincp x0, p0.h (scalar)
| - uqincp w0, p0.h (scalar, 32bit)
| - uqincp z0.h, p0 (vector)
|
| The signed saturating inc/dec variants added are:
| - sqincp x0, p0.h (scalar)
| - sqincp x0, p0.h, w0 (scalar, 32bit)
| - sqincp z0.h, p0 (vector)
| llvm-svn: 336091
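
To make "increment by the number of active predicate bits, with saturation" in r336091 concrete, here is a deliberately simplified scalar model. It is not the architectural definition of these instructions and is not taken from the patch; `activeCount`, `uqincModel` and `sqincModel` are hypothetical names, the predicate is modelled as a plain vector of booleans, and only the 64-bit scalar case is sketched.

```cpp
#include <cstdint>
#include <limits>
#include <vector>

// Counts the active elements of a (simplified) predicate.
static std::uint64_t activeCount(const std::vector<bool> &Pred) {
  std::uint64_t N = 0;
  for (bool Active : Pred)
    N += Active ? 1 : 0;
  return N;
}

// Unsigned saturating increment: clamps at the maximum instead of wrapping.
static std::uint64_t uqincModel(std::uint64_t X, const std::vector<bool> &Pred) {
  const std::uint64_t Max = std::numeric_limits<std::uint64_t>::max();
  const std::uint64_t N = activeCount(Pred);
  return (X > Max - N) ? Max : X + N;
}

// Signed saturating increment by a non-negative count.
static std::int64_t sqincModel(std::int64_t X, const std::vector<bool> &Pred) {
  const std::int64_t Max = std::numeric_limits<std::int64_t>::max();
  const std::int64_t N = static_cast<std::int64_t>(activeCount(Pred));
  return (X > Max - N) ? Max : X + N;
}
```

The comparison against `Max - N` is done before the addition so that the model itself never overflows.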
* [AArch64][SVE] Asm: Support for (saturating) vector INC/DEC instructions. (Sander de Smalen, 2018-07-02, 37 files, -0/+750)
| Increment/decrement vector by multiple of predicate constraint element count.
|
| The variants added by this patch are:
| - INCH, INCW, INCD
| and (saturating):
| - SQINCH, SQINCW, SQINCD
| - UQINCH, UQINCW, UQINCD
| - SQDECH, SQDECW, SQDECD
| - UQDECH, UQDECW, UQDECD
|
| For example:
|   incw z0.s, all, mul #4
| llvm-svn: 336090
* [X86][BtVer2] Added Jaguar FPU Pipe0/1 uop counters to permit basic llvm-exegesis uop testing (Simon Pilgrim, 2018-07-02, 1 file, -0/+2)
| We don't have PMCs to cover many of the Jaguar resources but we can at least monitor the FPU issue pipes which give an indication of the fpu uop count, just not the execution resources.
| llvm-svn: 336089
* [OMPT] Use alloca() to force availability of frame pointer (Joachim Protze, 2018-07-02, 1 file, -0/+4)
| When compiling with icc, there is a problem with reenter frame addresses in parallel_begin callbacks in the interoperability.c testcase. (The address is not available, thus NULL.)
| Using alloca() forces availability of the frame pointer.
| Patch provided by Simon Convent
| Differential Revision: https://reviews.llvm.org/D48282
| llvm-svn: 336088
* [OMPT] Add tests for runtime entry points from non-OpenMP threads (Joachim Protze, 2018-07-02, 1 file, -12/+53)
| Several runtime entry points have not been tested from non-OpenMP threads. This adds tests to an existing testcase. While at it, the testcase was reformatted.
| Patch provided by Simon Convent
| Differential Revision: https://reviews.llvm.org/D48124
| llvm-svn: 336087
* [OMPT] Add testcases for thread_begin and thread_end callbacks (Joachim Protze, 2018-07-02, 2 files, -0/+72)
| Especially the thread_end callback has not been tested before. This adds a testcase for nested and non-nested threads.
| Patch provided by Simon Convent
| Differential Revision: https://reviews.llvm.org/D47824
| llvm-svn: 336086
* [OMPT] Provide the right thread_num for ancestor levels (Joachim Protze, 2018-07-02, 2 files, -3/+371)
| The current implementation always provides the thread-num for the current parallel region. This patch fixes the behavior for ancestor levels >0.
| Differential Revision: https://reviews.llvm.org/D46533
| llvm-svn: 336085
* [Mips][FastISel] Do not duplicate condition while lowering branches (Petar Jovanovic, 2018-07-02, 2 files, -4/+29)
| This change fixes the issue that arises when we duplicate condition from the predecessor block. If the condition's arguments are not considered alive across the blocks, fast regalloc gets confused and starts generating reloads from the slots that have never been spilled to. This change also leads to smaller code given that, unlike on architectures with condition codes, on Mips we can branch directly on register value, thus we gain nothing by duplication.
| Patch by Dragan Mladjenovic.
| Differential Revision: https://reviews.llvm.org/D48642
| llvm-svn: 336084
* Fix for r336080: Missing colon in REQUIRES line (Philip Pfaffe, 2018-07-02, 1 file, -1/+1)
| llvm-svn: 336083
* [ELF] - Change dyn_cast to cast. NFC. (George Rimar, 2018-07-02, 1 file, -1/+1)
| This is followup for r335958.
| Thanks to Rui for noticing.
| llvm-svn: 336082
* [AArch64][SVE] Asm: Support for vector element compares (immediate). (Sander de Smalen, 2018-07-02, 22 files, -0/+704)
| Compare vector elements with a signed/unsigned immediate, e.g.
|   cmpgt p0.s, p0/z, z0.s, #-16
|   cmphi p0.s, p0/z, z0.s, #127
| llvm-svn: 336081
* [polly-acc] change cl_get_* return types to 32/64bit (Philip Pfaffe, 2018-07-02, 2 files, -9/+107)
| Summary:
| This patch changes the return types for ocl_get_* functions during SPIR code generation. Because these functions return size_t types, the return type needs to be changed to the actual size of size_t on the device.
|
| Based on work by Michal Babej and Pekka Jääskeläinen
| Patch by: Alain Denzler
|
| Reviewers: grosser, philip.pfaffe, bollu
| Reviewed By: grosser, philip.pfaffe
| Subscribers: nemanjai, kbarton, llvm-commits
| Differential Revision: https://reviews.llvm.org/D48774
| llvm-svn: 336080
* Reapply r334980 and r334983. (Sander de Smalen, 2018-07-02, 48 files, -110/+1126)
| These patches were previously reverted as they led to buildbot time-outs caused by large switch statement in printAliasInstr when using UBSan and O3. The issue has been addressed with a workaround (r335525).
| llvm-svn: 336079
* [NFC] Test that shows unprofitability of instcombine with bit ranges (Max Kazantsev, 2018-07-02, 1 file, -0/+32)
| llvm-svn: 336078
* [X86] Put some cases in switch statements back on one line to be more compact and make it easier to see the similarities. NFC (Craig Topper, 2018-07-02, 1 file, -566/+186)
| It looks like someone ran clang-format over this entire file which reformatted these switches into a multiline form. But I think the single line form is more useful here.
| llvm-svn: 336077
* [llvm-exegesis][NFC] Cleanup useless braces. (Clement Courbet, 2018-07-02, 1 file, -16/+8)
| llvm-svn: 336076
* [X86] Remove FMA3Info DenseMap. Break into sorted tables that we can binary search. (Craig Topper, 2018-07-02, 4 files, -234/+149)
| I separated out the rounding and broadcast groups into their own tables because it made the ordering in the main table easier. Further splitting of the tables might make it possible to directly index using bits from the TSFlags, but it's probably not worth it right now.
| llvm-svn: 336075
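
The general technique named in r336075, replacing a hash map with a sorted static table that is binary searched, can be sketched as follows. This is not the X86 FMA3 table from the patch; `OpcodeGroup`, `Table` and `findGroup` are hypothetical, and the opcode and group numbers are made up for illustration.

```cpp
#include <algorithm>
#include <cstdint>
#include <iterator>

// Hypothetical table entry; the values below are invented.
struct OpcodeGroup {
  std::uint16_t Opcode;  // sort key
  std::uint16_t GroupId; // payload
};

// Must stay sorted by Opcode for the binary search to be valid.
static const OpcodeGroup Table[] = {
    {10, 0}, {14, 0}, {21, 1}, {35, 1}, {48, 2},
};

// Returns the matching entry, or nullptr if the opcode is not in the table.
static const OpcodeGroup *findGroup(std::uint16_t Opcode) {
  const OpcodeGroup *End = std::end(Table);
  const OpcodeGroup *I = std::lower_bound(
      std::begin(Table), End, Opcode,
      [](const OpcodeGroup &E, std::uint16_t Key) { return E.Opcode < Key; });
  return (I != End && I->Opcode == Opcode) ? I : nullptr;
}
```

A table like this needs no dynamic initialization and lookups cost O(log n) comparisons; the trade-off is that the sort order has to be maintained by hand or checked with an assertion.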
* [PowerPC] Don't make it as pre-inc candidate if displacement isn't 4's multiple for i64 pre-inc load/store (QingShan Zhang, 2018-07-02, 2 files, -0/+137)
| For the case below, pre-inc prep thinks it's a good candidate to use pre-inc for the bucket, but the 64bit integer load/store update (pre-inc) instruction on Power requires the displacement field to be DS-form (4's multiple). Since it can't satisfy the constraint, we have to do some fix ups later. As shown below, the original load/stores could be well-formed, so this makes things worse.
|
|   unsigned long long result = 0;
|   unsigned long long foo(char *p, unsigned long long n) {
|     for (unsigned long long i = 0; i < n; i++) {
|       unsigned long long x1 = *(unsigned long long *)(p - 50000 + i);
|       unsigned long long x2 = *(unsigned long long *)(p - 61024 + i);
|       unsigned long long x3 = *(unsigned long long *)(p - 62048 + i);
|       unsigned long long x4 = *(unsigned long long *)(p - 64096 + i);
|       result *= x1 * x2 * x3 * x4;
|     }
|     return result;
|   }
|
| Patch by jedilyn (Kewen Lin).
| Differential Revision: https://reviews.llvm.org/D48813
| --This line, and those below, will be ignored--
| M lib/Target/PowerPC/PPCLoopPreIncPrep.cpp
| A test/CodeGen/PowerPC/preincprep-i64-check.ll
| llvm-svn: 336074
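
The constraint r336074 relies on, that a DS-form displacement must be a multiple of 4, reduces to a one-line check. The helper below is a hypothetical illustration and not the code in PPCLoopPreIncPrep.cpp; it deliberately ignores any range limit on the displacement field and tests only the alignment property stated in the commit message.

```cpp
#include <cstdint>

// True when the displacement is a multiple of 4, as the commit message
// says DS-form requires. The name is hypothetical.
constexpr bool isDSFormDisplacement(std::int64_t Disp) {
  return Disp % 4 == 0;
}
```

A candidate whose displacement fails this check would need fix-up code later, which is why the commit declines to treat it as a pre-inc candidate.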
* Implement strip.invariant.group (Piotr Padlewski, 2018-07-02, 21 files, -48/+296)
| Summary:
| This patch introduces a new intrinsic, strip.invariant.group, that was described in the RFC: Devirtualization v2.
|
| Reviewers: rsmith, hfinkel, nlopes, sanjoy, amharc, kuhar
| Subscribers: arsenm, nhaehnle, JDevlieghere, hiraditya, xbolva00, llvm-commits
| Differential Revision: https://reviews.llvm.org/D47103
| Co-authored-by: Krzysztof Pszeniczny <krzysztof.pszeniczny@gmail.com>
| llvm-svn: 336073
* Add an entry for rodata constant merge sections to the default section flags in the ELF assembler. (Eric Christopher, 2018-07-02, 2 files, -2/+7)
| This matches the defaults given in the rest of MC.
| Fixes PR37997 where we couldn't assemble our own assembly output without warnings.
| llvm-svn: 336072
* [X86] Fix a few test names in avx512-intrinsics-fast-isel.ll to match their clang intrinsic names. (Craig Topper, 2018-07-01, 1 file, -8/+8)
| I thought I fixed these yesterday, but I guess I missed a few.
| llvm-svn: 336071
* [X86] Remove the places that return nullptr from X86InstrInfo::commuteInstructionImpl. (Craig Topper, 2018-07-01, 1 file, -44/+10)
| findCommutedOpIndices does the pre-checking for whether commuting is possible. There should be no reason left to fail in commuteInstructionImpl. There was a missing pre-check that I've added there and changed the check to an assert in commuteInstructionImpl.
| llvm-svn: 336070
* [SLPVectorizer] Call InstructionsState.isOpcodeOrAlt with Instruction instead of an opcode. NFCI. (Simon Pilgrim, 2018-07-01, 1 file, -11/+9)
| llvm-svn: 336069