summaryrefslogtreecommitdiffstats
path: root/llvm/lib
Commit message (Collapse)AuthorAgeFilesLines
...
* [DAGCombiner] Avoid creating large tokenfactors in visitTokenFactorFlorian Hahn2019-05-071-2/+3
| | | | | | | | | | | | | | | | | | When simplifying TokenFactors, we potentially iterate over all operands of a large number of TokenFactors. This causes quadratic compile times in some cases and the large token factors cause additional scalability problems elsewhere. This patch adds some limits to the number of nodes explored for the cases mentioned above. Reviewers: niravd, spatel, craig.topper Reviewed By: niravd Differential Revision: https://reviews.llvm.org/D61397 llvm-svn: 360171
* Avoid use-after-move warnings by using swap instead. NFCI.Simon Pilgrim2019-05-072-3/+7
| | | | | | Swap should be as quick in these cases, and leaves the original variables in a known (empty) state. llvm-svn: 360164
* [DebugInfo@O2][LoopVectorize] pr39024: Vectorized code linenos step through ↵Orlando Cazalet-Hyams2019-05-072-7/+21
| | | | | | | | | | | | | | | | | | | | | | | | | | loop even after completion Summary: Bug: https://bugs.llvm.org/show_bug.cgi?id=39024 The bug reports that a vectorized loop is stepped through 4 times and each step through the loop seemed to show a different path. I found two problems here: A) An incorrect line number on a preheader block (for.body.preheader) instruction causes a step into the loop before it begins. B) Instructions in the middle block have different line numbers which give the impression of another iteration. In this patch I give all of the middle block instructions the line number of the scalar loop latch terminator branch. This seems to provide the smoothest debugging experience because the vectorized loops will always end on this line before dropping into the scalar loop. To solve problem A I have altered llvm::SplitBlockPredecessors to accommodate loop header blocks. Reviewers: samsonov, vsk, aprantl, probinson, anemet, hfinkel Reviewed By: hfinkel Subscribers: bjope, jmellorcrummey, hfinkel, gbedwell, hiraditya, zzheng, llvm-commits Tags: #llvm, #debug-info Differential Revision: https://reviews.llvm.org/D60831 llvm-svn: 360162
* [SCEV] Add explicit representations of umin/sminKeno Fischer2019-05-072-213/+213
| | | | | | | | | | | | | | | | | | Summary: Currently we express umin as `~umax(~x, ~y)`. However, this becomes a problem for operands in non-integral pointer spaces, because `~x` is not something we can compute for `x` non-integral. However, since comparisons are generally still allowed, we are actually able to express `umin(x, y)` directly as long as we don't try to express is as a umax. Support this by adding an explicit umin/smin representation to SCEV. We do this by factoring the existing getUMax/getSMax functions into a new function that does all four. The previous two functions were largely identical. Reviewed By: sanjoy Differential Revision: https://reviews.llvm.org/D50167 llvm-svn: 360159
* Fix local shadow variable warning. NFCI.Simon Pilgrim2019-05-071-2/+2
| | | | llvm-svn: 360157
* [PowerPC] Use the two-constant NR algorithm for refining estimatesNemanja Ivanovic2019-05-074-2/+10
| | | | | | | | | | | | The single-constant algorithm produces infinities on a lot of denormal values. The precision of the two-constant algorithm is actually sufficient across the range of denormals. We will switch to that algorithm for now to avoid the infinities on denormals. In the future, we will re-evaluate the algorithm to find the optimal one for PowerPC. Differential revision: https://reviews.llvm.org/D60037 llvm-svn: 360144
* [yaml2obj] - Allow setting st_value explicitly for Symbol.George Rimar2019-05-071-4/+5
| | | | | | | | | | In some cases it is useful to explicitly set symbol's st_name value. For example, I am using it in a patch for LLD to remove the broken binary from a test case and replace it with a YAML test. Differential revision: https://reviews.llvm.org/D61180 llvm-svn: 360137
* [ARM GlobalISel] Widen G_SELECT operandsDiana Picus2019-05-071-2/+3
| | | | | | ...except for the condition operand. llvm-svn: 360135
* [X86][AVX] Fold concat(packus(),packus()) -> packus(concat(),concat()) (PR34773)Simon Pilgrim2019-05-071-0/+24
| | | | | | Basic "revectorization" combine, we can probably do more opcodes here but it can be a tricky cost-benefit depending on where the subvectors came from - but this case helps shuffle combining. llvm-svn: 360134
* Fixed "Value stored to 'Opc' is never read" warning. NFCI.Simon Pilgrim2019-05-071-1/+1
| | | | llvm-svn: 360133
* [X86] Reduce scope of variables where possible. NFCI.Simon Pilgrim2019-05-073-10/+4
| | | | | | Fixes cppcheck warnings. llvm-svn: 360131
* [ARM GlobalISel] Widen G_INTTOPTR/G_PTRTOINTDiana Picus2019-05-071-2/+6
| | | | | | | We actually have a couple of G_PTRTOINT to s8 when building clang, so we should do something about them. llvm-svn: 360130
* Fix uninitialized variable warning. NFCI.Simon Pilgrim2019-05-071-1/+1
| | | | | | This also fixes a scan-build "array subscript is undefined" warning. llvm-svn: 360128
* [ARM GlobalISel] Widen G_GEP index operandDiana Picus2019-05-071-1/+3
| | | | llvm-svn: 360127
* Test commit accessOrlando Cazalet-Hyams2019-05-071-2/+2
| | | | llvm-svn: 360125
* AMDGPU: Verify that SOP2/SOPC instructions have at most one immediate operandNicolai Haehnle2019-05-071-0/+18
| | | | | | | | | | | | | | | | | | Summary: No test case because I don't know of a way to trigger this, but I accidentally caused this to fail while working on a different change. Change-Id: I8015aa447fe27163cc4e4902205a203bd44bf7e3 Reviewers: arsenm, rampitec Subscribers: kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61490 llvm-svn: 360123
* [FastISel][X86] If selectFNeg fails, fall back to SelectionDAG not treating ↵Craig Topper2019-05-071-8/+9
| | | | | | | | | | | | | | | | | | | | | it as an fsub. Summary: If fneg lowering for fsub -0.0, x fails we currently fall back to treating it as an fsub. This has different behavior for nans than the xor with sign bit trick we normally try to do. On X86, the xor trick for double fails fast-isel in 32-bit mode with sse2 due to 64 bit integer types not being available. With -O2 we would always use an xorpd for this case. If we use subsd, this creates an observable behavior difference between -O0 and -O2. So fall back to SelectionDAG if we can't fast-isel it, that way SelectionDAG will use the xorpd. I believe this patch is restoring the behavior prior to r345295 from last October. This was missed then because our fast isel case in 32-bit mode aborted fast-isel earlier for another reason. But I've added new tests to cover that. Reviewers: andrew.w.kaylor, cameron.mcinally, spatel, efriedma Reviewed By: cameron.mcinally Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61622 llvm-svn: 360111
* [WebAssembly] Add more test coverage for reloctions against section symbolsSam Clegg2019-05-071-1/+2
| | | | | | | | | | | | | The only known user of this relocation type and symbol type is the debug info sections, but we were not testing the `--relocatable` output path. This change adds a minimal test case to cover relocations against section symbols includes `--relocatable` output. Differential Revision: https://reviews.llvm.org/D61623 llvm-svn: 360110
* [DebugInfo] Delete TypedDINodeRefFangrui Song2019-05-0713-133/+101
| | | | | | | | | | | | | TypedDINodeRef<T> is a redundant wrapper of Metadata * that is actually a T *. Accordingly, change DI{Node,Scope,Type}Ref uses to DI{Node,Scope,Type} * or their const variants. This allows us to delete many resolve() calls that clutter the code. Reviewed By: rnk Differential Revision: https://reviews.llvm.org/D61369 llvm-svn: 360108
* [SanitizerCoverage] Use different module ctor names for trace-pc-guard and ↵Fangrui Song2019-05-071-8/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | inline-8bit-counters Fixes the main issue in PR41693 When both modes are used, two functions are created: `sancov.module_ctor`, `sancov.module_ctor.$LastUnique`, where $LastUnique is the current LastUnique counter that may be different in another module. `sancov.module_ctor.$LastUnique` belongs to the comdat group of the same name (due to the non-null third field of the ctor in llvm.global_ctors). COMDAT group section [ 9] `.group' [sancov.module_ctor] contains 6 sections: [Index] Name [ 10] .text.sancov.module_ctor [ 11] .rela.text.sancov.module_ctor [ 12] .text.sancov.module_ctor.6 [ 13] .rela.text.sancov.module_ctor.6 [ 23] .init_array.2 [ 24] .rela.init_array.2 # 2 problems: # 1) If sancov.module_ctor in this module is discarded, this group # has a relocation to a discarded section. ld.bfd and gold will # error. (Another issue: it is silently accepted by lld) # 2) The comdat group has an unstable name that may be different in # another translation unit. Even if the linker allows the dangling relocation # (with --noinhibit-exec), there will be many undesired .init_array entries COMDAT group section [ 25] `.group' [sancov.module_ctor.6] contains 2 sections: [Index] Name [ 26] .init_array.2 [ 27] .rela.init_array.2 By using different module ctor names, the associated comdat group names will also be different and thus stable across modules. Reviewed By: morehouse, phosek Differential Revision: https://reviews.llvm.org/D61510 llvm-svn: 360107
* [X86] Use extended vector register classes in getRegForInlineAsmConstraint ↵Craig Topper2019-05-061-6/+6
| | | | | | | | | | | | | | to support x/y/zmm16-31 when the type is mismatched. The FR32/FR64/VR128/VR256 register classes don't contain the upper 16 registers. For most cases we use the default implementation which will find any register class that contains the register in question if the VT is legal for the register class. But if the VT is i32 or i64, we won't find a matching register class and will instead up in the code modified in this patch. If the requested register is x/y/zmm16-31 we weren't returning a register class that contains those registers and will hit an assertion in the caller. To fix this, I've changed to use the extended register class instead. I don't believe we need a subtarget check to see if avx512 is enabled. The default implementation just pick whatever register class it finds first. I checked and we currently pick FR32X for XMM0 with an f32 type using the default implementation regardless of whether avx512 is enabled. So I assume its it is ok to do the same for i32. Differential Revision: https://reviews.llvm.org/D61457 llvm-svn: 360102
* Fix bug in getCompleteTypeIndex in codeview debug infoAmy Huang2019-05-061-4/+7
| | | | | | | | | | | | | | | | Summary: When there are multiple instances of a forward decl record type, only the first one is emitted with a type index, because the type is added to a map with a null type index. Avoid this by reordering so that forward decl types aren't added to the map. Reviewers: rnk Subscribers: aprantl, hiraditya, arphaman, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61460 llvm-svn: 360101
* [ARM] Glue register copies to tail calls.Eli Friedman2019-05-061-26/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | This generally follows what other targets do. I don't completely understand why the special case for tail calls existed in the first place; even when the code was committed in r105413, call lowering didn't work in the way described in the comments. Stack protector lowering breaks if the register copies are not glued to a tail call: we have to insert the stack protector check before the tail call, and we choose the location based on the assumption that all physical register dependencies of a tail call are adjacent to the tail call. (See FindSplitPointForStackProtector.) This is sort of fragile, but I don't see any reason to break that assumption. I'm guessing nobody has seen this before just because it's hard to convince the scheduler to actually schedule the code in a way that breaks; even without the glue, the only computation that could actually be scheduled after the register copies is the computation of the call address, and the scheduler usually prefers to schedule that before the copies anyway. Fixes https://bugs.llvm.org/show_bug.cgi?id=41417 Differential Revision: https://reviews.llvm.org/D60427 llvm-svn: 360099
* [FastISel] Pass the fneg input operand to hasTrivialKill in ↵Craig Topper2019-05-061-1/+1
| | | | | | | | FastISel::selectFNeg. We're trying to calculate the kill flag for OpReg which is the input so we need to pass the input here. llvm-svn: 360097
* [AMDGPU] gfx1010 verifier changesStanislav Mekhanoshin2019-05-061-7/+15
| | | | | | Differential Revision: https://reviews.llvm.org/D61521 llvm-svn: 360095
* [AMDGPU] gfx1010: prefer V_MUL_LO_U32 over V_MUL_LO_I32Stanislav Mekhanoshin2019-05-061-1/+1
| | | | | | | | | GFX10 deprecates v_mul_lo_i32 instruction, so choose u32 form for all targets. Differential Revision: https://reviews.llvm.org/D61525 llvm-svn: 360094
* Fix pr33010, a 2 year old crashing regressionPhilip Reames2019-05-061-0/+4
| | | | | | | | The problem was that we were creating a CMOV64rr <TargetFrameIndex>, <TargetFrameIndex>. The entire point of a TFI is that address code is not generated, so there's no way to legalize/lower this. Instead, simply prevent it's creation. Arguably, we shouldn't be using *Target*FrameIndices in StatepointLowering at all, but that's a much deeper change. llvm-svn: 360090
* [AMDGPU] gfx1010 memory legalizerStanislav Mekhanoshin2019-05-061-1/+262
| | | | | | Differential Revision: https://reviews.llvm.org/D61535 llvm-svn: 360087
* Revert "Re-commit r357452: SimplifyCFG SinkCommonCodeFromPredecessors: Also ↵Jordan Rupprecht2019-05-061-15/+14
| | | | | | | | | | sink function calls without used results (PR41259)" This reverts r357452 (git commit 21eb771dcb5c11d7500fa6ad551c97a921997f05). This was causing strange optimization-related test failures on an internal test. Will followup with more details offline. llvm-svn: 360086
* [X86] Remove the suffix on vcvt[u]si2ss/sd register variants in assembly ↵Craig Topper2019-05-062-51/+80
| | | | | | | | | | | | | | printing. We require d/q suffixes on the memory form of these instructions to disambiguate the memory size. We don't require it on the register forms, but need to support parsing both with and without it. Previously we always printed the d/q suffix on the register forms, but it's redundant and inconsistent with gcc and objdump. After this patch we should support the d/q for parsing, but not print it when its unneeded. llvm-svn: 360085
* [AArch64] Default to SEH exception handling on MinGWMartin Storsjo2019-05-061-3/+1
| | | | | | | | The SEH implementation is pretty mature at this point. Differential Revision: https://reviews.llvm.org/D61590 llvm-svn: 360080
* [InstCombine] sink FP negation of operands through selectSanjay Patel2019-05-061-0/+12
| | | | | | | | | | | | | | We don't always get this: Cond ? -X : -Y --> -(Cond ? X : Y) ...even with the legacy IR form of fneg in the case with extra uses, and we miss matching with the newer 'fneg' instruction because we are expecting binops through the rest of the path. Differential Revision: https://reviews.llvm.org/D61604 llvm-svn: 360075
* Pull out repeated CI->getCalledFunction() calls. NFCI.Simon Pilgrim2019-05-061-2/+2
| | | | llvm-svn: 360070
* [SelectionDAG][X86] Support inline assembly returning an mmx register into a ↵Craig Topper2019-05-061-0/+8
| | | | | | | | | | | | | | | | | | type with fewer than 64 bits. It's possible to use the 'y' mmx constraint with a type narrower than 64-bits. This patch supports this by bitcasting the mmx type to 64-bits and then truncating to the desired type. There are probably other missing type combinations we need to support, but this is the case we have a bug report for. Fixes PR41748. Differential Revision: https://reviews.llvm.org/D61582 llvm-svn: 360069
* [GlobalISel] Handle <1 x T> vector return types properly.Amara Emerson2019-05-061-11/+31
| | | | | | | | | | | | After support for dealing with types that need to be extended in some way was added in r358032 we didn't correctly handle <1 x T> return types. These types don't have a GISel direct representation, instead we just see them as scalars. When we need to pad them into <2 x T> types however we need to use a G_BUILD_VECTOR instead of trying to do a G_CONCAT_VECTOR. This fixes PR41738. llvm-svn: 360068
* Revert r359392 and r358887Craig Topper2019-05-065-72/+82
| | | | | | | | | | | | | | | | | | | | Reverts "[X86] Remove (V)MOV64toSDrr/m and (V)MOVDI2SSrr/m. Use 128-bit result MOVD/MOVQ and COPY_TO_REGCLASS instead" Reverts "[TargetLowering][AMDGPU][X86] Improve SimplifyDemandedBits bitcast handling" Eric Christopher and Jorge Gorbe Moya reported some issues with these patches to me off list. Removing the CodeGenOnly instructions has changed how fneg is handled during fast-isel with sse/sse2. We're now emitting fsub -0.0, x instead moving to the integer domain(in a GPR), xoring the sign bit, and then moving back to xmm. This is because the fast isel table no longer contains an entry for (f32/f64 bitcast (i32/i64)) so the target independent fneg code fails. The use of fsub changes the behavior of nan with respect to -O2 codegen which will always use a pxor. NOTE: We still have a difference with double with -m32 since the move to GPR doesn't work there. I'll file a separate PR for that and add test cases. Since removing the CodeGenOnly instructions was fixing PR41619, I'm reverting r358887 which exposed that PR. Though I wouldn't be surprised if that bug can still be hit independent of that. This should hopefully get Google back to green. I'll work with Simon and other X86 folks to figure out how to move forward again. llvm-svn: 360066
* [InstCombine] reduce code duplication; NFCSanjay Patel2019-05-061-7/+7
| | | | llvm-svn: 360059
* [ConstantRange] Add srem() supportNikita Popov2019-05-061-0/+44
| | | | | | | | | | | | | | | Add support for srem() to ConstantRange so we can use it in LVI. For srem the sign of the result matches the sign of the LHS. For the RHS only the absolute value is important. Apart from that the logic is like urem. Just like for urem this is only an approximate implementation. The tests check a few specific cases and run an exhaustive test for conservative correctness (but not exactness). Differential Revision: https://reviews.llvm.org/D61207 llvm-svn: 360055
* [SDAG][AArch64] Boolean and/or reduce to umax/min reduce (PR41635)Nikita Popov2019-05-061-0/+12
| | | | | | | | | | | This addresses one half of https://bugs.llvm.org/show_bug.cgi?id=41635 by combining a VECREDUCE_AND/OR into VECREDUCE_UMIN/UMAX (if latter is legal but former is not) for zero-or-all-ones boolean reductions (which are detected based on sign bits). Differential Revision: https://reviews.llvm.org/D61398 llvm-svn: 360054
* Add FNeg support to InstructionSimplifyCameron McInally2019-05-061-0/+65
| | | | | | Differential Revision: https://reviews.llvm.org/D61573 llvm-svn: 360053
* [InstCombine] reduce code duplication; NFCISanjay Patel2019-05-061-22/+19
| | | | llvm-svn: 360051
* Modernize repmovsb implementation of x86 memcpy and allow runtime sizes.Guillaume Chatelet2019-05-061-97/+118
| | | | | | | | | | | | | | | | | | | | Summary: This is a prerequisite to RFC http://lists.llvm.org/pipermail/llvm-dev/2019-April/131973.html Reviewers: courbet Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61593 Fix typo. Turn this patch into an NFC. Addressing comments llvm-svn: 360050
* [X86] Fix uninitialized members in constructor warnings. NFCI.Simon Pilgrim2019-05-062-3/+3
| | | | | | Initialize all member variables in X86ATTInstPrinter and X86DAGToDAGISel constructors to fix cppcheck warning. llvm-svn: 360047
* Fix compilation warnings when compiling with GCC 7.3Alexandre Ganea2019-05-061-0/+6
| | | | | | Differential Revision: https://reviews.llvm.org/D61046 llvm-svn: 360044
* [PowerPC] Fix erroneous condition for converting uint-to-fp vector conversionNemanja Ivanovic2019-05-061-3/+2
| | | | | | | | | | | | | | A condition for exiting the legalization of v4i32 conversion to v2f64 through extract/convert/build erroneously checks for the extract having type i32. This is not adequate as smaller extracts are actually legalized to i32 as well. Furthermore, an early exit is missing which means that we only check that both extracts are from the same vector if that check fails. As a result, both cases in the included test case fail - the first gets a select error and the second generates incorrect code. The culprit commit is r274535. llvm-svn: 360043
* X86DAGToDAGISel::tryVPTESTM - fix uninitialized variable warning. NFCI.Simon Pilgrim2019-05-061-1/+1
| | | | | | findBroadcastedOp should always initialize the value if it returns true but static-analyzer isn't great at recognising this. llvm-svn: 360037
* [LoadStoreVectorizer] vectorizeStoreChain - ensure we find a store type.Simon Pilgrim2019-05-061-1/+2
| | | | | | | | Properly initialize store type to null then ensure we find a real store type in the chain. Fixes scan-build null dereference warning and makes the code clearer. llvm-svn: 360031
* [X86] X86InstrInfo::findThreeSrcCommutedOpIndices - fix unread variable warning.Simon Pilgrim2019-05-061-1/+2
| | | | | | scan-build was reporting that CommutableOpIdx1 never used its original initialized value - move it down to where its first used to make the real initialization more obvious (and matches the comment that's there). llvm-svn: 360028
* [X86] lowerVectorShuffle - use any_of to detect out of bounds shuffle ↵Simon Pilgrim2019-05-061-9/+8
| | | | | | | | indices. NFCI. Fixes cppcheck local shadow warning as well. llvm-svn: 360027
* [SimplifyLibCalls] Simplify bcmp too.Clement Courbet2019-05-061-1/+19
| | | | | | | | | | | | | | Summary: Fixes PR40699. Reviewers: gchatelet Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61585 llvm-svn: 360021
OpenPOWER on IntegriCloud