summaryrefslogtreecommitdiffstats
path: root/llvm/test
Commit message (Collapse)AuthorAgeFilesLines
...
* Utilize new SDNode flag functionality to expand current support for fdivMichael Berg2018-06-152-14/+28
| | | | | | | | | | | | | | Summary: This patch originated from D46562 and is a proper subset, with some issues addressed. Reviewers: spatel, hfinkel, wristow, arsenm Reviewed By: spatel Subscribers: wdng, nhaehnle Differential Revision: https://reviews.llvm.org/D47954 llvm-svn: 334862
* [PowerPC] Add support for high and higha symbol modifiers on tls modifers.Sean Fertile2018-06-153-1/+313
| | | | | | | | | | Enables using the high and high-adjusted symbol modifiers on thread local storage modifers in powerpc assembly. Needed to be able to support 64 bit thread-pointer and dynamic-thread-pointer access sequences. Differential Revision: https://reviews.llvm.org/D47754 llvm-svn: 334856
* [PPC64] Support "symbol@high" and "symbol@higha" symbol modifers.Sean Fertile2018-06-152-5/+25
| | | | | | | | | | Add support for the "@high" and "@higha" symbol modifiers in powerpc64 assembly. The modifiers represent accessing the segment consiting of bits 16-31 of a 64-bit address/offset. Differential Revision: https://reviews.llvm.org/D47729 llvm-svn: 334855
* Move redundant-vf2-cost.ll test to X86 directoryDiego Caballero2018-06-151-0/+0
| | | | | | | | redundant-vf2-cost.ll is X86 specific. Moved from test/Transforms/LoopVectorize/redundant-vf2-cost.ll to test/Transforms/LoopVectorize/X86/redundant-vf2-cost.ll llvm-svn: 334854
* [llvm-mca][x86] Add Generic cpu resource testsSimon Pilgrim2018-06-1521-0/+9934
| | | | | | | | Added a Generic x86 cpu set of resource tests to allow us to check all ISAs. We currently use SandyBridge as our generic CPU model, but it's better if we actually duplicate these tests for if/when we change the model, it also means we don't end up polluting the SandyBridge folder with tests for ISAs it doesn't support. llvm-svn: 334853
* [X86] Lowering sqrt intrinsics to native IRTomasz Krupa2018-06-1517-260/+423
| | | | | | | | | | | | | | Summary: Complementary patch to lowering sqrt intrinsics in Clang. Reviewers: craig.topper, spatel, RKSimon, DavidKreitzer, uriel.k Reviewed By: craig.topper Subscribers: tkrupa, mike.dvoretsky, llvm-commits Differential Revision: https://reviews.llvm.org/D41599 llvm-svn: 334849
* [X86] Prevent folding stack reloads into instructions in hasUndefRegUpdate.Craig Topper2018-06-152-13/+20
| | | | | | An earlier commit prevented folds from the peephole pass by checking for IMPLICIT_DEF. But later in the pipeline IMPLICIT_DEF just becomes and Undef flag on the input register so we need to check for that case too. llvm-svn: 334848
* Remove <undef> from rematerialized full registerKrzysztof Parzyszek2018-06-151-0/+37
| | | | | | | | | | | When coalescing a small register into a subregister of a larger register, if the larger register is rematerialized, the function updateRegDefUses can add an <undef> flag to the rematerialized definition (since it's treating it as only definining the coalesced subregister). While with that assumption doing so is not incorrect, make sure to remove the flag later on after the call to updateRegDefUses. llvm-svn: 334845
* [InstCombine] Avoid iteration/mutation conflictJoseph Tremoulet2018-06-151-0/+38
| | | | | | | | | | | | | | | | | | Summary: When iterating users of a multiply in processUMulZExtIdiom, the call to setOperand in the truncation case may replace the use being visited; make sure the iterator has been advanced before doing that replacement. Reviewers: majnemer, davide Reviewed By: davide Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D48192 llvm-svn: 334844
* [AArch64][SVE] Asm: Support for CPY SIMD/FP and GPR instructions.Sander de Smalen2018-06-154-0/+368
| | | | | | | Predicated splat/copy of SIMD/FP register or general purpose register to SVE vector, along with MOV-aliases. llvm-svn: 334842
* [LV] Prevent LV to run cost model twice for VF=2Diego Caballero2018-06-151-0/+34
| | | | | | | | | | | | | | This is a minor fix for LV cost model, where the cost for VF=2 was computed twice when the vectorization of the loop was forced without specifying a VF. Reviewers: xusx595, hsaito, fhahn, mkuper Reviewed By: hsaito, xusx595 Differential Revision: https://reviews.llvm.org/D48048 llvm-svn: 334840
* [AArch64][SVE] Asm: Support for INC/DEC (scalar) instructions.Sander de Smalen2018-06-1516-0/+1558
| | | | | | | | | | | | | Increment/decrement scalar register by (scaled) element count given by predicate pattern, e.g. 'incw x0, all, mul #4'. Reviewers: rengolin, fhahn, SjoerdMeijer, samparker, javed.absar Reviewed By: SjoerdMeijer Differential Revision: https://reviews.llvm.org/D47713 llvm-svn: 334838
* AMDGPU: Add combine for short vector extract_vector_eltsMatt Arsenault2018-06-153-0/+134
| | | | | | | | | | Try to access pieces 4 bytes at a time. This helps various hasOneUse extract_vector_elt combines, such as load width reductions. Avoids test regressions in a future commit. llvm-svn: 334836
* AMDGPU: Make v4i16/v4f16 legalMatt Arsenault2018-06-1516-175/+392
| | | | | | | Some image loads return these, and it's awkward working around them not being legal. llvm-svn: 334835
* [MCA] Add -summary-view optionRoman Lebedev2018-06-154-72/+18
| | | | | | | | | | | | | | | | | | | Summary: While that is indeed a quite interesting summary stat, there are cases where it does not really add anything other than consuming extra lines. Declutters the output of D48190. Reviewers: RKSimon, andreadb, courbet, craig.topper Reviewed By: andreadb Subscribers: javed.absar, gbedwell, llvm-commits Differential Revision: https://reviews.llvm.org/D48209 llvm-svn: 334833
* [MCA][x86][NFC] Add tests for -register-file-stats, -scheduler-statsRoman Lebedev2018-06-152-0/+134
| | | | | | | | | | | | | | | | Summary: There does not seem to be any other tests for this. Split off from D47676. Reviewers: RKSimon, craig.topper, courbet, andreadb Reviewed By: andreadb Subscribers: javed.absar, gbedwell, llvm-commits Differential Revision: https://reviews.llvm.org/D48190 llvm-svn: 334832
* [AArch64][SVE] Asm: Support for FADD, FMUL and FMAX immediate instructions.Sander de Smalen2018-06-156-0/+249
| | | | | | | | | | Reviewers: rengolin, fhahn, SjoerdMeijer, samparker, javed.absar Reviewed By: javed.absar Differential Revision: https://reviews.llvm.org/D47712 llvm-svn: 334831
* Re-apply "[DebugInfo] Check size of variable in ConvertDebugDeclareToDebugValue"Bjorn Pettersson2018-06-153-2/+107
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is r334704 (which was reverted in r334732) with a fix for types like x86_fp80. We need to use getTypeAllocSizeInBits and not getTypeStoreSizeInBits to avoid dropping debug info for such types. Original commit msg: > Summary: > Do not convert a DbgDeclare to DbgValue if the store > instruction only refer to a fragment of the variable > described by the DbgDeclare. > > Problem was seen when for example having an alloca for an > array or struct, and there were stores to individual elements. > In the past we inserted a DbgValue intrinsics for each store, > just as if the store wrote the whole variable. > > When handling store instructions we insert a DbgValue that > indicates that the variable is "undefined", as we do not know > which part of the variable that is updated by the store. > > When ConvertDebugDeclareToDebugValue is used with a load/phi > instruction we assert that the referenced value is large enough > to cover the whole variable. Afaict this should be true for all > scenarios where those methods are used on trunk. If the assert > blows in the future I guess we could simply skip to insert a > dbg.value instruction. > > In the future I think we should examine which part of the variable > that is accessed, and add a DbgValue instrinsic with an appropriate > DW_OP_LLVM_fragment expression. > > Reviewers: dblaikie, aprantl, rnk > > Reviewed By: aprantl > > Subscribers: JDevlieghere, llvm-commits > > Tags: #debug-info > > Differential Revision: https://reviews.llvm.org/D48024 llvm-svn: 334830
* DAG: Fix creating concat_vectors with illegal typeMatt Arsenault2018-06-151-99/+110
| | | | | | | Test passes as is, but fails with future patch to make v4i16/v4f16 legal. llvm-svn: 334823
* [SLP][X86] Add AVX2 run to POW2 SDIV TestsSimon Pilgrim2018-06-151-1/+2
| | | | | | Non-uniform pow2 tests are only make sense on targets with fast (low cost) non-uniform shifts llvm-svn: 334821
* [SLP][X86] Regenerate POW2 SDIV TestsSimon Pilgrim2018-06-151-9/+91
| | | | | | Added non-uniform pow2 test as well llvm-svn: 334819
* [InstCombine] Recommit: Fold (x << y) >> y -> x & (-1 >> y)Roman Lebedev2018-06-152-14/+13
| | | | | | | | | | | | | | | | | | | | | | | | Summary: We already do it for splat constants, but not just values. Also, undef cases are mostly non-functional. The original commit was reverted because it broke tests for amdgpu backend, which i didn't check. Now, the backed was updated to recognize these new patterns, so we are good. https://bugs.llvm.org/show_bug.cgi?id=37603 https://rise4fun.com/Alive/cplX Reviewers: spatel, craig.topper, mareko, bogner, rampitec, nhaehnle, arsenm Reviewed By: spatel, rampitec, nhaehnle Subscribers: wdng, nhaehnle, llvm-commits Differential Revision: https://reviews.llvm.org/D47980 llvm-svn: 334818
* [AMDGPU] Recognize x & ~(-1 << y) pattern.Roman Lebedev2018-06-151-45/+15
| | | | | | | | | | | | | | | | Summary: The same pattern as D48010, but this one is IR-canonical as of D47428. Reviewers: nhaehnle, bogner, tstellar, arsenm Reviewed By: arsenm Subscribers: arsenm, kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits Tags: #amdgpu Differential Revision: https://reviews.llvm.org/D48012 llvm-svn: 334817
* [AMDGPU] Recognize x & ((1 << y) - 1) pattern.Roman Lebedev2018-06-151-39/+15
| | | | | | | | | | | | | | | | | | | | | Summary: As a followup for D48007. Since we already handle `x << (bitwidth - y) >> (bitwidth - y)` pattern, which does not have ub for both the edge cases (`y == 0`, `y == bitwidth`), i think also handling a pattern that is ub for `y == bitwidth` should be fine. Reviewers: nhaehnle, bogner, tstellar, arsenm Reviewed By: arsenm Subscribers: arsenm, kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits Tags: #amdgpu Differential Revision: https://reviews.llvm.org/D48010 llvm-svn: 334816
* [AMDGPU] Recognize x & (-1 >> (32 - y)) pattern.Roman Lebedev2018-06-151-30/+10
| | | | | | | | | | | | | | | | | | | | | Summary: D47980 will canonicalize the `x << (32 - y) >> (32 - y)`, which is the pattern the AMDGPU expects to `x & (-1 >> (32 - y))`, which is not recognized by AMDGPU. Thus, it needs to be recognized, too. Reviewers: nhaehnle, bogner, tstellar, arsenm Reviewed By: arsenm Subscribers: arsenm, kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits Tags: #amdgpu Differential Revision: https://reviews.llvm.org/D48007 llvm-svn: 334815
* NFC: Regenerating x86-sse41.ll test for InstCombineMikhail Dvoretckii2018-06-151-4/+4
| | | | | | Test regenerated to reduce noise in further patches. llvm-svn: 334806
* Revert r334802 "[X86] Prevent folding stack reloads with instructions that ↵Craig Topper2018-06-152-12/+13
| | | | | | | | have an undefined register update." There's a typo causing the build to fail. llvm-svn: 334803
* [X86] Prevent folding stack reloads with instructions that have an undefined ↵Craig Topper2018-06-152-13/+12
| | | | | | | | register update. We want to keep the load unfolded so we can use the same register for both sources to avoid a false dependency. llvm-svn: 334802
* [X86] Add more instructions to the memory folding tables using the ↵Craig Topper2018-06-155-34/+17
| | | | | | | | | | autogenerated table as a guide. I think this covers most of the unmasked vector instructions. We're still missing a lot of the masked instructions. There are some test changes here because of the new folding support. I don't think these particular cases should be folded because it creates an undef register dependency. I think the changes introduced in r334175 are not handling stack folding. They're only blocking the peephole pass. llvm-svn: 334800
* [X86] Fix some checks to use X86 instead of X32.Craig Topper2018-06-151-96/+96
| | | | | | These tests were recently updated so it looks like gone wrong. llvm-svn: 334786
* Make uitofp and sitofp defined on overflow.Eli Friedman2018-06-141-2/+2
| | | | | | | | | | | IEEE 754 defines the expected result on overflow. As far as I know, hardware implementations (of f16), and compiler-rt (__floatuntisf) correctly return +-Inf on overflow. And I can't think of any useful transform that would take advantage of overflow being undefined here. Differential Revision: https://reviews.llvm.org/D47807 llvm-svn: 334777
* easing the constraint for isNegatibleForFree and GetNegatedExpressionMichael Berg2018-06-141-7/+5
| | | | | | | | | | | | | | | | | Summary: Here we relax the old constraint which utilized unsafe with the TargetOption flag HonorSignDependentRoundingFPMathOption, with the assertion that unsafe is no longer needed or never was required for correctness on FDIV/FMUL. Reviewers: spatel, hfinkel, wristow, arsenm, javed.absar Reviewed By: spatel Subscribers: efriedma, wdng, tpr Differential Revision: https://reviews.llvm.org/D48057 llvm-svn: 334769
* [MSSA] Print more optimization informationGeorge Burgess IV2018-06-141-4/+4
| | | | | | | | | | | | | | In particular, when asked to print a MemoryAccess, we'll now print where defs are optimized to, and we'll print optimized access types. This patch also introduces an operator<< to make printing AliasResults easier. Patch by Juneyoung Lee! Differential Revision: https://reviews.llvm.org/D47860 llvm-svn: 334760
* [x86] be more selective about converting 'and' to shuffle (PR37749)Sanjay Patel2018-06-141-20/+20
| | | | | | | | | | | | | | | | | isVectorClearMaskLegal() is the TLI hook used by the generic DAGCombiner::XformToShuffleWithZero(). We've grown to accomodate/expect this transform to shuffle (disabling it more generally results in many regressions). So I'm narrowly excluding the 256-bit types that clearly are not worthwhile for AVX1. I think in most cases we are able to recover by converting the shuffle back into 'and' ops, but the cases in: https://bugs.llvm.org/show_bug.cgi?id=37749 ...show that there are cracks. llvm-svn: 334759
* AMDGPU/GlobalISel: Implement select() for @llvm.amdgcn.cvt.pkrtzTom Stellard2018-06-141-0/+44
| | | | | | | | | | | | Reviewers: arsenm, nhaehnle Reviewed By: arsenm Subscribers: kzhuravl, wdng, yaxunl, rovka, kristof.beyls, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D45907 llvm-svn: 334757
* Re-apply "[VirtRegRewriter] Avoid clobbering registers when expanding copy ↵Justin Bogner2018-06-142-0/+96
| | | | | | | | | | | | | | | | bundles" This is r334750 (which was reverted in r334754) with a fix for an uninitialized variable that was caught by msan. Original commit message: > If a copy bundle happens to involve overlapping registers, we can end > up with emitting the copies in an order that ends up clobbering some > of the subregisters. Since instructions in the copy bundle > semantically happen at the same time, this is incorrect and we need to > make sure we order the copies such that this doesn't happen. llvm-svn: 334756
* Revert "[VirtRegRewriter] Avoid clobbering registers when expanding copy ↵Justin Bogner2018-06-142-96/+0
| | | | | | | | | | | | bundles" There's an msan failure: http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fast/builds/19549 This reverts r334750. llvm-svn: 334754
* updating isNegatibleForFree and GetNegatedExpression with fmf for faddMichael Berg2018-06-141-0/+11
| | | | | | | | | | | | | | Summary: A FMF constraint is added to FADD with unsafe still available as the fallback Reviewers: spatel, wristow, arsenm, hfinkel Reviewed By: spatel Subscribers: wdng Differential Revision: https://reviews.llvm.org/D48180 llvm-svn: 334753
* [WebAssembly] Ignore explicit section names for functionsSam Clegg2018-06-141-0/+28
| | | | | | | | | | | | | | | WebAssembly doesn't support more than one function per section and we rely on function sections being unique. This change ignores the section provided by the function to avoid two functions being in the same section. Without this change the object writer produces the following error for this test: LLVM ERROR: section already has a defining function: baz Differential Revision: https://reviews.llvm.org/D48178 llvm-svn: 334752
* [VirtRegRewriter] Avoid clobbering registers when expanding copy bundlesJustin Bogner2018-06-142-0/+96
| | | | | | | | | | | | If a copy bundle happens to involve overlapping registers, we can end up with emitting the copies in an order that ends up clobbering some of the subregisters. Since instructions in the copy bundle semantically happen at the same time, this is incorrect and we need to make sure we order the copies such that this doesn't happen. Differential Revision: https://reviews.llvm.org/D48154 llvm-svn: 334750
* [x86] add tests for AVX1 FP logic op abuse (PR37749); NFCSanjay Patel2018-06-141-114/+289
| | | | | | Also, add a RUN for AVX2 to make sure that's good. llvm-svn: 334744
* [llvm-mca] Add tests for instructions that implicitly clear the upper ↵Andrea Di Biagio2018-06-142-0/+176
| | | | | | | | | | | | | portion of a super-register. On x86-64, a write to register EAX implicitly clears the upper half or RAX. 128-bit AVX instructions clear the upper 128-bit of the YMM register that aliases the XMM definition register. llvm-mca doesn't know about register writes that implicitly clear the upper portion of an aliasing super-register. This issue will be fixed in a future patch. llvm-svn: 334742
* [X86] Lowering Mask Scalar intrinsics to native IR (LLVM part)Tomasz Krupa2018-06-142-0/+783
| | | | | | | | | | | | | | | Summary: Complementary patch to lowering add, sub, mul and div mask scalar intrinsics in Clang. Reviewers: craig.topper, sroland, spatel, RKSimon Reviewed by: craig.topper Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D47978 llvm-svn: 334740
* [SCEV] Simplify zext/trunc idiom that appears when handling bitmasks.Justin Lebar2018-06-142-2/+16
| | | | | | | | | | | | | | | | | | | Summary: Specifically, we transform zext(2^K * (trunc X to iN)) to iM -> 2^K * (zext(trunc X to i{N-K}) to iM)<nuw> This is helpful because pulling the 2^K out of the zext allows further optimizations. Reviewers: sanjoy Subscribers: hiraditya, llvm-commits, timshen Differential Revision: https://reviews.llvm.org/D48158 llvm-svn: 334737
* [SCEV] Simplify trunc-of-add/mul to add/mul-of-trunc under more circumstances.Justin Lebar2018-06-145-8/+34
| | | | | | | | | | | | | | | | | | | | Summary: Previously we would do this simplification only if it did not introduce any new truncs (excepting new truncs which replace other cast ops). This change weakens this condition: If the number of truncs stays the same, but we're able to transform trunc(X + Y) to X + trunc(Y), that's still simpler, and it may open up additional transformations. While we're here, also clean up some duplicated code. Reviewers: sanjoy Subscribers: hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D48160 llvm-svn: 334736
* Revert rL334704: "[DebugInfo] Check size of variable in ↵Bjorn Pettersson2018-06-142-50/+2
| | | | | | | | | | ConvertDebugDeclareToDebugValue" This reverts commit r334704. Buildbots detected an assertion in "test tsan in debug compiler-rt build". llvm-svn: 334732
* [llvm-mca] Add another test for partial register stalls.Andrea Di Biagio2018-06-141-0/+43
| | | | | | | | | This test checks that a physical register is correctly allocated for the partial write to register BX. The ADD instruction has to wait for the write to RBX (and BX) before being executed. llvm-svn: 334730
* [X86] Add more vector instructions to the memory folding table using the ↵Craig Topper2018-06-141-42/+19
| | | | | | | | autogenerated table as a guide. The test cahnge is because we now fold stack reload into RNDSCALE and RNDSCALE can be turned into ROUND by EVEX->VEX. llvm-svn: 334728
* [CostModel][AArch64] Add cost tests for ALTERNATE/SELECT style shuffle masksSimon Pilgrim2018-06-141-0/+95
| | | | | | Precursor to fixing a regression with SLP vectorizer for supporting SELECT shuffles (vs the current ALTERNATE) llvm-svn: 334714
* [DWARFv5] Tolerate files not all having an MD5 checksum.Paul Robinson2018-06-144-12/+58
| | | | | | | | | | | | | | | | | | In some cases, for example when compiling a preprocessed file, the front-end is not able to provide an MD5 checksum for all files. When that happens, omit the MD5 checksums from the final DWARF, because DWARF doesn't have a way to indicate that some but not all files have a checksum. When assembling a .s file, and some but not all .file directives provide an MD5 checksum, issue a warning and don't emit MD5 into the DWARF. Fixes PR37623. Differential Revision: https://reviews.llvm.org/D48135 llvm-svn: 334710
OpenPOWER on IntegriCloud