summaryrefslogtreecommitdiffstats
path: root/llvm/test
Commit message (Collapse)AuthorAgeFilesLines
* [AArch64][SVE] Asm: Improve diagnostics further when +sve is not specifiedSander de Smalen2017-12-186-69/+69
| | | | | | | | | | | | | | Summary: Patch [4/4] in a series to add parsing of predicates and properly parse SVE ZIP1/ZIP2 instructions. This patch further improves diagnostic messages for when the SVE feature is not specified. Reviewers: rengolin, fhahn, olista01, echristo, efriedma Reviewed By: fhahn Subscribers: sdardis, aemerson, javed.absar, tschuett, llvm-commits, kristof.beyls Differential Revision: https://reviews.llvm.org/D40363 llvm-svn: 320992
* Reland "[mips] Fix the target specific instruction verifier"Simon Dardis2017-12-1826-35/+89
| | | | | | | | | | | | | Fix an off by one error in the bounds checking for 'dinsu' and update the ranges in the test comments so that they are accurate. This version has the correct commit message. Reviewers: atanasyan Differential Revision: https://reviews.llvm.org/D41183 llvm-svn: 320991
* [Memcpy Loop Lowering] Remove the fixed int8 lowering.Sean Fertile2017-12-182-79/+53
| | | | | | | | Switch over to the lowering that uses target supplied operand types. Differential Revision: https://reviews.llvm.org/D41201 llvm-svn: 320989
* [TableGen][AsmMatcherEmitter] Only choose specific diagnostic for enabled ↵Sander de Smalen2017-12-1829-218/+220
| | | | | | | | | | | | | | | | | | | | | | | instruction Summary: When emitting a diagnostic for an invalid operand, a specific diagnostic should only be reported when the instruction being matched is actually enabled by the feature flags. Patch [3/4] in a series to add parsing of predicates and properly parse SVE ZIP1/ZIP2 instructions. This patch fixes bogus diagnostic messages for when the SVE feature is not specified. Reviewers: rengolin, craig.topper, olista01, sdardis, stoklund Reviewed By: olista01, sdardis Subscribers: fhahn, javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D40362 llvm-svn: 320986
* [LVI] Support for ashr in LVIMax Kazantsev2017-12-181-0/+27
| | | | | | | | | | Enhance LVI to analyze the ‘ashr’ binary operation. This leverages the infrastructure in ConstantRange for the ashr operation. Patch by Surya Kumari Jangala! Differential Revision: https://reviews.llvm.org/D40886 llvm-svn: 320983
* Revert "[mips] Fix the target specific instruction verifier"Simon Dardis2017-12-1826-89/+35
| | | | | | This reverts commit r320974. The commit message lacked the Differential Revison: line. llvm-svn: 320975
* [mips] Fix the target specific instruction verifierSimon Dardis2017-12-1826-35/+89
| | | | | | | | | | | Fix an off by one error in the bounds checking for 'dinsu' and update the ranges in the test comments so that they are accurate. Reviewers: atanasyan https://reviews.llvm.org/D41183 llvm-svn: 320974
* [AArch64][SVE] Asm: Add ZIP1/ZIP2 instructions (predicate/data vectors)Sander de Smalen2017-12-184-0/+294
| | | | | | | | | | | | | | Summary: Patch [2/4] in a series to add parsing of predicates and properly parse SVE ZIP1/ZIP2 instructions. Reviewers: rengolin, kristof.beyls, fhahn, mcrosier, evandro Reviewed By: fhahn Subscribers: aemerson, javed.absar, llvm-commits, tschuett Differential Revision: https://reviews.llvm.org/D40361 llvm-svn: 320973
* [AArch64][SVE] Asm: Add SVE predicate register definitions and parsing supportSander de Smalen2017-12-182-0/+29
| | | | | | | | | | | | | | Summary: Patch [1/4] in a series to add parsing of predicates and properly parse SVE ZIP1/ZIP2 instructions. Reviewers: rengolin, kristof.beyls, fhahn, mcrosier, evandro, echristo, efriedma Reviewed By: fhahn Subscribers: aemerson, javed.absar, llvm-commits, tschuett Differential Revision: https://reviews.llvm.org/D40360 llvm-svn: 320970
* AArch64: work around how Cyclone handles "movi.2d vD, #0".Tim Northover2017-12-184-21/+30
| | | | | | | | | | | For Cylone, the instruction "movi.2d vD, #0" is executed incorrectly in some rare circumstances. Work around the issue conservatively by avoiding the instruction entirely. This patch changes CodeGen so that problematic instructions are never generated, and the AsmParser so that an equivalent instruction is used (with a warning). llvm-svn: 320965
* [TargetLibraryInfo] Discard library functions with incorrectly sized integersIgor Laevsky2017-12-181-0/+16
| | | | | | Differential Revision: https://reviews.llvm.org/D41184 llvm-svn: 320964
* [ARM] Adjust test checksSam Parker2017-12-181-2/+2
| | | | | | Correct the CHECK-LABELS of a couple of dag combine tests. llvm-svn: 320963
* [DAGCombine] Move AND nodes to multiple load leavesSam Parker2017-12-181-378/+328
| | | | | | | | | | | | | Search from AND nodes to find whether they can be propagated back to loads, so that the AND and load can be combined into a narrow load. We search through OR, XOR and other AND nodes and all bar one of the leaves are required to be loads or constants. The exception node then needs to be masked off meaning that the 'and' isn't removed, but the loads(s) are narrowed still. Differential Revision: https://reviews.llvm.org/D41177 llvm-svn: 320962
* [X86] Use mattr instead of mcpu in some of the cost model tests.Craig Topper2017-12-184-39/+33
| | | | | | | | Based on the names of the check lines, features seems more appropriate that cpu. Spotted while prototyping my patch to make 512-bit vectors illegal on SKX sometimes. llvm-svn: 320959
* [SROA] Disable non-whole-alloca splits by defaultHiroshi Inoue2017-12-183-59/+17
| | | | | | | This patch introduce a switch to control splitting of non-whole-alloca slices with default off. The switch will be default on again after fixing an issue reported in PR35657. llvm-svn: 320958
* [CGP] Fix the handling select inst in complex addressing modeSerguei Katkov2017-12-181-0/+21
| | | | | | | | | | | | | | | When we put the value in select placeholder we must pass the value through simplification tracker due to the value might be already simplified and erased. This is a fix for PR35658. Reviewers: john.brawn, uabelho Reviewed By: john.brawn Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D41251 llvm-svn: 320956
* [x86] add tests for finite libcall lowering (PR35672); NFCSanjay Patel2017-12-181-0/+52
| | | | llvm-svn: 320955
* Re-commit "Properly handle multi-element and dynamically sized allocas in ↵Bjorn Steinbrink2017-12-171-0/+17
| | | | | | | | | getPointerDereferenceableBytes()"" llvm-clang-x86_64-expensive-checks-win is still broken, so the failure seems unrelated. llvm-svn: 320953
* [X86] Add test cases that show cases where buildvector of extract and ↵Craig Topper2017-12-171-0/+676
| | | | | | | | inserts should be turned into fmsubadd. This is a follow up to the fmaddsub support added in r320950. Hopefully in the future we can fix lowering to handle this fmsubadd too. llvm-svn: 320951
* [X86] Make the code that creates fmaddsub from build_vector of extracts and ↵Craig Topper2017-12-171-0/+269
| | | | | | | | | | | | | | | | | | | inserts functional and add tests. Summary: We had no tests for this and we couldn't do the optimization because of a bad use count check. We need to know how many non-undef pieces of the build vector were filled in and ensure our use count is equal to that. But on the shuffle combine version we need the use count to be 2. The missing coverage was noticed during the review of D40335. Reviewers: RKSimon, zvi, spatel Reviewed By: RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D41133 llvm-svn: 320950
* [X86] Regenerate truncated rotation tests + add missing 32-bit checksSimon Pilgrim2017-12-171-8/+41
| | | | llvm-svn: 320949
* Revert "Properly handle multi-element and dynamically sized allocas in ↵Bjorn Steinbrink2017-12-171-17/+0
| | | | | | | | | | getPointerDereferenceableBytes()" This reverts commit 217067d5179882de9deb60d2e866befea4c126e7. Fails on llvm-clang-x86_64-expensive-checks-win llvm-svn: 320945
* Properly handle byval arguments in getPointerDereferenceableBytes()Bjorn Steinbrink2017-12-171-1/+15
| | | | | | | | | | | | | | Summary: For byval arguments, the number of dereferenceable bytes is equal to the size of the pointee, not the pointer. Reviewers: hfinkel, rnk Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D41305 llvm-svn: 320939
* Properly handle multi-element and dynamically sized allocas in ↵Bjorn Steinbrink2017-12-171-0/+17
| | | | | | | | | | | | getPointerDereferenceableBytes() Reviewers: hfinkel, rnk Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D41288 llvm-svn: 320938
* [X86] Use extract_vector_elt instead of X86ISD::VEXTRACT for isel of vXi1 ↵Craig Topper2017-12-171-4/+4
| | | | | | extractions. llvm-svn: 320937
* [X86] Canonicalize extract_vector_elt from vXi1 to always return MVT::i32.Craig Topper2017-12-173-3686/+3686
| | | | | | This allows us to remove some isel patterns that allowed MVT::i8 result type. llvm-svn: 320936
* [X86][AVX] lowerVectorShuffleAsBroadcast - aggressively peek through BITCASTsSimon Pilgrim2017-12-162-113/+62
| | | | | | | | Assuming we can safely adjust the broadcast index for the new type to keep it suitably aligned, then peek through BITCASTs when looking for the broadcast source. Fixes PR32007 llvm-svn: 320933
* [X86][AVX] Fix failed broadcast foldSimon Pilgrim2017-12-161-16/+4
| | | | | | Strip excess BITCASTs from EXTRACT_SUBVECTOR input llvm-svn: 320930
* [Memcpy Loop Lowering] Only calculate residual size/bytes copied when needed.Sean Fertile2017-12-161-8/+4
| | | | | | | | If the loop operand type is int8 then there will be no residual loop for the unknown size expansion. Dont create the residual-size and bytes-copied values when they are not needed. llvm-svn: 320929
* [X86] When using vpopcntdq for ctpop of v8i16 vectors, only promote to v8i32.Craig Topper2017-12-162-18/+21
| | | | | | Previously we promoted to v8i64, but we don't need to go all the way to 512-bits. If we have VLX we can use the 256-bit instruction. And even if we don't have VLX we can widen v8i32 to v16i32 and drop the upper half. llvm-svn: 320926
* [InstCombine] Regenerate FMUL/FMA combine tests with update_test_checks.pySimon Pilgrim2017-12-162-100/+186
| | | | llvm-svn: 320922
* [InstCombine] canonicalize shifty abs(): ashr+add+xor --> cmp+neg+selSanjay Patel2017-12-161-12/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We want to do this for 2 reasons: 1. Value tracking does not recognize the ashr variant, so it would fail to match for cases like D39766. 2. DAGCombiner does better at producing optimal codegen when we have the cmp+sel pattern. More detail about what happens in the backend: 1. DAGCombiner has a generic transform for all targets to convert the scalar cmp+sel variant of abs into the shift variant. That is the opposite of this IR canonicalization. 2. DAGCombiner has a generic transform for all targets to convert the vector cmp+sel variant of abs into either an ABS node or the shift variant. That is again the opposite of this IR canonicalization. 3. DAGCombiner has a generic transform for all targets to convert the exact shift variants produced by #1 or #2 into an ISD::ABS node. Note: It would be an efficiency improvement if we had #1 go directly to an ABS node when that's legal/custom. 4. The pattern matching above is incomplete, so it is possible to escape the intended/optimal codegen in a variety of ways. a. For #2, the vector path is missing the case for setlt with a '1' constant. b. For #3, we are missing a match for commuted versions of the shift variants. 5. Therefore, this IR canonicalization can only help get us to the optimal codegen. The version of cmp+sel produced by this patch will be recognized in the DAG and converted to an ABS node when possible or the shift sequence when not. 6. In the following examples with this patch applied, we may get conditional moves rather than the shift produced by the generic DAGCombiner transforms. The conditional move is created using a target-specific decision for any given target. Whether it is optimal or not for a particular subtarget may be up for debate. define i32 @abs_shifty(i32 %x) { %signbit = ashr i32 %x, 31 %add = add i32 %signbit, %x %abs = xor i32 %signbit, %add ret i32 %abs } define i32 @abs_cmpsubsel(i32 %x) { %cmp = icmp slt i32 %x, zeroinitializer %sub = sub i32 zeroinitializer, %x %abs = select i1 %cmp, i32 %sub, i32 %x ret i32 %abs } define <4 x i32> @abs_shifty_vec(<4 x i32> %x) { %signbit = ashr <4 x i32> %x, <i32 31, i32 31, i32 31, i32 31> %add = add <4 x i32> %signbit, %x %abs = xor <4 x i32> %signbit, %add ret <4 x i32> %abs } define <4 x i32> @abs_cmpsubsel_vec(<4 x i32> %x) { %cmp = icmp slt <4 x i32> %x, zeroinitializer %sub = sub <4 x i32> zeroinitializer, %x %abs = select <4 x i1> %cmp, <4 x i32> %sub, <4 x i32> %x ret <4 x i32> %abs } > $ ./opt -instcombine shiftyabs.ll -S | ./llc -o - -mtriple=x86_64 -mattr=avx > abs_shifty: > movl %edi, %eax > negl %eax > cmovll %edi, %eax > retq > > abs_cmpsubsel: > movl %edi, %eax > negl %eax > cmovll %edi, %eax > retq > > abs_shifty_vec: > vpabsd %xmm0, %xmm0 > retq > > abs_cmpsubsel_vec: > vpabsd %xmm0, %xmm0 > retq > > $ ./opt -instcombine shiftyabs.ll -S | ./llc -o - -mtriple=aarch64 > abs_shifty: > cmp w0, #0 // =0 > cneg w0, w0, mi > ret > > abs_cmpsubsel: > cmp w0, #0 // =0 > cneg w0, w0, mi > ret > > abs_shifty_vec: > abs v0.4s, v0.4s > ret > > abs_cmpsubsel_vec: > abs v0.4s, v0.4s > ret > > $ ./opt -instcombine shiftyabs.ll -S | ./llc -o - -mtriple=powerpc64le > abs_shifty: > srawi 4, 3, 31 > add 3, 3, 4 > xor 3, 3, 4 > blr > > abs_cmpsubsel: > srawi 4, 3, 31 > add 3, 3, 4 > xor 3, 3, 4 > blr > > abs_shifty_vec: > vspltisw 3, -16 > vspltisw 4, 15 > vsubuwm 3, 4, 3 > vsraw 3, 2, 3 > vadduwm 2, 2, 3 > xxlxor 34, 34, 35 > blr > > abs_cmpsubsel_vec: > vspltisw 3, -16 > vspltisw 4, 15 > vsubuwm 3, 4, 3 > vsraw 3, 2, 3 > vadduwm 2, 2, 3 > xxlxor 34, 34, 35 > blr > Differential Revision: https://reviews.llvm.org/D40984 llvm-svn: 320921
* Move Transforms/LoopVectorize/consecutive-ptr-cg-bug.ll into the X86 ↵Hal Finkel2017-12-161-0/+0
| | | | | | | | subdirectory This test depends on X86's TTI; move into the X86 subdirectory. llvm-svn: 320914
* [LV] Extend InstWidening with CM_Widen_RecursiveHal Finkel2017-12-161-0/+68
| | | | | | | | | | | | | | | | | | | | Changes to the original scalar loop during LV code gen cause the return value of Legal->isConsecutivePtr() to be inconsistent with the return value during legal/cost phases (further analysis and information of the bug is in D39346). This patch is an alternative fix to PR34965 following the CM_Widen approach proposed by Ayal and Gil in D39346. It extends InstWidening enum with CM_Widen_Reverse to properly record the widening decision for consecutive reverse memory accesses and, consequently, get rid of the Legal->isConsetuviePtr() call in LV code gen. I think this is a simpler/cleaner solution to PR34965 than the one in D39346. Fixes PR34965. Patch by Diego Caballero, thanks! Differential Revision: https://reviews.llvm.org/D40742 llvm-svn: 320913
* [PowerPC, AsmParser] Enable the mnemonic spell correctorHal Finkel2017-12-161-0/+44
| | | | | | | | | | | r307148 added an assembly mnemonic spelling correction support and enabled it on ARM. This enables that support on PowerPC as well. Patch by Dmitry Venikov, thanks! Differential Revision: https://reviews.llvm.org/D40552 llvm-svn: 320911
* [X86] Add 128 and 256-bit VPOPCNTDQ instructions. Adjust some tablegen ↵Craig Topper2017-12-164-0/+291
| | | | | | | | classes LZCNT/POPCNT. I think when this instruction was first published it was only for a Knights CPU and thus VLX version was missing. llvm-svn: 320910
* [LTO] Update tests for r320905Vitaly Buka2017-12-162-2/+2
| | | | llvm-svn: 320909
* Remove trailing whitespaceVitaly Buka2017-12-161-1/+1
| | | | llvm-svn: 320907
* [LTO] Make processing of combined module more consistentVitaly Buka2017-12-1621-25/+57
| | | | | | | | | | | | | | | | | Summary: 1. Use stream 0 only for combined module. Previously if combined module was not processes ThinLTO used the stream for own output. However small changes in input, could trigger combined module and shuffle outputs making life of llvm::LTO harder. 2. Always process combined module and write output to stream 0. Processing empty combined module is cheap and allows llvm::LTO users to avoid implementing processing which is already done in llvm::LTO. Subscribers: mehdi_amini, inglorion, eraman, hiraditya Differential Revision: https://reviews.llvm.org/D41267 llvm-svn: 320905
* Add another missing -enable-import-metadata to testTeresa Johnson2017-12-161-1/+1
| | | | | | | | r320895 modified a test so that it needs -enable-import-metadata which is false by default for NDEBUG, found another place that needs this added. llvm-svn: 320903
* [SimplifyLibCalls] Inline calls to cabs when it's safe to do soHal Finkel2017-12-162-0/+124
| | | | | | | | | | | | When unsafe algerbra is allowed calls to cabs(r) can be replaced by: sqrt(creal(r)*creal(r) + cimag(r)*cimag(r)) Patch by Paul Walker, thanks! Differential Revision: https://reviews.llvm.org/D40069 llvm-svn: 320901
* Add -enable-import-metadata to testTeresa Johnson2017-12-161-1/+1
| | | | | | | r320895 modified a test so that it needs -enable-import-metadata which is false by default for NDEBUG. llvm-svn: 320899
* [ThinLTO] Enable importing of aliases as copy of aliaseeTeresa Johnson2017-12-167-51/+89
| | | | | | | | | | | | | | | | | | | | | | | Summary: This implements a missing feature to allow importing of aliases, which was previously disabled because alias cannot be available_externally. We instead import an alias as a copy of its aliasee. Some additional work was required in the IndexBitcodeWriter for the distributed build case, to ensure that the aliasee has a value id in the distributed index file (i.e. even when it is not being imported directly). This is a performance win in codes that have many aliases, e.g. C++ applications that have many constructor and destructor aliases. Reviewers: pcc Subscribers: mehdi_amini, inglorion, eraman, llvm-commits Differential Revision: https://reviews.llvm.org/D40747 llvm-svn: 320895
* Revert "Recommit "[DWARFv5] Dump an MD5 checksum in the line-table header.""Paul Robinson2017-12-151-10/+11
| | | | | | | This reverts commit 0afef672f63f0e4e91938656bc73424a8c058bfc. Still failing at runtime on bots. llvm-svn: 320888
* Recommit "[DWARFv5] Dump an MD5 checksum in the line-table header."Paul Robinson2017-12-151-11/+10
| | | | | | | | | | | Adds missing support for DW_FORM_data16. Update of r320852, fixing the unittest to use a hand-coded struct instead of std::array to guarantee data layout. Differential Revision: https://reviews.llvm.org/D41090 llvm-svn: 320886
* [Hexagon] Handle concat_vectors of all allowed HVX typesKrzysztof Parzyszek2017-12-153-0/+82
| | | | llvm-svn: 320865
* Re-commit : [LICM] Allow sinking when foldable in loopJun Bum Lim2017-12-151-0/+150
| | | | | | | | | | | | | | | | | | | | | | This recommits r320823 reverted due to the test failure in sink-foldable.ll and an unused variable. Added "REQUIRES: aarch64-registered-target" in the test and removed unused variable. Original commit message: Continue trying to sink an instruction if its users in the loop is foldable. This will allow the instruction to be folded in the loop by decoupling it from the user outside of the loop. Reviewers: hfinkel, majnemer, davidxl, efriedma, danielcdh, bmakam, mcrosier Reviewed By: hfinkel Subscribers: javed.absar, bmakam, mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D37076 llvm-svn: 320858
* Revert "[DWARFv5] Dump an MD5 checksum in the line-table header."Paul Robinson2017-12-151-10/+11
| | | | | | Unit test fails on some bots. llvm-svn: 320857
* [Hexagon] Fix operand-swapping PatFrag for atomic storesKrzysztof Parzyszek2017-12-151-0/+68
| | | | | | | PatFrag now has the atomicity information stored as bit fields. They need to be copied to the new PatFrag. llvm-svn: 320855
* [DWARFv5] Dump an MD5 checksum in the line-table header.Paul Robinson2017-12-151-11/+10
| | | | | | | | Adds missing support for DW_FORM_data16. Differential Revision: https://reviews.llvm.org/D41090 llvm-svn: 320852
OpenPOWER on IntegriCloud