summaryrefslogtreecommitdiffstats
path: root/llvm/test/CodeGen/X86
Commit message (Collapse)AuthorAgeFilesLines
* [ConstHoisting] choose to hoist when frequency is the same.Wei Mi2017-07-061-4/+48
| | | | | | | | | | | | The patch is to adjust the strategy of frequency based consthoisting: Previously when the candidate block has the same frequency with the existing blocks containing a const, it will not hoist the const to the candidate block. For that case, now we change the strategy to hoist the const if only existing blocks have more than one block member. This is helpful for reducing code size. Differential Revision: https://reviews.llvm.org/D35084 llvm-svn: 307328
* [X86][SSE] Tests for bitcasting iX integers to vXi1 boolean vectorsSimon Pilgrim2017-07-063-0/+7447
| | | | | | Including sign/zero extension to legal types llvm-svn: 307301
* [X86][SSE] Dropped -mcpu from bitcast+setcc testsSimon Pilgrim2017-07-062-100/+475
| | | | | | | | Use triple and attribute only for consistency Added SSE2/AVX tests on 256-bit vectors to test PACKSS behaviour llvm-svn: 307289
* [LSR] Narrow search space by filtering non-optimal formulae with the same ↵Wei Mi2017-07-061-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | ScaledReg and Scale. When the formulae search space is huge, LSR uses a series of heuristic to keep pruning the search space until the number of possible solutions are within certain limit. The big hammer of the series of heuristics is NarrowSearchSpaceByPickingWinnerRegs, which picks the register which is used by the most LSRUses and deletes the other formulae which don't use the register. This is a effective way to prune the search space, but quite often not a good way to keep the best solution. We saw cases before that the heuristic pruned the best formula candidate out of search space. To relieve the problem, we introduce a new heuristic called NarrowSearchSpaceByFilterFormulaWithSameScaledReg. The basic idea is in order to reduce the search space while keeping the best formula, we want to keep as many formulae with different Scale and ScaledReg as possible. That is because the central idea of LSR is to choose a group of loop induction variables and use those induction variables to represent LSRUses. An induction variable candidate is often represented by the Scale and ScaledReg in a formula. If we have more formulae with different ScaledReg and Scale to choose, we have better opportunity to find the best solution. That is why we believe pruning search space by only keeping the best formula with the same Scale and ScaledReg should be more effective than PickingWinnerReg. And we use two criteria to choose the best formula with the same Scale and ScaledReg. The first criteria is to select the formula using less non shared registers, and the second criteria is to select the formula with less cost got from RateFormula. The patch implements the heuristic before NarrowSearchSpaceByPickingWinnerRegs, which is the last resort. Testing shows we get 1.8% and 2% on two internal benchmarks on x86. llvm nightly testsuite performance is neutral. We also tried lsr-exp-narrow and it didn't help on the two improved internal cases we saw. Differential Revision: https://reviews.llvm.org/D34583 llvm-svn: 307269
* [X86][SSE4A] Add support for shuffle combining to INSERTQI.Simon Pilgrim2017-07-061-13/+4
| | | | llvm-svn: 307268
* [X86][SSE4A] Add test showing missed opportunities to combine INSERTQI shuffleSimon Pilgrim2017-07-061-0/+20
| | | | llvm-svn: 307265
* [x86] fix over-specified triple and auto-generate checks; NFCSanjay Patel2017-07-061-17/+25
| | | | llvm-svn: 307262
* [X86][SSE4A] Add support for shuffle combining to EXTRQ.Simon Pilgrim2017-07-061-24/+12
| | | | llvm-svn: 307254
* [X86][SSE4A] Add scheduling tests for SSE4A instructionsSimon Pilgrim2017-07-061-0/+95
| | | | llvm-svn: 307251
* {DAGCombiner] Fold (rot x, 0) -> xSimon Pilgrim2017-07-051-1/+0
| | | | llvm-svn: 307184
* [X86] Test bitfield loadstore tests on i686 as wellSimon Pilgrim2017-07-051-89/+162
| | | | llvm-svn: 307182
* [DAGCombiner] visitRotate patch to optimize pair of ROTR/ROTL instructions ↵Andrew Zhogin2017-07-051-4/+2
| | | | | | | | | | into one with combined shift operand. For two ROTR operations with shifts C1, C2; combined shift operand will be (C1 + C2) % bitsize. Differential revision: https://reviews.llvm.org/D12833 llvm-svn: 307179
* [X86][SSE] Dropped -mcpu from bitcast+setcc mask testsSimon Pilgrim2017-07-052-130/+130
| | | | | | Use triple and attribute only for consistency llvm-svn: 307176
* [GlobalIsel] allow x86_fp80 values to be dumped.Igor Breger2017-07-051-0/+18
| | | | | | | | | | | | | | | | Summary: Otherwise the fallback path fails with an assertion on x86_64 targets, when "x86_fp80" is encountered. Reviewers: t.p.northover, zvi, guyblank Reviewed By: zvi Subscribers: rovka, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D34975 llvm-svn: 307140
* Rewrite areNonVolatileConsecutiveLoads to use BaseIndexOffsetNirav Dave2017-07-056-177/+137
| | | | | | | | | | | | | | | | | | | | | | | | | | | Relanding after rewriting undef.ll test to avoid host-dependant endianness. As discussed in D34087, rewrite areNonVolatileConsecutiveLoads using generic checks. Also, propagate missing local handling from there to BaseIndexOffset checks. Tests of note: * test/CodeGen/X86/build-vector* - Improved. * test/CodeGen/BPF/undef.ll - Improved store alignment allows an additional store merge * test/CodeGen/X86/clear_upper_vector_element_bits.ll - This is a case we already do not handle well. Here, the DAG is improved, but scheduling causes a code size degradation. Reviewers: RKSimon, craig.topper, spatel, andreadb, filcab Subscribers: nemanjai, llvm-commits Differential Revision: https://reviews.llvm.org/D34472 llvm-svn: 307114
* NFC.Gadi Haber2017-07-041-210/+835
| | | | | | | | | | | | | | | Made some updates to the half.ll test under CodeGen to make it friendly to the update_llc_test_checks .py tool as follows: 1.Removing the llc flag -asm-verbose=false 2.Grouping the multiple check-prefix directives 3.Apply update_llc_test_checks.py tool on the test This change is needed to easily update scheduling changes in an upcoming patch. Reviewers: zvi, RKSimon, craig.topper Differential Revision: https://reviews.llvm.org/D34934 llvm-svn: 307108
* [X86][SSE4A] Add support for combining from non-v16i8 EXTRQI/INSERTQI shufflesSimon Pilgrim2017-07-042-36/+41
| | | | | | With the improved shuffle decoding we can now combine EXTRQI/INSERTQI shuffles from non-v16i8 vector types llvm-svn: 307099
* [FastISel] Move gc intrinsic test to X86 directoryAnna Thomas2017-07-041-0/+57
| | | | | | | | | Move from generic to X86 directory since gc intrinsics only supposed in X86 64 bit. Add target triple as well. Fixes build failure in i686-linux-RA caused by rL307084. llvm-svn: 307086
* [X86] Add combine tests for vector rotatesSimon Pilgrim2017-07-041-0/+83
| | | | | | Reference tests for D12833 llvm-svn: 307073
* NFC commit.Gadi Haber2017-07-041-25/+26
| | | | | | | | | | | | Converting the Codegen test "extractelement-legalization-store-ordering.ll" to be "update_llc_test_checks" friendly. The changes to the test are needed for an upcoming scheduling patch. Reviewers: zvi, RKSimon Differential Revision: https://reviews.llvm.org/D34935 llvm-svn: 307066
* [X86] Add comment string for broadcast loads from the constant pool.Craig Topper2017-07-046-639/+1459
| | | | | | | | | | | | | | | | | Summary: When broadcasting from the constant pool its useful to print out the final vector similar to what we do for normal moves from the constant pool. I changed only a couple tests that were broadcast focused. One of them had been previously hand tweaked after running the script so that it could check the constant pool declaration. But I think this patch makes that unnecessary now since we can check the comment instead. Reviewers: spatel, RKSimon, zvi Reviewed By: spatel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D34923 llvm-svn: 307062
* [legalize-types] Clean up softening machinery.Anton Yartsev2017-07-041-0/+55
| | | | | | | | The patch makes SoftenFloatResult/Operand logic just the same as all other legalization routines have: SoftenFloatResult() now fills the SoftenFloats map and SoftenFloatOperand() perform all needed replacements. This prevents softening mashinery from leaving stale entries in SoftenFloats map (that resulted in errors during the legalize type checking) and clarifies softening. The patch replaces https://reviews.llvm.org/D29265. Differential Revision: https://reviews.llvm.org/D31946 llvm-svn: 307053
* [X86][SSE4A] Add support for combining from EXTRQI/INSERTQI shufflesSimon Pilgrim2017-07-032-18/+9
| | | | llvm-svn: 307048
* [X86][SSE4A] Add SSE4A shuffle tests on pre-SSSE3 hardwareSimon Pilgrim2017-07-031-0/+71
| | | | llvm-svn: 307042
* [X86][SSE4A] Test SSE4A shuffle combining on SSE42 capable target as wellSimon Pilgrim2017-07-031-17/+36
| | | | llvm-svn: 307038
* DAGCombine: Combine BUILD_VECTOR to TRUNCATEZvi Rackover2017-07-033-557/+188
| | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Add a combine for creating a truncate to replace a build_vector composed of extracts with indices that form a stride-2^N series. Example: v8i32 V = ... v4i32 build_vector((extract_elt V, 0), (extract_elt V, 2), (extract_elt V, 4), (extract_elt V, 6)) --> v4i32 truncate (bitcast V to v4i64) Related discussion in llvm-dev about canonicalizing shuffles to truncates in LLVM IR: http://lists.llvm.org/pipermail/llvm-dev/2017-January/108936.html. Reviewers: spatel, RKSimon, efriedma, igorb, craig.topper, wolfgangp, delena Reviewed By: delena Subscribers: guyblank, delena, javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D34077 llvm-svn: 307036
* [x86] auto-generate complete checks for tests; NFCSanjay Patel2017-07-034-155/+162
| | | | | | | These all used 'CHECK-NOT' which isn't necessary if we have complete checks. There were also over-specifications in the RUN params such as CPU model. llvm-svn: 307033
* [x86] auto-generate complete checks for tests; NFCSanjay Patel2017-07-034-219/+539
| | | | | | | These all used 'CHECK-NOT' which isn't necessary if we have complete checks. There were also several over-specifications in the RUN params such as CPU model or OS requirement llvm-svn: 307028
* [X86][SSE4A] Add tests showing missed opportunities to combine ↵Simon Pilgrim2017-07-031-0/+80
| | | | | | EXTRQI/INSERTQI shuffles llvm-svn: 307027
* [x86] auto-generate complete checks for tests; NFCSanjay Patel2017-07-034-337/+337
| | | | | | These all used 'CHECK-NOT' which isn't necessary if we have complete checks. llvm-svn: 307024
* [GlobalISel][X86] fix %ptr(p0) = G_CONSTANT selection.Igor Breger2017-07-032-0/+40
| | | | llvm-svn: 307019
* [X86][AVX512] Test AVX512VPOPCNTDQ CTPOP with/without AVX512BWSimon Pilgrim2017-07-021-29/+57
| | | | llvm-svn: 306991
* [X86][AVX512VPOPCNTDQ] Improve support for v16i8/v8i16/v16i16/ CTPOPSimon Pilgrim2017-07-026-156/+111
| | | | | | Zero extend to v16i32/v8i64, use VPOPCNTDQ instructions and truncate back. llvm-svn: 306990
* [X86][AVX512] Cleanup tzcnt tests triples and attributesSimon Pilgrim2017-07-021-36/+36
| | | | | | Avoid use of specific -mcpu llvm-svn: 306989
* [X86][AVX512] Cleanup popcnt tests triples and attributesSimon Pilgrim2017-07-021-15/+15
| | | | | | Avoid use of specific -mcpu llvm-svn: 306988
* [x86] auto-generate complete checks for tests; NFCSanjay Patel2017-07-024-72/+126
| | | | | | These all used 'CHECK-NOT' which isn't necessary if we have complete checks. llvm-svn: 306984
* [x86] remove unnecessary RUN for test after auto-generating checks; NFCSanjay Patel2017-07-021-5/+21
| | | | llvm-svn: 306983
* [x86] update test to use FileCheck and auto-generate checks; NFCSanjay Patel2017-07-021-1/+50
| | | | llvm-svn: 306982
* [x86] auto-generate complete checks for tests; NFCSanjay Patel2017-07-024-32/+41
| | | | | | These all used 'CHECK-NOT' which isn't necessary if we have complete checks. llvm-svn: 306981
* [X86][SSE] Attempt to combine 64-bit and 32-bit shuffles to unary shuffles ↵Simon Pilgrim2017-07-022-2/+2
| | | | | | | | before bit shifts We are combining shuffles to bit shifts before unary permutes, which means we can't fold loads plus the destination register is destructive llvm-svn: 306978
* [X86][SSE] Attempt to combine 64-bit and 16-bit shuffles to unary shuffles ↵Simon Pilgrim2017-07-021-5/+2
| | | | | | | | | | before bit shifts We are combining shuffles to bit shifts before unary permutes, which means we can't fold loads plus the destination register is destructive The 32-bit shuffles are a bit tricky and will be dealt with in a later patch llvm-svn: 306977
* [X86][SSE] Add test showing missed opportunity to combine to pshuflwSimon Pilgrim2017-07-021-0/+18
| | | | | | We are combining shuffles to bit shifts before unary permutes, which means we can't fold loads plus the destination register is destructive llvm-svn: 306976
* [X86] Rerun "update_llc_test_checks" tool on CodeGen tests. NFC.Gadi Haber2017-07-023-0/+75
| | | | | | | | | | This is NFC after rerunning the "update_llc_test_checks.py" tool on the CodeGen X86 tests in order to submit a patch. Minor differences due to added "End of Function" lines. Reviewers: zvi Differential Revision: https://reviews.llvm.org/D34933 llvm-svn: 306973
* [GlobalISel][X86] Support G_GLOBAL_VALUE operation.Igor Breger2017-07-024-0/+220
| | | | | | | | | | | | | | | | | Summary: Support G_GLOBAL_VALUE operation. For now most of the PIC configurations not implemented yet. Reviewers: zvi, guyblank Reviewed By: guyblank Subscribers: rovka, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D34738 Conflicts: test/CodeGen/X86/GlobalISel/regbankselect-X86_64.mir llvm-svn: 306972
* [GlobalISel][X86] Support vector type G_UNMERGE_VALUES selection.Igor Breger2017-07-023-17/+283
| | | | | | | | | | | | | | | | Summary: Support vector type G_UNMERGE_VALUES selection. For now G_UNMERGE_VALUES marked as legal for any type, so nothing to do in legalizer. Reviewers: t.p.northover, qcolombet, zvi, guyblank Reviewed By: guyblank Subscribers: rovka, kristof.beyls, guyblank, llvm-commits Differential Revision: https://reviews.llvm.org/D33665 llvm-svn: 306971
* fix trivial typos; NFCHiroshi Inoue2017-07-021-1/+1
| | | | | | suport -> support llvm-svn: 306968
* [X86][RDSEED] Split off i64 intrinsic tests and test i16/i32 on 32-bit ↵Simon Pilgrim2017-07-012-29/+56
| | | | | | target as well. llvm-svn: 306961
* [X86][RDRAND] Split off i64 intrinsic tests and test i16/i32 on 32-bit ↵Simon Pilgrim2017-07-012-36/+102
| | | | | | target as well. llvm-svn: 306960
* [X86] Removed reference to update_test_checks.pySimon Pilgrim2017-07-011-1/+1
| | | | llvm-svn: 306959
* [X86][AVX] Remove duplicate autogeneration noteSimon Pilgrim2017-07-011-3/+2
| | | | llvm-svn: 306958
OpenPOWER on IntegriCloud