summaryrefslogtreecommitdiffstats
path: root/llvm/test/CodeGen/X86
Commit message (Collapse)AuthorAgeFilesLines
* add skylakeClement Courbet2017-04-211-2/+3
| | | | llvm-svn: 300962
* add 32 bit testsClement Courbet2017-04-211-8/+10
| | | | llvm-svn: 300961
* use repmovsb when optimizing forminsizeClement Courbet2017-04-211-0/+26
| | | | llvm-svn: 300960
* Rename FastString flag.Clement Courbet2017-04-211-2/+2
| | | | llvm-svn: 300959
* add more testsClement Courbet2017-04-211-0/+4
| | | | llvm-svn: 300958
* X86 memcpy: use REPMOVSB instead of REPMOVS{Q,D,W} for inline copiesClement Courbet2017-04-211-0/+15
| | | | | | | | | | | | when the subtarget has fast strings. This has two advantages: - Speed is improved. For example, on Haswell thoughput improvements increase linearly with size from 256 to 512 bytes, after which they plateau: (e.g. 1% for 260 bytes, 25% for 400 bytes, 40% for 508 bytes). - Code is much smaller (no need to handle boundaries). llvm-svn: 300957
* Temporarily revert r299221 to fix nondeterminism in ThinLTO builder.Galina Kistanova2017-04-191-11/+17
| | | | llvm-svn: 300783
* X86FrameLowering: Fix getFrameIndexReference() for 'fixed' objectsMatthias Braun2017-04-191-0/+75
| | | | | | | | | | | Debug information is calculated with getFrameIndexReference() which was missing some logic for the fixed object cases (= parameters on the stack). rdar://24557797 Differential Revision: https://reviews.llvm.org/D32204 llvm-svn: 300781
* [DAG] add splat vector support for 'or' in SimplifyDemandedBitsSanjay Patel2017-04-192-19/+15
| | | | | | | | | | | I've changed one of the tests to not fold away, but we didn't and still don't do the transform that the comment claims we do (and I don't know why we'd want to do that). Follow-up to: https://reviews.llvm.org/rL300725 https://reviews.llvm.org/rL300763 llvm-svn: 300772
* [DAG] add splat vector support for 'xor' in SimplifyDemandedBitsSanjay Patel2017-04-193-37/+32
| | | | | | | | | This allows forming more 'not' ops, so we get improvements for ISAs that have and-not. Follow-up to: https://reviews.llvm.org/rL300725 llvm-svn: 300763
* Update the madd.ll test with utils/update_llc_test_checks.py (NFC)Dehao Chen2017-04-191-48/+264
| | | | llvm-svn: 300740
* PR32710: Disable using PMADDWD for unsigned short.Dehao Chen2017-04-191-5/+55
| | | | | | | | | | | | | | Summary: PMADDWD can only handle signed short. Reviewers: mkuper, wmi Reviewed By: mkuper Subscribers: andreadb, llvm-commits Differential Revision: https://reviews.llvm.org/D32236 llvm-svn: 300737
* [DAG] add splat vector support for 'and' in SimplifyDemandedBitsSanjay Patel2017-04-193-29/+7
| | | | | | | | | | | | | | | | | | | | | The patch itself is simple: stop discriminating against vectors in visitAnd() and again in SimplifyDemandedBits(). Some notes for reference: 1. We're not consistent about calls to SimplifyDemandedBits in the various visitXXX functions. Sometimes, we check if the RHS is a constant first. Other times (like here), we just dive in. 2. I'd like to break the vector shackles in steps for the sake of risk minimization, but we could make similar simultaneous changes in other places if we think that would be better. 3. I don't know what the intent of the changed tests in this patch was supposed to be, but since they wiggled in a positive way, I'm just going with that. :) 4. In the rotate tests, note that we can see through non-splat constants. This is a result of D24253. 5. My motivation for being here now is to make D31944 look better, so this is step 1 of N towards improving the vector codegen in that patch without writing any actual new code. Differential Revision: https://reviews.llvm.org/D32230 llvm-svn: 300725
* [GlobalIsel][X86] support G_TRUNC selection.Igor Breger2017-04-194-0/+299
| | | | | | | | | | | | | | | | Summary: [GlobalIsel][X86] support G_TRUNC selection. Add regbank-select and legalizer tests. Currently legalization of trunc i64 on 32bit platform not supported. Reviewers: ab, zvi, rovka Reviewed By: zvi Subscribers: dberris, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D32115 llvm-svn: 300678
* [X86] Add D32039/PR31357 tests to show current BSWAP codegenSimon Pilgrim2017-04-192-0/+255
| | | | llvm-svn: 300672
* [X86][SSE] Add scheduling latency/throughput tests for (most) SSE2 instructionsSimon Pilgrim2017-04-191-0/+6039
| | | | llvm-svn: 300671
* [GlobalISel][X86] Split select tests. NFC.Igor Breger2017-04-197-444/+455
| | | | llvm-svn: 300666
* [x86] add tests for potential andn optimization; NFCSanjay Patel2017-04-181-2/+40
| | | | llvm-svn: 300617
* [X86] Keep EXTRACT_VECTOR_ELT result type as f128 for Android x86_64.Chih-Hung Hsieh2017-04-182-0/+59
| | | | | | | | | | Android x86_64 target uses f128 type and stores f128 values in %xmm* registers. SoftenFloatRes_EXTRACT_VECTOR_ELT should not convert result value from f128 to i128. Differential Revision: http://reviews.llvm.org/D32102 llvm-svn: 300583
* [X86][SSE] Add scheduling latency/throughput tests for (most) SSE1 instructionsSimon Pilgrim2017-04-181-0/+2415
| | | | llvm-svn: 300576
* [DAG] Improve store merge candidate pruning.Nirav Dave2017-04-181-9/+3
| | | | | | | | | | | | | | Remove non-consecutive stores from store merge candidate search as they cannot be merged and will prevent us from finding subsequent mergeable store cases. Reviewers: jyknight, bogner, javed.absar, spatel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D32086 llvm-svn: 300561
* Add base-index-based store merge testNirav Dave2017-04-181-0/+31
| | | | llvm-svn: 300559
* Add store Merge test.Nirav Dave2017-04-181-0/+25
| | | | llvm-svn: 300551
* Change the testcase tail-merge-after-mbp.ll to tail-merge-after-mbp.mirHaicheng Wu2017-04-172-94/+105
| | | | | | Differential Revision: https://reviews.llvm.org/D32037 llvm-svn: 300506
* [X86] Remove special handling for 16 bit for A asm constraints.Benjamin Kramer2017-04-161-1/+8
| | | | | | | | | | Our 16 bit support is assembler-only + the terrible hack that is .code16gcc. Simply using 32 bit registers does the right thing for the latter. Fixes PR32681. llvm-svn: 300429
* Use correct registers for "A" inline asm constraintDimitry Andric2017-04-151-0/+35
| | | | | | | | | | | | | | | | | | | | | | | | | Summary: In PR32594, inline assembly using the 'A' constraint on x86_64 causes llvm to crash with a "Cannot select" stack trace. This is because `X86TargetLowering::getRegForInlineAsmConstraint` hardcodes that 'A' means the EAX and EDX registers. However, on x86_64 it means the RAX and RDX registers, and on 16-bit x86 (ia16?) it means the old AX and DX registers. Add new register classes in `X86RegisterInfo.td` to support these cases, and amend the logic in `getRegForInlineAsmConstraint` to cope with different subtargets. Also add a test case, derived from PR32594. Reviewers: craig.topper, qcolombet, RKSimon, ab Reviewed By: ab Subscribers: ab, emaste, royger, llvm-commits Differential Revision: https://reviews.llvm.org/D31902 llvm-svn: 300404
* [X86][SSE] Update MOVNTDQA non-temporal loads to generic implementation (LLVM)Simon Pilgrim2017-04-144-33/+51
| | | | | | | | | | MOVNTDQA non-temporal aligned vector loads can be correctly represented using generic builtin loads, allowing us to remove the existing x86 intrinsics. Clang companion patch: D31766. Differential Revision: https://reviews.llvm.org/D31767 llvm-svn: 300325
* Fix for PR#30562: Selection DAG error: Detected cycle in SelectionDAG.Andrew V. Tischenko2017-04-141-0/+22
| | | | | | Patch by Dinar Temirbulatov llvm-svn: 300314
* This patch closes PR#32216: Better testing of schedule model instruction ↵Andrew V. Tischenko2017-04-142-595/+631
| | | | | | | | latencies/throughputs. The details are here: https://reviews.llvm.org/D30941 llvm-svn: 300311
* [GlobalIsel][X86] support G_CONSTANT selection.Igor Breger2017-04-123-0/+224
| | | | | | | | | | | | | | Summary: [GlobalISel][X86] support G_CONSTANT selection. Add regbank select tests. Reviewers: zvi, guyblank Reviewed By: guyblank Subscribers: llvm-commits, dberris, rovka, kristof.beyls Differential Revision: https://reviews.llvm.org/D31974 llvm-svn: 300057
* [DAGCombine] Add more test cases for shuffle of splat. NFC.Zvi Rackover2017-04-111-0/+56
| | | | | | Tests added contain splat-masks with undef elements. llvm-svn: 299988
* [x86] Relax the check in areLoadsFromSameBasePtrEaswaran Raman2017-04-111-4/+4
| | | | | | | | | Check if the scale operand is identical (doesn't have to be 1) and do not check the chaain operand. Differential revision: https://reviews.llvm.org/D31833 llvm-svn: 299986
* [X86] Create the correct ADC/SBB SDNode when lowering add.Davide Italiano2017-04-111-0/+27
| | | | | | Differential Revision: https://reviews.llvm.org/D31911 llvm-svn: 299973
* [ARM, x86] add tests to show possible improvement for bool math; NFCSanjay Patel2017-04-101-0/+32
| | | | llvm-svn: 299897
* CodeGen: BlockPlacement: Don't always tail-duplicate with no other successor.Kyle Butt2017-04-102-2/+55
| | | | | | | | | | The math works out where it can actually be counter-productive. The probability calculations correctly handle the case where the alternative is 0 probability, rely on those calculations. Includes a test case that demonstrates the problem. llvm-svn: 299892
* CodeGen: BlockPlacement: Minor probability changes.Kyle Butt2017-04-101-0/+41
| | | | | | | Qin may be large, and Succ may be more frequent than BB. Take these both into account when deciding if tail-duplication is profitable. llvm-svn: 299891
* CodeGen: BranchFolding: Merge identical blocks, even if they are short.Kyle Butt2017-04-101-0/+41
| | | | | | | | Merging identical blocks when it doesn't reduce fallthrough. It is common for the blocks created from critical edge splitting to be identical. We would like to merge these blocks whenever doing so would not reduce fallthrough. llvm-svn: 299890
* Add address space mangling to lifetime intrinsicsMatt Arsenault2017-04-1020-188/+188
| | | | | | In preparation for allowing allocas to have non-0 addrspace. llvm-svn: 299876
* [X86][MMX] Add fast-isel support for MMX non-temporal writesSimon Pilgrim2017-04-101-1/+1
| | | | | | Differential Revision: https://reviews.llvm.org/D31754 llvm-svn: 299852
* Use PMADDWD to expand reduction in a loopDehao Chen2017-04-071-0/+103
| | | | | | | | | | | | | | | | | | Summary: PMADDWD can help improve 8/16 bit integer mutliply-add operation performance for cases like: for (int i = 0; i < count; i++) a += x[i] * y[i]; Reviewers: wmi, davidxl, hfinkel, RKSimon, zvi, mkuper Reviewed By: mkuper Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D31679 llvm-svn: 299776
* [GlobalISel] implement narrowing for G_CONSTANT.Igor Breger2017-04-071-0/+43
| | | | | | | | | | | | | | Summary: [GlobalISel] implement narrowing for G_CONSTANT. Reviewers: bogner, zvi, t.p.northover Reviewed By: t.p.northover Subscribers: llvm-commits, dberris, rovka, kristof.beyls Differential Revision: https://reviews.llvm.org/D31744 llvm-svn: 299772
* Turn on -addr-sink-using-gep by default.Eli Friedman2017-04-066-34/+12
| | | | | | | | | The new codepath has been in the tree for years, and there isn't any reason to use two codepaths here. Differential Revision: https://reviews.llvm.org/D30596 llvm-svn: 299723
* [X86] Revert r299387 due to AVX legalization infinite loop.Michael Kuperstein2017-04-0611-87/+102
| | | | llvm-svn: 299720
* [X86][MMX] Test showing failure to create MMX non-temporal storeSimon Pilgrim2017-04-061-7/+26
| | | | llvm-svn: 299640
* [DAGCombiner] add and use TLI hook to convert and-of-seteq / or-of-setne to ↵Sanjay Patel2017-04-052-11/+9
| | | | | | | | | | | | | | | | | bitwise logic+setcc (PR32401) This is a generic combine enabled via target hook to reduce icmp logic as discussed in: https://bugs.llvm.org/show_bug.cgi?id=32401 It's likely that other targets will want to enable this hook for scalar transforms, and there are probably other patterns that can use bitwise logic to reduce comparisons. Note that we are missing an IR canonicalization for these patterns, and we will probably prefer the pair-of-compares form in IR (shorter, more likely to fold). Differential Revision: https://reviews.llvm.org/D31483 llvm-svn: 299542
* [X86] Relax assert in broadcast-of-subvector lowering.Ahmed Bougacha2017-04-052-0/+26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Before r294774, there was a problem when lowering broadcasts to use 128-bit subvectors. When we looked through a bitcast to find the broadcast input, we'd keep using the original type, so you'd end up with things like: (v8f32 (broadcast (v4f32 (extract_subvector (v8i32 V), ...)) )) r294774 fixed it to always emit subvectors with the scalar type of the original source. It also introduced some asserts, to check that we use scalars with the same size, and vectors with the same number of elements. The scalar size equality is checked earlier when looking through bitcasts, and is a useful assert. However, the number of elements don't have to be identical: we're always going to extract a 128-bit subvector, and we can have different size inputs if we looked through a concat_vector to find a 256-bit source. Relax the overzealous assert. Replace it with a check of the original source vector being 256 or 512 bits. If it's 128 bits, we can't extract_subvector from it. Fixes PR32371. llvm-svn: 299490
* Change section flag character for SHF_LINK_ORDER to "o".Evgeniy Stepanov2017-04-041-8/+8
| | | | | | | | GAS uses "m" as a compatibility alias for "M" (SHF_MERGE). "o" is free, except on ia64, where it already means SHF_LINK_ORDER. llvm-svn: 299479
* [X86][LLVM] Converting __mm{|256|512}_movm_epi{8|16|32|64} LLVMIR call into ↵Michael Zuckerman2017-04-048-158/+157
| | | | | | | | | | | generic intrinsics. This patch is a part one of two reviews, one for the clang and the other for LLVM. The patch deletes the back-end intrinsics and adds support for them in the auto upgrade. Differential Revision: https://reviews.llvm.org/D31393 llvm-svn: 299432
* [X86] Add 64 bit pattern matching for PSADBWOren Ben Simhon2017-04-041-0/+347
| | | | | | | | | PSADBW pattern currently supports the 32 bit IR pattern and only GLT (greather than) comparison. The patch extends the pattern to catch also 64 bit IR pattern and includes all other comparison types (not only GLT). Differential Revision: https://reviews.llvm.org/D31577 llvm-svn: 299425
* add/move codegen tests for and/or of setcc; NFCSanjay Patel2017-04-031-2/+32
| | | | llvm-svn: 299396
OpenPOWER on IntegriCloud