path: root/llvm/lib/Target/X86
Commit message | Author | Age | Files | Lines
* [X86][SSE41] Combine insertion of zero scalars into vector blends with zero | Simon Pilgrim | 2016-02-24 | 1 | -0/+14
| | | | | | | | | | Part 1 of 2 This patch attempts to replace the insertion of zero scalars with a vector blend with zero, avoiding the use of the integer insertion instructions (which are particularly slow on many targets). (Part 2 will add support for combining multiple blends-with-zero). Differential Revision: http://reviews.llvm.org/D17483 llvm-svn: 261743
* [CodeView] Describe variables live in x87 registers | David Majnemer | 2016-02-24 | 1 | -0/+5
| | | | | | | We didn't have a mapping from LLVM's x87 floating point registers to CodeView's encoding. llvm-svn: 261730
* [X86][SSE] Don't get target shuffle operands prematurely. | Simon Pilgrim | 2016-02-24 | 1 | -4/+7
| | | | | | PerformShuffleCombine should be usable by unary and binary target shuffles, but was attempting to get the first two operands whatever the instruction type. Since these are only used for VECTOR_SHUFFLE instructions for one particular combine I've moved them inside the relevant if statement. llvm-svn: 261727
* [LLVM][AVX512][PSHUFHW][PSHUFLW] Change imm8 to int | Michael Zuckerman | 2016-02-24 | 1 | -6/+6
| | | | | | Differential Revision: http://reviews.llvm.org/D17538 llvm-svn: 261725
* AVX512: Add vpmovzxbw/d/q, vpmovzxw/d/q, vpmovzxbdq lowering patterns that ↵ | Igor Breger | 2016-02-24 | 1 | -0/+14
| | | | | | | | support 256-bit inputs like AVX patterns (that are disabled in case HasVLX, see SS41I_pmovx_avx2_patterns). Differential Revision: http://reviews.llvm.org/D17504 llvm-svn: 261724
* X86: Wrap a helper for an assert in #ifndef NDEBUG | Justin Bogner | 2016-02-24 | 1 | -11/+7
| | | | | | | | | | | This function is used in exactly one place, and only in asserts builds. Move it a few lines up before the use and only define it when asserts are enabled. Fixes the release build under -Werror. Also remove the forward declaration and commentary that was basically identical to the code itself. llvm-svn: 261722
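The pattern this commit describes, a helper that exists only to feed an assert, can be sketched as follows. This is a minimal illustration, not the code from the commit: `isSortedAscending` and `smallest` are hypothetical names chosen for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

#ifndef NDEBUG
// Hypothetical helper: it is referenced only from an assert, so it is
// defined only when asserts are enabled. In a release build (NDEBUG set)
// it would otherwise be an unused static function, which fails under
// -Werror.
static bool isSortedAscending(const std::vector<int> &Values) {
  for (std::size_t I = 1; I < Values.size(); ++I)
    if (Values[I - 1] > Values[I])
      return false;
  return true;
}
#endif

// Hypothetical caller: the helper is defined a few lines above its single
// use, and that use disappears together with the helper when NDEBUG is set.
int smallest(const std::vector<int> &Sorted) {
  assert(!Sorted.empty() && "expected at least one element");
  assert(isSortedAscending(Sorted) && "expected sorted input");
  return Sorted.front();
}
```

Because `assert` expands to nothing under NDEBUG, the guarded helper and its only reference are compiled in or out together.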
* [X86ISelLowering] Stop typing the same return over and over and over. | Davide Italiano | 2016-02-23 | 1 | -11/+14
| | | | llvm-svn: 261666
* AVX512: Fix predicate of AVX pcmpeqw/b, pcmpgtb/w/d instructions. AVX512 ↵ | Igor Breger | 2016-02-23 | 2 | -6/+8
| | | | | | | | version of these instructions returns its result in a kmask register, so the AVX patterns should not be disabled. Differential Revision: http://reviews.llvm.org/D17517 llvm-svn: 261619
* CodeGen: TII: Take MachineInstr& in predicate API, NFC | Duncan P. N. Exon Smith | 2016-02-23 | 2 | -6/+6
| | | | | | | | | | | | | Change TargetInstrInfo API to take `MachineInstr&` instead of `MachineInstr*` in the functions related to predicated instructions (I'll try to come back later and get some of the rest). All of these functions require non-null parameters already, so references are more clear. As a bonus, this happens to factor away a host of implicit iterator => pointer conversions. No functionality change intended. llvm-svn: 261605
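The difference between the pointer and reference forms of such an API can be sketched with a stand-in type. `Instr`, `isPredicatedPtr`, and `isPredicatedRef` are hypothetical names for illustration and are not the actual TargetInstrInfo signatures.

```cpp
#include <cassert>

// Hypothetical stand-in for MachineInstr; only the field needed here.
struct Instr {
  bool Predicated = false;
};

// Pointer form: the signature allows null, so the non-null requirement
// has to be stated and checked separately.
bool isPredicatedPtr(const Instr *MI) {
  assert(MI && "instruction must not be null");
  return MI->Predicated;
}

// Reference form: the non-null requirement is part of the signature,
// and the caller passes the object itself rather than its address.
bool isPredicatedRef(const Instr &MI) {
  return MI.Predicated;
}
```

With the reference form, a call site that holds an iterator can pass `*It` directly instead of relying on an implicit iterator-to-pointer conversion.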
* [X86] Create mergeable constant pool entries for AVX | David Majnemer | 2016-02-22 | 1 | -0/+5
| | | | | | | We supported creating mergeable constant pool entries for smaller constants but not for 32-byte AVX constants. llvm-svn: 261584
* [X86ISelLowering] Consolidate duplicated code in a single place. | Davide Italiano | 2016-02-22 | 2 | -24/+16
| | | | llvm-svn: 261573
* Revert "CodeGen: MachineInstr::getIterator() => getInstrIterator(), NFC" | Duncan P. N. Exon Smith | 2016-02-22 | 1 | -2/+1
| | | | | | | | | | This reverts commit r261504, since it's not obvious the new name is better: http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20160222/334298.html I'll recommit if we get consensus that it's the right direction. llvm-svn: 261567
* AVX512F: Add assembler Intel syntax tests for knl, fix minor bugs. | Igor Breger | 2016-02-22 | 1 | -5/+5
| | | | | | Differential Revision: http://reviews.llvm.org/D17498 llvm-svn: 261521
* AVX512: Fix scalar mem operands. | Igor Breger | 2016-02-22 | 1 | -14/+17
| | | | | | Differential Revision: http://reviews.llvm.org/D17500 llvm-svn: 261520
* [X86] Minor formatting fix. NFC | Craig Topper | 2016-02-22 | 1 | -9/+9
| | | | llvm-svn: 261515
* Document assumption in X86FrameLowering::inlineStackProbe() | Duncan P. N. Exon Smith | 2016-02-22 | 1 | -1/+2
| | | | | | | | Resolve FIXME from r261504. Apparently bundled instructions are illegal here: http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20160215/334146.html llvm-svn: 261507
* CodeGen: MachineInstr::getIterator() => getInstrIterator(), NFC | Duncan P. N. Exon Smith | 2016-02-21 | 1 | -1/+3
| | | | | | | | | | | | | | | | | | | | | | | Delete MachineInstr::getIterator(), since the term "iterator" is overloaded when talking about MachineInstr. - Downcast to ilist_node in iplist::getNextNode() and getPrevNode() so that ilist_node::getIterator() is still available. - Add it back as MachineInstr::getInstrIterator(). This matches the naming in MachineBasicBlock. - Add MachineInstr::getBundleIterator(). This is explicitly called "bundle" (not matching MachineBasicBlock) to distinguish it clearly from ilist_node::getIterator(). - Update all calls. Some of these I switched to `auto` to remove boiler-plate, since the new name is clear about the type. There was one call I updated that looked fishy, but it wasn't clear what the right answer was. This was in X86FrameLowering::inlineStackProbe(), added in r252578 in lib/Target/X86/X86FrameLowering.cpp. I opted to leave the behaviour unchanged, but I'll reply to the original commit on the list in a moment. llvm-svn: 261504
* ADT: Remove == and != comparisons between ilist iterators and pointers | Duncan P. N. Exon Smith | 2016-02-21 | 1 | -1/+1
| | | | | | | | | | | | | | I missed == and != when I removed implicit conversions between iterators and pointers in r252380 since they were defined outside ilist_iterator. Since they depend on getNodePtrUnchecked(), they indirectly rely on UB. This commit removes all uses of these operators. (I'll delete the operators themselves in a separate commit so that it can be easily reverted if necessary.) There should be NFC here. llvm-svn: 261498
* [X86] Remove unused encoding types from disassembler. NFC | Craig Topper | 2016-02-21 | 3 | -22/+0
| | | | llvm-svn: 261494
* [X86][AVX] Add shuffle masking support for EltsFromConsecutiveLoads | Simon Pilgrim | 2016-02-21 | 1 | -3/+25
| | | | | | | | Add support for the case where we have a consecutive load (which must include the first + last elements) with a mixture of undef/zero elements. We load the vector and then apply a shuffle to clear the zero'd elements. Differential Revision: http://reviews.llvm.org/D17297 llvm-svn: 261490
* Fix LLVM's handling and detection of skylake and cannonlake CPUs | Sanjoy Das | 2016-02-21 | 1 | -3/+2
| | | | | | | | | | | | | | | | | Summary: - Rename `"skylake"` == SkylakeServerProc to `"skylake-avx512"` - Change `"skylake"` to denote SkylakeClientProc - Fix the detection of cpu family 6 and model 94 to be SkylakeClientProc instead of SkylakeServerProc - Remove the `"cnl"` for CannonLake Reviewers: craig.topper, delena Subscribers: zansari, echristo, qcolombet, RKSimon, spatel, DavidKreitzer, mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D17090 llvm-svn: 261482
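The detection rule stated in the summary (CPU family 6, model 94 is the Skylake client part) can be written as a small classifier. This is a sketch only: `X86Cpu` and `classifyCpu` are hypothetical names, and only the one family/model pair named in the commit message is encoded.

```cpp
#include <cassert>

// Hypothetical result type; the real subtarget code distinguishes far
// more processors than shown here.
enum class X86Cpu { Unknown, SkylakeClient };

// Family and Model are assumed to be the already-decoded CPUID values.
X86Cpu classifyCpu(unsigned Family, unsigned Model) {
  // Per the commit message: family 6, model 94 is SkylakeClientProc,
  // not SkylakeServerProc.
  if (Family == 6 && Model == 94)
    return X86Cpu::SkylakeClient;
  return X86Cpu::Unknown;
}
```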
* [X86] Use the correct alignment for COMDAT constant pool entries | David Majnemer | 2016-02-21 | 3 | -8/+23
| | | | | | | | | | | | | | | | | | | COFF doesn't have sections with mergeable contents. Instead, each constant pool entry ends up in a COMDAT section. The linker, when choosing between COMDAT sections, doesn't choose the max alignment of the two sections. You just get whatever alignment was on the section. If one constant needed a higher alignment in one object file than in another, then we will get into trouble if the linker chooses the lower-alignment one. Instead, let's promote the alignment of the constant pool entry to make sure we don't use an under-aligned constant with an instruction which assumed otherwise. This fixes PR26680. llvm-svn: 261462
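One way to picture the promotion described above: if every object file gives a constant of a given size the same (promoted) alignment, it no longer matters which COMDAT copy the linker keeps. The rule below, promoting to at least the constant's own size, is an assumption made for illustration; the commit message does not spell out the exact policy, and `promotedAlignment` is a hypothetical name.

```cpp
#include <algorithm>
#include <cassert>

// Assumed policy for illustration only: align a constant pool entry to at
// least its own size, so two object files that emit the same constant
// cannot disagree on a lower alignment.
unsigned promotedAlignment(unsigned ConstantSizeInBytes,
                           unsigned RequestedAlignment) {
  return std::max(ConstantSizeInBytes, RequestedAlignment);
}
```

Under this assumed rule, a 16-byte constant requested at 4-byte alignment in one file and 16-byte alignment in another ends up 16-byte aligned in both.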
* [X86][SSE] Fixed issue with commutation of 'faux unary' target shuffles ↵ | Simon Pilgrim | 2016-02-20 | 1 | -5/+4
| | | | | | (PR26667) Fixed a bug introduced by D16683 when a binary shuffle is simplified to a unary shuffle (with undef/zero sentinel mask indices) - if this resulted in only the second input being used, combineX86ShuffleChain failed to take this into account and still referenced the first input. llvm-svn: 261434
* [X86][SSE] Move all undef/zero cases before target shuffle combining. | Simon Pilgrim | 2016-02-20 | 1 | -20/+14
| | | | | | First small step towards fixing PR26667 - we need to ensure that combineX86ShuffleChain only gets called with a valid shuffle input node (a similar issue was found in D17041). llvm-svn: 261433
* [X86] Enable the LEA optimization pass by default. | Andrey Turetskiy | 2016-02-20 | 1 | -4/+5
| | | | | | Differential Revision: http://reviews.llvm.org/D16877 llvm-svn: 261429
* [X86] PR26575: Fix LEA optimization pass (Part 2). | Andrey Turetskiy | 2016-02-20 | 1 | -36/+78
| | | | | | | | | | Handle address displacement operands of a type other than Immediate or Global in LEAs and load/stores. Ref: https://llvm.org/bugs/show_bug.cgi?id=26575 Differential Revision: http://reviews.llvm.org/D17374 llvm-svn: 261428
* Move some code from doInitialization to runOnFunction | David Majnemer | 2016-02-20 | 1 | -3/+4
| | | | | | | This has no observable behavior change, it just makes the state insertion pass look a little more like normal passes. llvm-svn: 261420
* [X86] Add some missing reversed forms of XOP instructions. | Craig Topper | 2016-02-20 | 1 | -0/+29
| | | | llvm-svn: 261417
* [X86ISelLowering] Fix TLSADDR lowering when shrink-wrapping is enabled. | Davide Italiano | 2016-02-20 | 3 | -2/+39
| | | | | | | | TLSADDR nodes are lowered into actual calls inside MC. In order to prevent shrink-wrapping from pushing prologue/epilogue past them (which results in TLS variables being accessed before the stack frame is set up), we put markers, so that the stack gets adjusted properly. Thanks to Quentin Colombet for guidance/help on how to fix this problem! llvm-svn: 261387
* [X86ISelLowering] Provide a more informative assert message. | Davide Italiano | 2016-02-19 | 1 | -1/+1
| | | | | | I stumbled upon this while debugging a lowering bug. llvm-svn: 261371
* [X86ISelLowering] Merge two conditions inside a single if. | Davide Italiano | 2016-02-19 | 1 | -3/+1
| | | | llvm-svn: 261370
* Revert r253557 "Alternative to long nops for X86 CPUs, by Andrey Turetsky" | Hans Wennborg | 2016-02-19 | 1 | -32/+14
| | | | | | Turns out the new nop sequences aren't actually nops on x86_64 (PR26554). llvm-svn: 261365
* Fix incorrect selection of AVX512 sqrt when OptForSize is on | Dimitry Andric | 2016-02-19 | 1 | -1/+1
| | | | | | | | | | | | | | | | | | | | | | Summary: When optimizing for size, sqrt calls can be incorrectly selected as AVX512 VSQRT instructions. This is because X86InstrAVX512.td has a `Requires<[OptForSize]>` in its `avx512_sqrt_scalar` multiclass definition. Even if the target does not support AVX512, the class can apparently still be chosen, leading to an incorrect selection of `vsqrtss`. In PR26625, this led to an assertion: Reg >= X86::FP0 && Reg <= X86::FP6 && "Expected FP register!", because the `vsqrtss` instruction requires an XMM register, which is not available on i686 CPUs. Reviewers: grosbach, resistor, joker.eph Subscribers: spatel, emaste, llvm-commits Differential Revision: http://reviews.llvm.org/D17414 llvm-svn: 261360
* [X86] Remove unused entries from the disassembler type enum. | Craig Topper | 2016-02-19 | 3 | -6/+0
| | | | llvm-svn: 261311
* [x86] fix initialization of PredictableSelectIsExpensive | Sanjay Patel | 2016-02-18 | 1 | -3/+3
| | | | | | | | | | This is effectively NFC because Atom is the only in-order x86 subtarget currently, but the predicate would have become wrong if any other in-order CPU came along. See related discussion in: http://reviews.llvm.org/D16836 llvm-svn: 261275
* Remove uses of builtin comma operator. | Richard Trieu | 2016-02-18 | 1 | -1/+2
| | | | | | Cleanup for upcoming Clang warning -Wcomma. No functionality change intended. llvm-svn: 261270
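The kind of cleanup this commit refers to can be sketched as below: two assignments chained with the builtin comma operator are split into separate statements, with no change in behaviour. `Range` and both functions are hypothetical names for illustration, not the code touched by the commit.

```cpp
#include <cassert>

// Hypothetical two-field record used only for this illustration.
struct Range {
  int Begin = 0;
  int End = 0;
};

// Before: one expression statement joins two assignments with the
// builtin comma operator, the style a -Wcomma warning is aimed at.
inline void resetWithComma(Range &R, int Start) {
  R.Begin = Start, R.End = Start + 1;
}

// After: the same effect written as two separate statements.
inline void resetWithStatements(Range &R, int Start) {
  R.Begin = Start;
  R.End = Start + 1;
}
```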
* [WinEH] Hoist state stores from successors | David Majnemer | 2016-02-18 | 1 | -1/+54
| | | | | | | | | | If we know that all of our successors want to be in the exact same state, it makes sense to hoist the state transition into their common predecessor. Differential Revision: http://reviews.llvm.org/D17391 llvm-svn: 261262
* [X86ISelLowering] Use isPowerof2 instead of rewriting it. NFC. | Davide Italiano | 2016-02-18 | 1 | -1/+1
| | | | llvm-svn: 261255
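For reference, the check that tends to get rewritten by hand is the usual single-set-bit test. The standalone sketch below uses a hypothetical name, `isPowerOfTwo`; LLVM ships its own helpers for this (for example `isPowerOf2_32` and `isPowerOf2_64` in `llvm/Support/MathExtras.h`), and the commit message does not say which one was used.

```cpp
#include <cassert>
#include <cstdint>

// A value is a power of two when exactly one bit is set: clearing the
// lowest set bit with Value & (Value - 1) must leave zero, and zero
// itself is excluded.
constexpr bool isPowerOfTwo(std::uint64_t Value) {
  return Value != 0 && (Value & (Value - 1)) == 0;
}
```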
* Revert to extend i8/i16 return values on Darwin (PR26665) | Hans Wennborg | 2016-02-18 | 1 | -1/+6
| | | | | | | | | | | | | In r260133, LLVM was changed to no longer extend i8/i16 return values, as it's not required by the ABI. However, code was found in the wild that relies on the old behaviour on Darwin, so this commit reverts back to that old behaviour for Darwin. On other platforms, it's less likely that code would be depending on the old behaviour, as GCC and MSVC haven't been extending such return values. llvm-svn: 261235
* [X86][SSE] Improve PSHUFB shuffle mask decoding. | Simon Pilgrim | 2016-02-18 | 1 | -16/+36
| | | | | | In cases where the PSHUFB shuffle mask is shared it might not be bitcasted to a vXi8 byte vector. This patch adds support for decoding these wider shuffle masks from the ConstantPool. The test case in question makes use of this to recognise the shuffle mask is a unary UNPCKL pattern and simplifies accordingly. llvm-svn: 261201
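Decoding a PSHUFB mask that is stored as wider elements comes down to splitting each element into its bytes (little-endian) and then applying the per-byte PSHUFB rule: a control byte with bit 7 set produces zero, otherwise its low four bits select a byte within the same 16-byte lane. The sketch below assumes 32-bit mask elements; `decodeWidePshufbMask` and `ZeroSentinel` are hypothetical names, not LLVM's decoder or its sentinel values.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical marker for "this result byte is zeroed".
constexpr int ZeroSentinel = -1;

// Decode a PSHUFB control mask stored as 32-bit constants into one
// shuffle index per result byte.
std::vector<int> decodeWidePshufbMask(const std::vector<std::uint32_t> &Wide) {
  std::vector<int> Mask;
  for (std::uint32_t Element : Wide) {
    for (unsigned Byte = 0; Byte < 4; ++Byte) {
      // Little-endian: byte 0 is the least significant byte.
      auto Control = static_cast<std::uint8_t>(Element >> (8 * Byte));
      if (Control & 0x80) {
        Mask.push_back(ZeroSentinel);
        continue;
      }
      // The selected byte comes from the current 16-byte lane.
      std::size_t LaneBase = Mask.size() & ~static_cast<std::size_t>(15);
      Mask.push_back(static_cast<int>(LaneBase + (Control & 0x0F)));
    }
  }
  return Mask;
}
```

Once the mask is in per-byte form, a pattern matcher can compare it against known shuffles regardless of the element width the constant was stored with.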
* [AVX512][PRORQ][PRORD] Change imm8 to int | Michael Zuckerman | 2016-02-18 | 1 | -6/+6
| | | | | | Differential Revision: http://reviews.llvm.org/D17024 llvm-svn: 261198
* Remove superfluous semicolon. | Nico Weber | 2016-02-17 | 1 | -1/+1
| | | | llvm-svn: 261128
* [WinEH] Optimize WinEH state stores | David Majnemer | 2016-02-17 | 1 | -32/+175
| | | | | | | | | | | | | | | | | | | | | | | | | 32-bit x86 Windows targets use a linked list of nodes allocated on the stack, referenced via thread-local storage. The personality routine interprets one of the fields in the node as a 'state number' which indicates where the personality routine should transfer control. State transitions are possible only before call-sites which may throw exceptions. Our previous scheme had us update the state number before all call-sites which may throw. Instead, we can try to minimize the number of times we need to store by reasoning about the nearest store which dominates the current call-site. If the last store agrees with the current call-site, then we know that the state-update is redundant and can be elided. This is largely straightforward: an RPO walk of the blocks allows us to correctly forward propagate the information when the function is a DAG. Currently, loops are not handled optimally and may trigger superfluous state stores. Differential Revision: http://reviews.llvm.org/D16763 llvm-svn: 261122
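The forward propagation described above can be sketched on a toy control-flow graph. This is a simplified model under stated assumptions, not the pass itself: blocks are assumed to be given in reverse post-order for a loop-free function (so every predecessor has a smaller index), and `Block`, `UnknownState`, and `countStateStores` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical marker: predecessors disagree, so no single state is known.
// Real state numbers are assumed never to take this value.
constexpr int UnknownState = -2;

struct Block {
  std::vector<std::size_t> Preds; // indices of predecessor blocks
  std::vector<int> CallStates;    // state required by each call that may throw
};

// Count the state stores that remain after eliding redundant ones.
int countStateStores(const std::vector<Block> &Blocks, int EntryState) {
  std::vector<int> ExitState(Blocks.size(), UnknownState);
  int Stores = 0;
  for (std::size_t B = 0; B < Blocks.size(); ++B) {
    // State on entry: the state all predecessors agree on, if any.
    int Current = EntryState;
    if (!Blocks[B].Preds.empty()) {
      Current = ExitState[Blocks[B].Preds.front()];
      for (std::size_t P : Blocks[B].Preds)
        if (ExitState[P] != Current)
          Current = UnknownState;
    }
    // A store is needed only when the known state differs from the one
    // the call-site requires.
    for (int Needed : Blocks[B].CallStates) {
      if (Needed != Current) {
        ++Stores;
        Current = Needed;
      }
    }
    ExitState[B] = Current;
  }
  return Stores;
}
```

On a diamond where the entry block and one arm need state 0, the other arm needs state 1, and the join needs state 0, this model keeps three stores instead of one per call-site: the arm that already agrees with its predecessor stores nothing, while the join must store because its predecessors disagree.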
* AVX512: Fix LowerMSCATTER() return value. | Igor Breger | 2016-02-17 | 1 | -1/+1
| | | | | | | | | | | Bug description: The bug was discovered when a test was compiled with -O0. When the scatter result is the DAG root, VectorLegalizer failed (assert) because LowerMSCATTER() returned the kmask as its result. Change LowerMSCATTER() to return the chain, as the original node does. Differential Revision: http://reviews.llvm.org/D17331 llvm-svn: 261090
* [X86][AVX] Support bit-blend integer shuffles for 256-bit integer vectors | Simon Pilgrim | 2016-02-17 | 1 | -1/+3
| | | | | | | | | | | | AVX1 doesn't support the shuffling of 256-bit integer vectors. For 32/64-bit elements we get around this by shuffling as float/double but for 8/16-bit elements (assuming they can't widen) we currently just split, shuffle as 128-bit vectors and concatenate the results back. This patch adds the ability to lower using the bit-blend patterns before defaulting to the splitting behaviour. Part 2 of 2 Differential Revision: http://reviews.llvm.org/D17292 llvm-svn: 261082
* [X86][AVX] Support bit-mask integer shuffles for 256-bit integer vectors | Simon Pilgrim | 2016-02-17 | 1 | -2/+6
| | | | | | | | | | | | AVX1 doesn't support the shuffling of 256-bit integer vectors. For 32/64-bit elements we get around this by shuffling as float/double but for 8/16-bit elements (assuming they can't widen) we currently just split, shuffle as 128-bit vectors and concatenate the results back. This patch adds the ability to lower using the bit-mask patterns before defaulting to the splitting behaviour. In some cases this ends up matching what AVX2 would do anyhow or what AVX1 does on the split vectors. Part 1 of 2 Differential Revision: http://reviews.llvm.org/D17292 llvm-svn: 261081
* [X86][SSE] Tidyup BUILD_VECTOR operand collection. NFCI. | Simon Pilgrim | 2016-02-17 | 1 | -23/+20
| | | | | | | | Avoid reuse of operand variables, keep them local to a particular lowering - the operand collection is unique to each case anyhow. Renamed from V to Ops to more closely match their purpose. llvm-svn: 261078
* Revert r260979 "[X86] Enable the LEA optimization pass by default." | Hans Wennborg | 2016-02-17 | 1 | -5/+4
| | | | | | Asserts are still firing in Chromium builds. PR26575. llvm-svn: 261058
* [X86] Fix a shrink-wrapping miscompile around __chkstk | Reid Kleckner | 2016-02-17 | 1 | -7/+6
| | | | | | | | | __chkstk clobbers EAX. If EAX is live across the prologue, then we have to take extra steps to save it. We already had code to do this if EAX was a register parameter. This change adapts it to work when shrink wrapping is used. llvm-svn: 261039
* [X86] Remove the now-unused X86ISD::PSIGN. NFC. | Ahmed Bougacha | 2016-02-16 | 6 | -46/+30
| | | | llvm-svn: 261025