summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Target/X86/X86ISelLowering.cpp
Commit message (Collapse)AuthorAgeFilesLines
...
* Move variable local to where ita used. NFCI.Simon Pilgrim2017-04-281-1/+1
| | | | llvm-svn: 301646
* [SelectionDAG] Use KnownBits struct in DAG's computeKnownBits and ↵Craig Topper2017-04-281-27/+26
| | | | | | | | | | | | simplifyDemandedBits This patch replaces the separate APInts for KnownZero/KnownOne with a single KnownBits struct. This is similar to what was done to ValueTracking's version recently. This is largely a mechanical transformation from KnownZero to Known.Zero. Differential Revision: https://reviews.llvm.org/D32569 llvm-svn: 301620
* [SelectionDAG] Use various APInt methods to reduce temporary APInt creationCraig Topper2017-04-281-1/+1
| | | | | | This patch uses various APInt methods to reduce the number of temporary APInts. These were all found while working through converting SelectionDAG's computeKnownBits to also use the KnownBits struct recently added to the ValueTracking version. llvm-svn: 301618
* [APInt] Use inplace shift methods where possible. NFCICraig Topper2017-04-281-6/+5
| | | | llvm-svn: 301612
* [SelectionDAG] Added getBuildVector(ArrayRef<SDUse>) helper.Simon Pilgrim2017-04-251-4/+4
| | | | llvm-svn: 301322
* Move value type list from TargetRegisterClass to TargetRegisterInfoKrzysztof Parzyszek2017-04-241-5/+7
| | | | | | Differential Revision: https://reviews.llvm.org/D31937 llvm-svn: 301234
* Revert r301231: Accidentally committed stale filesKrzysztof Parzyszek2017-04-241-7/+5
| | | | | | I forgot to commit local changes before commit. llvm-svn: 301232
* Move value type list from TargetRegisterClass to TargetRegisterInfoKrzysztof Parzyszek2017-04-241-5/+7
| | | | | | Differential Revision: https://reviews.llvm.org/D31937 llvm-svn: 301231
* Revert "[APInt] Fix a few places that use APInt::getRawData to operate ↵Renato Golin2017-04-231-5/+6
| | | | | | | | | | | | | | | | within the normal API." This reverts commit r301105, 4, 3 and 1, as a follow up of the previous revert, which broke even more bots. For reference: Revert "[APInt] Use operator<<= where possible. NFC" Revert "[APInt] Use operator<<= instead of shl where possible. NFC" Revert "[APInt] Use ashInPlace where possible." PR32754. llvm-svn: 301111
* [APInt] Use operator<<= where possible. NFCCraig Topper2017-04-231-2/+2
| | | | llvm-svn: 301104
* [APInt] Use operator<<= instead of shl where possible. NFCCraig Topper2017-04-231-2/+1
| | | | llvm-svn: 301103
* [APInt] Use ashInPlace where possible.Craig Topper2017-04-231-2/+2
| | | | llvm-svn: 301101
* [AArch64] Improve code generation for logical instructions takingAkira Hatanaka2017-04-211-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | immediate operands. This commit adds an AArch64 dag-combine that optimizes code generation for logical instructions taking immediate operands. The optimization uses demanded bits to change a logical instruction's immediate operand so that the immediate can be folded into the immediate field of the instruction. This recommits r300932 and r300930, which was causing dag-combine to loop forever. The problem was that optimizeLogicalImm was returning true even when there was no change to the immediate node (which happened when the immediate was all zeros or ones), which caused dag-combine to push and pop the same node to the work list over and over again without making any progress. This commit fixes the bug by returning false early in optimizeLogicalImm if the immediate is all zeros or ones. Also, it changes the code to compare the immediate with 0 or Mask rather than calling countPopulation. rdar://problem/18231627 Differential Revision: https://reviews.llvm.org/D5591 llvm-svn: 301019
* Revert r300932 and r300930.Akira Hatanaka2017-04-211-2/+2
| | | | | | | | | It seems that r300930 was creating an infinite loop in dag-combine when compling the following file: MultiSource/Benchmarks/MiBench/consumer-typeset/z21.c llvm-svn: 300940
* [AArch64] Improve code generation for logical instructions takingAkira Hatanaka2017-04-211-2/+2
| | | | | | | | | | | | | | | | | | | | immediate operands. This commit adds an AArch64 dag-combine that optimizes code generation for logical instructions taking immediate operands. The optimization uses demanded bits to change a logical instruction's immediate operand so that the immediate can be folded into the immediate field of the instruction. This recommits r300913, which broke bots because I didn't fix a call to ShrinkDemandedConstant in SIISelLowering.cpp after changing the APIs of TargetLoweringOpt and TargetLowering. rdar://problem/18231627 Differential Revision: https://reviews.llvm.org/D5591 llvm-svn: 300930
* Revert "[AArch64] Improve code generation for logical instructions taking"Akira Hatanaka2017-04-201-2/+2
| | | | | | | | This reverts r300913. This broke bots. llvm-svn: 300916
* [AArch64] Improve code generation for logical instructions takingAkira Hatanaka2017-04-201-2/+2
| | | | | | | | | | | | | | | | immediate operands. This commit adds an AArch64 dag-combine that optimizes code generation for logical instructions taking immediate operands. The optimization uses demanded bits to change a logical instruction's immediate operand so that the immediate can be folded into the immediate field of the instruction. rdar://problem/18231627 Differential Revision: https://reviews.llvm.org/D5591 llvm-svn: 300913
* [APInt] Rename getSignBit to getSignMaskCraig Topper2017-04-201-16/+16
| | | | | | | | getSignBit is a static function that creates an APInt with only the sign bit set. getSignMask seems like a better name to convey its functionality. In fact several places use it and then store in an APInt named SignMask. Differential Revision: https://reviews.llvm.org/D32108 llvm-svn: 300856
* PR32710: Disable using PMADDWD for unsigned short.Dehao Chen2017-04-191-1/+1
| | | | | | | | | | | | | | Summary: PMADDWD can only handle signed short. Reviewers: mkuper, wmi Reviewed By: mkuper Subscribers: andreadb, llvm-commits Differential Revision: https://reviews.llvm.org/D32236 llvm-svn: 300737
* Add a getPointerOperandType() helper to LoadInst and StoreInst; NFCSanjoy Das2017-04-181-1/+1
| | | | | | I will use this in a later change. llvm-svn: 300613
* DAG: Make mayBeEmittedAsTailCall parameter constMatt Arsenault2017-04-181-2/+2
| | | | llvm-svn: 300603
* [X86] Use for-range loop. NFCI.Simon Pilgrim2017-04-181-2/+2
| | | | llvm-svn: 300567
* [APInt] Use lshrInPlace to replace lshr where possibleCraig Topper2017-04-181-5/+5
| | | | | | | | | | This patch uses lshrInPlace to replace code where the object that lshr is called on is being overwritten with the result. This adds an lshrInPlace(const APInt &) version as well. Differential Revision: https://reviews.llvm.org/D32155 llvm-svn: 300566
* [X86] Remove special handling for 16 bit for A asm constraints.Benjamin Kramer2017-04-161-6/+3
| | | | | | | | | | Our 16 bit support is assembler-only + the terrible hack that is .code16gcc. Simply using 32 bit registers does the right thing for the latter. Fixes PR32681. llvm-svn: 300429
* Use correct registers for "A" inline asm constraintDimitry Andric2017-04-151-3/+13
| | | | | | | | | | | | | | | | | | | | | | | | | Summary: In PR32594, inline assembly using the 'A' constraint on x86_64 causes llvm to crash with a "Cannot select" stack trace. This is because `X86TargetLowering::getRegForInlineAsmConstraint` hardcodes that 'A' means the EAX and EDX registers. However, on x86_64 it means the RAX and RDX registers, and on 16-bit x86 (ia16?) it means the old AX and DX registers. Add new register classes in `X86RegisterInfo.td` to support these cases, and amend the logic in `getRegForInlineAsmConstraint` to cope with different subtargets. Also add a test case, derived from PR32594. Reviewers: craig.topper, qcolombet, RKSimon, ab Reviewed By: ab Subscribers: ab, emaste, royger, llvm-commits Differential Revision: https://reviews.llvm.org/D31902 llvm-svn: 300404
* [X86] Create the correct ADC/SBB SDNode when lowering add.Davide Italiano2017-04-111-2/+4
| | | | | | Differential Revision: https://reviews.llvm.org/D31911 llvm-svn: 299973
* Module::getOrInsertFunction is using C-style vararg instead of variadic ↵Serge Guelton2017-04-111-1/+1
| | | | | | | | | | | templates. From a user prospective, it forces the use of an annoying nullptr to mark the end of the vararg, and there's not type checking on the arguments. The variadic template is an obvious solution to both issues. Differential Revision: https://reviews.llvm.org/D31070 llvm-svn: 299949
* Revert "Turn some C-style vararg into variadic templates"Diana Picus2017-04-111-1/+1
| | | | | | | This reverts commit r299925 because it broke the buildbots. See e.g. http://lab.llvm.org:8011/builders/clang-cmake-armv7-a15/builds/6008 llvm-svn: 299928
* Turn some C-style vararg into variadic templatesSerge Guelton2017-04-111-1/+1
| | | | | | | | | | | | Module::getOrInsertFunction is using C-style vararg instead of variadic templates. From a user prospective, it forces the use of an annoying nullptr to mark the end of the vararg, and there's not type checking on the arguments. The variadic template is an obvious solution to both issues. llvm-svn: 299925
* Use PMADDWD to expand reduction in a loopDehao Chen2017-04-071-0/+47
| | | | | | | | | | | | | | | | | | Summary: PMADDWD can help improve 8/16 bit integer mutliply-add operation performance for cases like: for (int i = 0; i < count; i++) a += x[i] * y[i]; Reviewers: wmi, davidxl, hfinkel, RKSimon, zvi, mkuper Reviewed By: mkuper Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D31679 llvm-svn: 299776
* [X86] Revert r299387 due to AVX legalization infinite loop.Michael Kuperstein2017-04-061-55/+1
| | | | llvm-svn: 299720
* Revert "Turn some C-style vararg into variadic templates"Mehdi Amini2017-04-061-3/+4
| | | | | | This reverts commit r299699, the examples needs to be updated. llvm-svn: 299702
* Turn some C-style vararg into variadic templatesMehdi Amini2017-04-061-4/+3
| | | | | | | | | | | | | | | | Module::getOrInsertFunction is using C-style vararg instead of variadic templates. From a user prospective, it forces the use of an annoying nullptr to mark the end of the vararg, and there's not type checking on the arguments. The variadic template is an obvious solution to both issues. Patch by: Serge Guelton <serge.guelton@telecom-bretagne.eu> Differential Revision: https://reviews.llvm.org/D31070 llvm-svn: 299699
* [X86][SSE] Renamed combine to make it clear that it only handles the vector ↵Simon Pilgrim2017-04-051-4/+5
| | | | | | shift by immediate opcodes. NFCI llvm-svn: 299532
* [X86] Relax assert in broadcast-of-subvector lowering.Ahmed Bougacha2017-04-051-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Before r294774, there was a problem when lowering broadcasts to use 128-bit subvectors. When we looked through a bitcast to find the broadcast input, we'd keep using the original type, so you'd end up with things like: (v8f32 (broadcast (v4f32 (extract_subvector (v8i32 V), ...)) )) r294774 fixed it to always emit subvectors with the scalar type of the original source. It also introduced some asserts, to check that we use scalars with the same size, and vectors with the same number of elements. The scalar size equality is checked earlier when looking through bitcasts, and is a useful assert. However, the number of elements don't have to be identical: we're always going to extract a 128-bit subvector, and we can have different size inputs if we looked through a concat_vector to find a 256-bit source. Relax the overzealous assert. Replace it with a check of the original source vector being 256 or 512 bits. If it's 128 bits, we can't extract_subvector from it. Fixes PR32371. llvm-svn: 299490
* [x86] remove dead select-of-constants transform; NFCISanjay Patel2017-04-041-12/+0
| | | | | | | | https://reviews.llvm.org/D30537 / https://reviews.llvm.org/rL296977 added these transforms and other related transforms to the generic DAGCombiner (with a hook that x86 sets to true), so these patterns should not exist by the time we reach the target-specific combiner hook. llvm-svn: 299448
* Strip trailing whitespaceSimon Pilgrim2017-04-041-4/+4
| | | | llvm-svn: 299438
* [X86] Add 64 bit pattern matching for PSADBWOren Ben Simhon2017-04-041-13/+41
| | | | | | | | | PSADBW pattern currently supports the 32 bit IR pattern and only GLT (greather than) comparison. The patch extends the pattern to catch also 64 bit IR pattern and includes all other comparison types (not only GLT). Differential Revision: https://reviews.llvm.org/D31577 llvm-svn: 299425
* [X86][SSE]] Lower BUILD_VECTOR with repeated elts as BUILD_VECTOR + ↵Simon Pilgrim2017-04-031-1/+55
| | | | | | | | | | | | | | | | VECTOR_SHUFFLE It can be costly to transfer from the gprs to the xmm registers and can prevent loads merging. This patch splits vXi16/vXi32/vXi64 BUILD_VECTORS that use the same operand in multiple elements into a BUILD_VECTOR with only a single insertion of each of those elements and then performs an unary shuffle to duplicate the values. There are a couple of minor regressions this patch unearths due to some missing MOVDDUP/BROADCAST folds that I will address in a future patch. Note: Now that vector shuffle lowering and combining is pretty good we should be reusing that instead of duplicating so much in LowerBUILD_VECTOR - this is the first of several patches to address this. Differential Revision: https://reviews.llvm.org/D31373 llvm-svn: 299387
* x86 interrupt calling convention: re-align stack pointer on 64-bit if an ↵Amjad Aboud2017-04-031-2/+8
| | | | | | | | | | | | | | | | error code was pushed The x86_64 ABI requires that the stack is 16 byte aligned on function calls. Thus, the 8-byte error code, which is pushed by the CPU for certain exceptions, leads to a misaligned stack. This results in bugs such as Bug 26413, where misaligned movaps instructions are generated. This commit fixes the misalignment by adjusting the stack pointer in these cases. The adjustment is done at the beginning of the prologue generation by subtracting another 8 bytes from the stack pointer. These additional bytes are popped again in the function epilogue. Fixes Bug 26413 Patch by Philipp Oppermann. Differential Revision: https://reviews.llvm.org/D30049 llvm-svn: 299383
* [APInt] Move isMask and isShiftedMask out of APIntOps and into the APInt ↵Craig Topper2017-04-031-2/+2
| | | | | | | | | | class. Implement them without memory allocation for multiword This moves the isMask and isShiftedMask functions to be class methods. They now use the MathExtras.h function for single word size and leading/trailing zeros/ones or countPopulation for the multiword size. The previous implementation made multiple temorary memory allocations to do the bitwise arithmetic operations to match the MathExtras.h implementation. Differential Revision: https://reviews.llvm.org/D31565 llvm-svn: 299362
* [X86][MMX] Improve support for folding fptosi from XMM to MMXSimon Pilgrim2017-04-021-0/+10
| | | | llvm-svn: 299338
* [X86][MMX] Simplify tablegen patterns by always combining MOVDQ2Q from v2i64Simon Pilgrim2017-04-021-1/+2
| | | | llvm-svn: 299336
* [X86][MMX] Added support for subvector extraction to MMX registerSimon Pilgrim2017-04-021-2/+4
| | | | llvm-svn: 299335
* [AVX-512] Update lowering for gather/scatter prefetch intrinsics to match ↵Craig Topper2017-03-311-3/+3
| | | | | | | | | | the immediate encodings the frontend uses based on the _MM_HINT_T0/T1 constant values in clang's headers. Our _MM_HINT_T0/T1 constant values are 3/2 which matches gcc, but not icc or Intel documentation. Interestingly gcc had this same bug on their implementation of the gather/scatter builtins at one point too. Fixes PR32411. llvm-svn: 299234
* [DAGCombiner] Add vector demanded elements support to ComputeNumSignBitsSimon Pilgrim2017-03-311-1/+2
| | | | | | | | | | | | | | Currently ComputeNumSignBits returns the minimum number of sign bits for all elements of vector data, when we may only be interested in one/some of the elements. This patch adds a DemandedElts argument that allows us to specify the elements we actually care about. The original ComputeNumSignBits implementation calls with a DemandedElts demanding all elements to match current behaviour. Scalar types set this to 1. I've only added support for BUILD_VECTOR and EXTRACT_VECTOR_ELT so far, all others will default to demanding all elements but can be updated in due course. Followup to D25691. Differential Revision: https://reviews.llvm.org/D31311 llvm-svn: 299219
* [DAGCombiner] Add vector demanded elements support to ↵Simon Pilgrim2017-03-311-0/+1
| | | | | | | | | | computeKnownBitsForTargetNode Follow up to D25691, this sets up the plumbing necessary to support vector demanded elements support in known bits calculations in target nodes. Differential Revision: https://reviews.llvm.org/D31249 llvm-svn: 299201
* Spelling mistakes in comments. NFCI.Simon Pilgrim2017-03-301-15/+15
| | | | llvm-svn: 299069
* [X86IselLowering] Remove extraneous semicolon. NFCI.Davide Italiano2017-03-291-1/+1
| | | | | | Unbreaks the build with GCC -Werror. llvm-svn: 299030
* [X86] Tidied up comment - we don't custom lower add/sub i64 on i686 anymore. ↵Simon Pilgrim2017-03-291-1/+2
| | | | | | NFCI. llvm-svn: 299004
OpenPOWER on IntegriCloud