summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Target/X86
Commit message (Collapse)AuthorAgeFilesLines
* Fix gcc warnings of different enum and non-enum types in ternariesDenis Protivensky2015-07-072-9/+9
| | | | llvm-svn: 241567
* [X86][AVX] Add support for shuffle decoding of vperm2f128/vperm2i128 with ↵Simon Pilgrim2015-07-062-5/+8
| | | | | | | | | | | | zero'd lanes The vperm2f128/vperm2i128 shuffle mask decoding was not attempting to deal with shuffles that give zero lanes. This patch fixes this so that the assembly printer can provide shuffle comments. As this decoder is also used in X86ISelLowering for shuffle combining, I've added an early-out to match existing behaviour. The hope is that we can add zero support in the future, this would allow other ops' decodes (e.g. insertps) to be combined as well. Differential Revision: http://reviews.llvm.org/D10593 llvm-svn: 241516
* [x86] extend machine combiner reassociation optimization to SSE scalar addsSanjay Patel2015-07-061-13/+19
| | | | | | | | | | | | Extend the reassociation optimization of http://reviews.llvm.org/rL240361 (D10460) to SSE scalar FP SP adds in addition to AVX scalar FP SP adds. With the 'switch' in place, we can trivially add other opcodes and test cases in future patches. Differential Revision: http://reviews.llvm.org/D10975 llvm-svn: 241515
* [X86][SSE] Vectorized i64 uniform constant SRA shiftsSimon Pilgrim2015-07-062-2/+51
| | | | | | | | This patch adds vectorization support for uniform constant i64 arithmetic shift right operators. Differential Revision: http://reviews.llvm.org/D9645 llvm-svn: 241514
* [X86][SSE4A] Shuffle lowering using SSE4A EXTRQ/INSERTQ instructionsSimon Pilgrim2015-07-068-6/+300
| | | | | | | | | | | | This patch adds support for v8i16 and v16i8 shuffle lowering using the immediate versions of the SSE4A EXTRQ and INSERTQ instructions. Although rather limited (they can only act on the lower 64-bits of the source vectors, leave the upper 64-bits of the result vector undefined and don't have VEX encoded variants), the instructions are still useful for the zero extension of any lane (EXTRQ) or inserting a lane into another vector (INSERTQ). Testing demonstrated that it wasn't typically worth it to use these instructions for v2i64 or v4i32 vector shuffles although they are capable of it. As well as adding specific pattern matching for the shuffles, the patch uses EXTRQ for zero extension cases where SSE41 isn't available and its more efficient than the SSE2 'unpack' default approach. It also adds shuffle decode support for the EXTRQ / INSERTQ cases when the instructions are handling full byte-sized extractions / insertions. From this foundation, future patches will be able to make use of the instructions for situations that use their ability to extract/insert at the bit level. Differential Revision: http://reviews.llvm.org/D10146 llvm-svn: 241508
* [X86][SSE] Use the general SMAX/SMIN/UMAX/UMIN opcodes and remove the X86 ↵Simon Pilgrim2015-07-066-138/+177
| | | | | | | | | | | | implementation With the completion of D9746 there is now a common implementation of integer signed/unsigned min/max nodes, removing the need for the equivalent X86 specific implementations. This patch removes the old X86ISD nodes, legalizes the relevant SSE2/SSE41/AVX2/AVX512 instructions for the ISD versions and converts the small amount of existing X86 code. Differential Revision: http://reviews.llvm.org/D10947 llvm-svn: 241506
* Change the last few internal StringRef triples into Triple objects.Daniel Sanders2015-07-061-16/+12
| | | | | | | | | | | | | | | | | | | | Summary: This concludes the patch series to eliminate StringRef forms of GNU triples from the internals of LLVM that began in r239036. At this point, the StringRef-form of GNU Triples should only be used in the public API (including IR serialization) and a couple objects that directly interact with the API (most notably the Module class). The next step is to replace these Triple objects with the TargetTuple object that will represent our authoratative/unambiguous internal equivalent to GNU Triples. Reviewers: rengolin Subscribers: llvm-commits, jholewinski, ted, rengolin Differential Revision: http://reviews.llvm.org/D10962 llvm-svn: 241472
* [X86][AVX512] Multiply Packed Unsigned Integers with Round and ScaleAsaf Badouh2015-07-065-0/+9
| | | | | | | | | pmulhrsw review: http://reviews.llvm.org/D10948 llvm-svn: 241443
* IR: Do not consider available_externally linkage to be linker-weak.Peter Collingbourne2015-07-053-8/+7
| | | | | | | | | | | | | | | From the linker's perspective, an available_externally global is equivalent to an external declaration (per isDeclarationForLinker()), so it is incorrect to consider it to be a weak definition. Also clean up some logic in the dead argument elimination pass and clarify its comments to better explain how its behavior depends on linkage, introduce GlobalValue::isStrongDefinitionForLinker() and start using it throughout the optimizers and backend. Differential Revision: http://reviews.llvm.org/D10941 llvm-svn: 241413
* [TargetLowering] StringRefize asm constraint getters.Benjamin Kramer2015-07-052-10/+8
| | | | | | | | There is some functional change here because it changes target code from atoi(3) to StringRef::getAsInteger which has error checking. For valid constraints there should be no difference. llvm-svn: 241411
* [x86][AVX512] add Multiply High OpAsaf Badouh2015-07-053-0/+12
| | | | | | | | | include encoding and intrinsics tests. review http://reviews.llvm.org/D10896 llvm-svn: 241406
* [X86] Fix incorrect/inefficient pushw encodings for x86-64 targetsMichael Kuperstein2015-07-052-9/+4
| | | | | | | | | | | | | Correctly support assembling "pushw $imm8" on x86-64 targets. Also some cleanup of the PUSH instructions (PUSH64i16 and PUSHi16 actually represent the same instruction) This fixes PR23996 Patch by: david.l.kreitzer@intel.com Differential Revision: http://reviews.llvm.org/D10878 llvm-svn: 241404
* [X86][SSE] Improved i8/i16 to f64 uint2fp vector conversionsSimon Pilgrim2015-07-041-0/+27
| | | | | | Followup to D10433 and D10589 that fixes i8/i16 uint2fp vector conversions by zero extending to i32 and using the sint2fp path (unless the target does actually support uint2fp). llvm-svn: 241394
* [X86] Add proper 64-bit mode checks to jrcxz and jcxz.Craig Topper2015-07-041-2/+4
| | | | llvm-svn: 241381
* Return ErrorOr from getSymbolAddress.Rafael Espindola2015-07-031-2/+4
| | | | | | | It can fail trying to get the section on ELF and COFF. This makes sure the error is handled. llvm-svn: 241366
* Replace a few more MachO only uses of getSymbolAddress.Rafael Espindola2015-07-031-3/+2
| | | | llvm-svn: 241365
* [X86][SSE] Sign extension for target vector sizes less than 128 bits (pt2)Simon Pilgrim2015-07-031-7/+9
| | | | | | | | Add support for v2i8/v2i16 to v2f64 by using a sign extension to v2i32 before conversion to v2f64. Differential Revision: http://reviews.llvm.org/D10589 llvm-svn: 241325
* [X86][SSE] Sign extension for target vector sizes less than 128 bits (pt1)Simon Pilgrim2015-07-031-6/+20
| | | | | | | | This patch adds support for sign extension for sub 128-bit vectors, such as to v2i32. It concatenates with UNDEF subvectors up to 128-bits, performs the sign extension (i.e. as v4i32) and then extracts the target subvector. Patch 1/2 of D10589 - the second patch covers the conversion of v2i8/v2i16 to v2f64. llvm-svn: 241323
* Return ErrorOr from SymbolRef::getName.Rafael Espindola2015-07-022-5/+13
| | | | | | | | | | | | This function can really fail since the string table offset can be out of bounds. Using ErrorOr makes sure the error is checked. Hopefully a lot of the boilerplate code in tools/* can go away once we have a diagnostic manager in Object. llvm-svn: 241297
* Implement TargetTransformInfo::hasCompatibleFunctionAttributes for X86.Eric Christopher2015-07-022-0/+17
| | | | | | | | | | | | | | | | | | | This checks subtarget feature compatibility for inlining by verifying that the callee is a strict subset of the caller's features. This includes the cpu as part of the subtarget we can get via the incoming functions as the backend takes CPUs as feature sets. This allows us to inline things like: int foo() { return baz(); } int __attribute__((target("sse4.2"))) bar() { return foo(); } so that generic code can be inlined into specialized functions. llvm-svn: 241221
* fix typos in comment; NFCSanjay Patel2015-07-011-3/+2
| | | | llvm-svn: 241174
* [SEH] Don't assert if the parent function lacks a personalityReid Kleckner2015-07-011-0/+6
| | | | | | | | The EH code might have been deleted as unreachable and the personality pruned while the filter is still present. Currently I'm hitting this at -O0 due to the clang bug PR24009. llvm-svn: 241170
* AVX-512: Implemented missing encoding for FMA scalar instructionsIgor Breger2015-07-011-34/+95
| | | | | | | | Added tests for encoding Differential Revision: http://reviews.llvm.org/D10865 llvm-svn: 241159
* [X86] Avoid over-relaxation of 8-bit immediates in integer arithmetic ↵Michael Kuperstein2015-07-011-24/+6
| | | | | | | | | | | | | | | | | | instructions. Only consider an instruction a candidate for relaxation if the last operand of the instruction is an expression. We previously checked whether any operand is an expression, which is useless, since for all instructions concerned, the only operand that may be affected by relaxation is the last one. In addition, this removes the check for having RIP as an argument, since it was plain wrong - even when one of the arguments is RIP, relaxation may still be needed. This fixes PR9807. Patch by: david.l.kreitzer@intel.com Differential Revision: http://reviews.llvm.org/D10766 llvm-svn: 241152
* [SEH] Add new intrinsics for recovering and restoring parent framesReid Kleckner2015-06-302-33/+88
| | | | | | | | | | | | | | | | | | | | | | | | | The incoming EBP value established by the runtime is actually a pointer to the end of the EH registration object, and not the true parent function frame pointer. Clang doesn't need llvm.x86.seh.exceptioninfo anymore because we know that the exception info pointer is at a fixed offset from this incoming EBP. The llvm.x86.seh.recoverfp intrinsic takes an EBP value provided by the EH runtime and returns a pointer that is usable with llvm.framerecover. The llvm.x86.seh.restoreframe intrinsic is inserted by the 32-bit specific preparation pass in blocks targetted by the EH runtime. It re-establishes any physical registers used by the parent function to address the stack, such as the frame, base, and stack pointers. Neither of these intrinsics correctly handle stack realignment prologues yet, but it's possible to add that later. Reviewers: majnemer Differential Revision: http://reviews.llvm.org/D10848 llvm-svn: 241125
* [X86] Fix a bug in WIN_FTOL_32/64 handling.Michael Kuperstein2015-06-301-1/+1
| | | | | | | | | | Duplicating an FP register "as itself" is a bad idea, since it violates the invariant that every FP register is mapped to at most one FPU stack slot. Use the scratch FP register instead. This fixes PR23957. llvm-svn: 241069
* Reverting r241058 because it's causing buildbot failures.Ranjeet Singh2015-06-301-32/+25
| | | | llvm-svn: 241061
* There are a few places where subtarget features are stillRanjeet Singh2015-06-301-25/+32
| | | | | | | | | represented by uint64_t, this patch replaces these usages with the FeatureBitset (std::bitset) type. Differential Revision: http://reviews.llvm.org/D10542 llvm-svn: 241058
* [X86] Add FXSR intrinsicsMichael Kuperstein2015-06-301-8/+8
| | | | | | Add intrinsics for the FXSR instructions (FXSAVE/FXSAVE64/FXRSTOR/FXRSTOR64) llvm-svn: 241049
* Don't return error_code from a function that doesn't fail.Rafael Espindola2015-06-302-2/+2
| | | | llvm-svn: 241033
* Cleanup getRelocationAddend.Rafael Espindola2015-06-301-2/+1
| | | | | | | | | | | Realistically, this will be returning ErrorOr for some time as refactoring the user code to check once per section will take some time. Given that, use it for checking if a relocation has addend or not. While at it, add ELFRelocationRef to simplify the users. llvm-svn: 241028
* Teach LTOModule to emit linker flags for dllexported symbols, plus interface ↵Peter Collingbourne2015-06-292-56/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | cleanup. This change unifies how LTOModule and the backend obtain linker flags for globals: via a new TargetLoweringObjectFile member function named emitLinkerFlagsForGlobal. A new function LTOModule::getLinkerOpts() returns the list of linker flags as a single concatenated string. This change affects the C libLTO API: the function lto_module_get_*deplibs now exposes an empty list, and lto_module_get_*linkeropts exposes a single element which combines the contents of all observed flags. libLTO should never have tried to parse the linker flags; it is the linker's job to do so. Because linkers will need to be able to parse flags in regular object files, it makes little sense for libLTO to have a redundant mechanism for doing so. The new API is compatible with the old one. It is valid for a user to specify multiple linker flags in a single pragma directive like this: #pragma comment(linker, "/defaultlib:foo /defaultlib:bar") The previous implementation would not have exposed either flag via lto_module_get_*deplibs (as the test in TargetLoweringObjectFileCOFF::getDepLibFromLinkerOpt was case sensitive) and would have exposed "/defaultlib:foo /defaultlib:bar" as a single flag via lto_module_get_*linkeropts. This may have been a bug in the implementation, but it does give us a chance to fix the interface. Differential Revision: http://reviews.llvm.org/D10548 llvm-svn: 241010
* X86: Rework inline asm integer register specification.Matthias Braun2015-06-293-73/+60
| | | | | | | | | | | | | | | | | | | | | | | This is a new version of http://reviews.llvm.org/D10260. It turned out that when you specify an integer register in inline asm on x86 you get the register of the required type size back. That means that X86TargetLowering::getRegForInlineAsmConstraint() has to accept any of the integer registers and adapt its size to the given target size which may be any 8/16/32/64 bit sized type. Surprisingly that means given a constraint of "{ax}" and a type of MVT::F32 we need to return X86::EAX. This change makes this face explicit, the previous code seemed like working by accident because there it never returned an error once a register was found. On the other hand this rewrite allows to actually return errors for invalid situations like requesting an integer register for an i128 type. Related to rdar://21042280 Differential Revision: http://reviews.llvm.org/D10813 llvm-svn: 241002
* AVX-512: all forms of SCATTER instruction on SKX,Elena Demikhovsky2015-06-294-32/+102
| | | | | | encoding, intrinsics and tests. llvm-svn: 240936
* AVX-512: Implemented missing encoding and intrinsics for FMA instructionsIgor Breger2015-06-293-153/+272
| | | | | | | | Added tests for DAG lowering ,encoding and intrinsics Differential Revision: http://reviews.llvm.org/D10796 llvm-svn: 240926
* Whitespace.NAKAMURA Takumi2015-06-291-15/+15
| | | | llvm-svn: 240924
* [x86][AVX512]Asaf Badouh2015-06-285-9/+83
| | | | | | | | | | | Add vscalef support include encoding and intrinsics review: http://reviews.llvm.org/D10730 llvm-svn: 240906
* AVX-512: Added all SKX forms of GATHER instructions.Elena Demikhovsky2015-06-286-25/+120
| | | | | | | Added intrinsics. Added encoding and tests. llvm-svn: 240905
* Revert "Revert r240762 "[X86] Cleanup ↵David Majnemer2015-06-261-36/+28
| | | | | | | | | | | X86WindowsTargetObjectFile::getSectionForConstant"" This reverts commit r240793 while fixing how we handle array constant pool entries. This fixes PR23966. llvm-svn: 240811
* [X86]: Correctly sign-extend 16-bit immediate in CALL instruction.Douglas Katzman2015-06-261-1/+7
| | | | | | | | Patch by Matthew Barney. Thanks! Differential Revision: http://reviews.llvm.org/D9514 llvm-svn: 240795
* Revert r240762 "[X86] Cleanup X86WindowsTargetObjectFile::getSectionForConstant"Hans Wennborg2015-06-261-25/+37
| | | | | | It seems to have caused PR23966: "UNREACHABLE executed at ..\lib\Target\X86\X86TargetObjectFile.cpp:148" llvm-svn: 240793
* Rename getObjectFile to getObject for consistency.Rafael Espindola2015-06-262-2/+2
| | | | llvm-svn: 240785
* [X86] Cleanup X86WindowsTargetObjectFile::getSectionForConstantDavid Majnemer2015-06-261-37/+25
| | | | | | No functionality changed, just keeping things clean. llvm-svn: 240762
* Revert "X86: Reject register operands with obvious type mismatches."Matthias Braun2015-06-261-13/+0
| | | | | | | | Revert until http://llvm.org/PR23955 is investigated. This reverts commit r239309. llvm-svn: 240746
* Add an ELFSymbolRef type.Rafael Espindola2015-06-251-2/+2
| | | | | | | This allows user code to say Sym.getSize() instead of having to manually fetch the object. llvm-svn: 240708
* [X86] Accept hasAVX512() as well as hasFMA() when generating FMA.Ahmed Bougacha2015-06-251-3/+5
| | | | | | | | | | We don't always have FMA, for example when using 'clang -mavx512f' without an explicit CPU. Also check for an explicit +avx512f instead of CPUs in a couple related tests. llvm-svn: 240616
* Enable StackMap Serialization for COFFSwaroop Sridhar2015-06-251-0/+2
| | | | | | | | | | | | | | | | | | Summary This change turns on the emission of __LLVM_Stackmaps section when generating COFF binaries. Test Plan Added a scenario to the test case: test\CodeGen\X86\statepoint-stackmap-format.ll. Code Review: http://reviews.llvm.org/D10680 llvm-svn: 240613
* [X86] Simplify some stuff in X86DisassemblerDecoder. NFCDouglas Katzman2015-06-241-22/+17
| | | | | | | | | | | | | | - Deciding that insn->sibIndex is SIB_INDEX_NONE does not require another check beyond the fully decoded bits being equal to 0x4. The expression insn->sibIndex == SIB_INDEX_sib could not have been true unless index were 0x4, because SIB_INDEX_sib is merely the range base (SIB_INDEX_EAX) plus 4. Respectively SIB_INDEX_sib64. - Don't use a switch statement to perform left-shift. Differential Revision: http://reviews.llvm.org/D9762 llvm-svn: 240598
* Change how symbol sizes are handled in lib/Object.Rafael Espindola2015-06-241-1/+1
| | | | | | | | | | | | | | COFF and MachO only define symbol sizes for common symbols. Reflect that in the class hierarchy by having a method for common symbols only in the base and a general one in ELF. This avoids the need of using a magic value for the size, which had a few problems * Most callers didn't check for it. * The ones that did could not tell the magic value from a file actually having that value. llvm-svn: 240529
* [X86] Don't generate vbroadcasti128 for v4i64 splats from memory.Ahmed Bougacha2015-06-241-4/+5
| | | | | | | | | | | | | | | | | | | | | We used to erroneously match: (v4i64 shuffle (v2i64 load), <0,0,0,0>) Whereas vbroadcasti128 is more like: (v4i64 shuffle (v2i64 load), <0,1,0,1>) This problem doesn't exist for vbroadcastf128, which kept matching the intrinsic after r231182. We should perhaps re-introduce the intrinsic here as well, but that's a separate issue still being discussed. While there, add some proper vbroadcastf128 tests. We don't currently match those, like for loading vbroadcastsd/ss on AVX (the reg-reg broadcasts where added in AVX2). Fixes PR23886. llvm-svn: 240488
OpenPOWER on IntegriCloud