summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Target/X86
Commit message (Collapse)AuthorAgeFilesLines
* Fix PR20087 by using the source index when changing the vector loadFilipe Cabecinhas2014-06-221-2/+2
| | | | llvm-svn: 211472
* [X86] Add ISel patterns to select SSE3/AVX ADDSUB instructions.Andrea Di Biagio2014-06-211-0/+46
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds ISel patterns to select SSE3/AVX ADDSUB instructions from a sequence of "vadd + vsub + blend". Example: /// typedef float float4 __attribute__((ext_vector_type(4))); float4 foo(float4 A, float4 B) { float4 X = A - B; float4 Y = A + B; return (float4){X[0], Y[1], X[2], Y[3]}; } /// Before this patch, (with flag -mcpu=corei7) llc produced the following assembly sequence: movaps %xmm0, %xmm2 addps %xmm1, %xmm2 subps %xmm1, %xmm0 blendps $10, %xmm2, %xmm0 With this patch, we now get a single addsubps %xmm1, %xmm0 llvm-svn: 211427
* Delete dead code.Rafael Espindola2014-06-201-17/+8
| | | | | | The compact unwind info is only used by code that knows it is supported. llvm-svn: 211412
* Don't produce eh_frame relocations when targeting the IOS simulator.Rafael Espindola2014-06-201-2/+3
| | | | | | First step for fixing pr19185. llvm-svn: 211404
* Generate native unwind info on Win64Reid Kleckner2014-06-206-117/+325
| | | | | | | | | | | | | | | | | | | | This patch enables LLVM to emit Win64-native unwind info rather than DWARF CFI. It handles all corner cases (I hope), including stack realignment. Because the unwind info is not flexible enough to describe stack frames with a gap of unknown size in the middle, such as the one caused by stack realignment, I modified register spilling code to place all spills into the fixed frame slots, so that they can be accessed relative to the frame pointer. Patch by Vadim Chugunov! Reviewed By: rnk Differential Revision: http://reviews.llvm.org/D4081 llvm-svn: 211399
* Add Support to Recognize and Vectorize NON SIMD instructions in SLPVectorizer.Karthik Bhat2014-06-201-8/+38
| | | | | | | | | This patch adds support to recognize patterns such as fadd,fsub,fadd,fsub.../add,sub,add,sub... and vectorizes them as vector shuffles if they are profitable. These patterns of vector shuffle can later be converted to instructions such as addsubpd etc on X86. Thanks to Arnold and Hal for the reviews. http://reviews.llvm.org/D4015 llvm-svn: 211339
* [x86] Make the x86 PACKSSWB, PACKSSDW, PACKUSWB, and PACKUSDWChandler Carruth2014-06-204-21/+152
| | | | | | | | | | | | | | | instructions available as synthetic SDNodes PACKSS and PACKUS that will select to the correct instruction variants based on the return type. This allows us to use these rather important instructions when lowering vector shuffles. Also moves the relevant instruction definitions to be split out from the fully generic multiclasses to allow them to match these new SDNodes in the same way that the UNPCK instructions do. No functionality should actually be changed here. llvm-svn: 211332
* Fix typosAlp Toker2014-06-191-1/+1
| | | | llvm-svn: 211304
* [X86] Teach how to combine horizontal binop even in the presence of undefs.Andrea Di Biagio2014-06-191-40/+115
| | | | | | | | | | | | | | Before this change, the backend was unable to fold a build_vector dag node with UNDEF operands into a single horizontal add/sub. This patch teaches how to combine a build_vector with UNDEF operands into a horizontal add/sub when possible. The algorithm conservatively avoids to combine a build_vector with only a single non-UNDEF operand. Added test haddsub-undef.ll to verify that we correctly fold horizontal binop even in the presence of UNDEFs. llvm-svn: 211265
* MS asm: Properly handle quoted symbol namesDavid Majnemer2014-06-191-2/+4
| | | | | | | | | | | | | We would get confused by '@' characters in symbol names, we would mistake the text following them for the variant kind. When an identifier a string, the variant kind will never show up inside of it. Instead, check to see if there is a variant following the string. This fixes PR19965. llvm-svn: 211249
* [X86] AVX512: Add non-temporal storesAdam Nemet2014-06-181-0/+29
| | | | | | | | | | | Note that I followed the AVX2 convention here and didn't add LLVM intrinsics for stores. These can be generated with the nontemporal hint on LLVM IR stores (see new test). The GCC builtins are lowered directly into nontemporal stores. <rdar://problem/17082571> llvm-svn: 211176
* [X86] AVX512: Specify compressed displacement for vmovntdqaAdam Nemet2014-06-181-1/+1
| | | | | | | Use the max 64-bit element size with EVEX_CD8. This should work since element size is ignored for a full-vector access (FVM). llvm-svn: 211175
* Add pattern for unsigned v4i32->v4f64 convert on AVX512.Cameron McInally2014-06-181-0/+4
| | | | llvm-svn: 211164
* Allow X86FastIsel to cope with 64 bit absolute relocationsLouis Gerbarg2014-06-171-10/+12
| | | | | | | | | | | | This patch is a follow up to r211040 & r211052. Rather than bailing out of fast isel this patch will generate an alternate instruction (movabsq) instead of the leaq. While this will always have enough room to handle the 64 bit displacment it is generally over kill for internal symbols (most displacements will be within 32 bits) but since we have no way of communicating the code model to the the assmebler in order to avoid flagging an absolute leal/leaq as illegal when using a symbolic displacement. llvm-svn: 211130
* [FastISel][X86] Optimize predicates and fold CMP instructions.Juergen Ributzka2014-06-171-13/+109
| | | | | | | | | This optimizes predicates for certain compares, such as fcmp oeq %x, %x to fcmp ord %x, %x. The latter one is more efficient to generate. The same optimization is applied to conditional branches. llvm-svn: 211126
* [FastISel][X86] Fix previous refactoring commit (r211077)Juergen Ributzka2014-06-171-4/+4
| | | | | | | Overlooked that fcmp_une uses an "or" instead of an "and" for combining the flags. llvm-svn: 211104
* [FastISel][X86] Refactor the code to get the X86 condition from a helper ↵Juergen Ributzka2014-06-163-96/+110
| | | | | | | | | function. NFC. Make use of helper functions to simplify the branch and compare instruction selection in FastISel. Also add test cases for compare and conditonal branch. llvm-svn: 211077
* Improve comments for r211040Louis Gerbarg2014-06-161-1/+4
| | | | | | | | Added comment to clarify why we r211040 choose to bail out of fast isel instead of generating a more complicated relocation, and fix mislabelled register in the comments of the asan test case. llvm-svn: 211052
* Fix illegal relocations in X86FastISelLouis Gerbarg2014-06-161-0/+4
| | | | | | | | | | | | On x86_86 the lea instruction can only use a 32 bit immediate value. When the code is compiled statically the RIP register is not used, meaning the immediate is all that can be used for the relocation, which is not sufficient in the case of targets more than +/- 2GB away. This patch bails out of fast isel in those cases and reverts to DAG which does the right thing. Test case included. llvm-svn: 211040
* Hook up vector int_ctlz for AVX512.Cameron McInally2014-06-162-0/+14
| | | | llvm-svn: 211024
* X86: lower ATOMIC_CMP_SWAP_WITH_SUCCESS directlyTim Northover2014-06-131-12/+31
| | | | | | | | | | | Lowering this new node allows us to fold the almost universal comparison for success before it's even formed. Instead we can create a copy from EFLAGS and an X86ISD::SETCC operation since all "cmpxchg" instructions set the zero-flag to the correct value. rdar://problem/13201607 llvm-svn: 210923
* IR: add "cmpxchg weak" variant to support permitted failure.Tim Northover2014-06-131-7/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This commit adds a weak variant of the cmpxchg operation, as described in C++11. A cmpxchg instruction with this modifier is permitted to fail to store, even if the comparison indicated it should. As a result, cmpxchg instructions must return a flag indicating success in addition to their original iN value loaded. Thus, for uniformity *all* cmpxchg instructions now return "{ iN, i1 }". The second flag is 1 when the store succeeded. At the DAG level, a new ATOMIC_CMP_SWAP_WITH_SUCCESS node has been added as the natural representation for the new cmpxchg instructions. It is a strong cmpxchg. By default this gets Expanded to the existing ATOMIC_CMP_SWAP during Legalization, so existing backends should see no change in behaviour. If they wish to deal with the enhanced node instead, they can call setOperationAction on it. Beware: as a node with 2 results, it cannot be selected from TableGen. Currently, no use is made of the extra information provided in this patch. Test updates are almost entirely adapting the input IR to the new scheme. Summary for out of tree users: ------------------------------ + Legacy Bitcode files are upgraded during read. + Legacy assembly IR files will be invalid. + Front-ends must adapt to different type for "cmpxchg". + Backends should be unaffected by default. llvm-svn: 210903
* Add HasCDI predicate to AVX512 VPBROADCASTM*.Cameron McInally2014-06-131-0/+2
| | | | llvm-svn: 210892
* [FastISel][X86] Add support for cvttss2si/cvttsd2si intrinsics.Juergen Ributzka2014-06-131-0/+66
| | | | | | | | This adds support for the cvttss2si/cvttsd2si intrinsics. Preceding insertelement instructions are folded into the conversion instruction (if possible). llvm-svn: 210870
* [FastISel][X86] - Add branch weightsJuergen Ributzka2014-06-131-3/+16
| | | | | | | Add branch weights to branch instructions, so that the following passes can optimize based on it (i.e. basic block ordering). llvm-svn: 210863
* [FastISel][X86] Add MachineMemOperand to load/store instructions.Juergen Ributzka2014-06-121-39/+68
| | | | | | | | This commit adds MachineMemOperands to load and store instructions. This allows the peephole optimizer to fold load instructions. Unfortunatelly the peephole optimizer currently doesn't run at -O0. llvm-svn: 210858
* [FastIsel][X86] Add support for lowering the first 8 floating-point arguments.Juergen Ributzka2014-06-121-20/+41
| | | | | | | Recommit with fixed argument attribute checking code, which is required to bail out of all the cases we don't handle yet. llvm-svn: 210815
* Revert "[FastIsel][X86] Add support for lowering the first 8 floating-point ↵Juergen Ributzka2014-06-121-36/+19
| | | | | | | | arguments." Reverting it because it breaks several tests. llvm-svn: 210810
* X86: stifle GCC warningSaleem Abdulrasool2014-06-121-1/+3
| | | | | | | | | | lib/Target/X86/X86TargetTransformInfo.cpp: In member function ‘virtual unsigned int {anonymous}::X86TTI::getIntImmCost(unsigned int, unsigned int, const llvm::APInt&, llvm::Type*) const’: lib/Target/X86/X86TargetTransformInfo.cpp:920:60: warning: enumeral and non-enumeral type in conditional expression [enabled by default] This seems like an unhelpful warning, but there doesnt seem to be a controlling flag, so add an explicit cast to silence the warning. llvm-svn: 210806
* [X86] Teach how to dump the name of target node RDTSCP_DAG.Andrea Di Biagio2014-06-121-0/+1
| | | | | | | | | When I originally added node RDTSCP_DAG (r207127) I forgot to add a string name for it in method 'getTargetNodeName'. No functional change intended. llvm-svn: 210769
* [X86] Teach how to combine AVX and AVX2 horizontal binop on packed 256-bit ↵Andrea Di Biagio2014-06-121-9/+103
| | | | | | | | | | | vectors. This patch adds target combine rules to match: - [AVX] Horizontal add/sub of packed single/double precision floating point values from 256-bit vectors; - [AVX2] Horizontal add/sub of packed integer values from 256-bit vectors. llvm-svn: 210761
* [FastISel][X86] Add support for the sqrt intrinsic.Juergen Ributzka2014-06-111-0/+52
| | | | llvm-svn: 210720
* [FastIsel][X86] Add support for lowering the first 8 floating-point arguments.Juergen Ributzka2014-06-111-19/+36
| | | | llvm-svn: 210719
* [FastISel][X86] Add support for the frameaddress intrinsic.Juergen Ributzka2014-06-111-0/+52
| | | | llvm-svn: 210709
* X86: add stringy name for X86ISD::LCMPXCHG16_DAGTim Northover2014-06-111-0/+1
| | | | | | | I don't know what "target specific node #383" is, and I don't want to have to. llvm-svn: 210663
* Add AVX512 masked leadz instrinsic support.Cameron McInally2014-06-111-0/+22
| | | | llvm-svn: 210652
* [X86] Refactor the logic to select horizontal adds/subs to a helper function.Andrea Di Biagio2014-06-111-90/+118
| | | | | | | | | | | | | | | This patch moves part of the logic implemented by the target specific combine rules added at r210477 to a separate helper function. This should make easier to add more rules for matching AVX/AVX2 horizontal adds/subs. This patch also fixes a problem caused by a wrong check performed on indices of extract_vector_elt dag nodes in input to the scalar adds/subs. New tests have been added to verify that we correctly check indices of extract_vector_elt dag nodes when selecting a horizontal operation. llvm-svn: 210644
* Move to a private function to initialize the subtarget dependenciesEric Christopher2014-06-112-22/+25
| | | | | | so that we can use initializer lists for the X86Subtarget. llvm-svn: 210614
* [FastISel][X86] Extend support for {s|u}{add|sub|mul}.with.overflow intrinsics.Juergen Ributzka2014-06-101-30/+89
| | | | llvm-svn: 210610
* Use unique_ptr for X86Subtarget pointer members.Eric Christopher2014-06-102-22/+16
| | | | llvm-svn: 210606
* Remove the use of TargetMachine from X86InstrInfo.Eric Christopher2014-06-103-61/+60
| | | | llvm-svn: 210596
* Move X86RegisterInfo away from using the TargetMachine and onlyEric Christopher2014-06-105-33/+30
| | | | | | using the subtarget. llvm-svn: 210595
* Use the TargetMachine on the DAG or the MachineFunction insteadEric Christopher2014-06-101-82/+80
| | | | | | of using the cached TargetMachine. llvm-svn: 210589
* Add a FIXME.Eric Christopher2014-06-101-0/+2
| | | | llvm-svn: 210559
* [X86] Improved target combine rules for selecting horizontal add/sub.Andrea Di Biagio2014-06-101-2/+20
| | | | | | | | | | | | | This patch slightly changes the algorithm introduced at revision 210477 to fix a problem where the algorithm was producing incorrect code for the VEX.256 encoded versions of horizontal add/sub. For these cases, we now try to split the two 256-bit vectors into 128-bit chunks before emitting horizontal add/sub dag nodes. Added a new test case into haddsub-2.ll. llvm-svn: 210545
* [X86] AVX512: Add vmovntdqaAdam Nemet2014-06-101-0/+11
| | | | | | Along with the corresponding intrinsic and tests. llvm-svn: 210543
* SelectionDAG: Don't use MVT::Other to determine legality of ISD::SELECT_CCTom Stellard2014-06-101-1/+8
| | | | | | | | | | | | | The SelectionDAG bad a special case for ISD::SELECT_CC, where it would allow targets to specify: setOperationAction(ISD::SELECT_CC, MVT::Other, Expand); to indicate that they wanted to expand ISD::SELECT_CC for all types. This wasn't applied correctly everywhere, and it makes writing new DAG patterns with ISD::SELECT_CC difficult. llvm-svn: 210541
* Revert "X86: elide comparisons after cmpxchg instructions."Tim Northover2014-06-102-106/+57
| | | | | | | This reverts commit r210523. It was committed prematurely without waiting for review. llvm-svn: 210524
* X86: elide comparisons after cmpxchg instructions.Tim Northover2014-06-102-57/+106
| | | | | | | | | | | | | | | The C++ and C semantics of the compare_and_swap operations actually require us to return a boolean "success" value. In LLVM terms this means a second comparison of the output of "cmpxchg" against the input desired value. However, x86's "cmpxchg" instruction sets all flags for the comparison formed, so we can skip any secondary comparison. (N.b. this isn't true for cmpxchg8b/16b, which only set ZF). rdar://problem/13201607 llvm-svn: 210523
* Delete X86JITInfo in the subtarget destructor.Eric Christopher2014-06-101-0/+1
| | | | llvm-svn: 210516
OpenPOWER on IntegriCloud