path: root/llvm/lib/Target/X86
Commit message | Author | Age | Files | Lines
...
* Fix X86 codegen for 'atomicrmw nand' to generate *x = ~(*x & y), not *x = ~*x & y.
  Author: Richard Smith; Date: 2012-04-13; Files: 2; Lines: -27/+33
  llvm-svn: 154705
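The difference between the two expressions in the 'atomicrmw nand' entry above can be sketched in C++. This is a minimal illustration, not LLVM code: `atomic_nand` and `buggy_nand_value` are hypothetical helper names, and the compare-exchange loop is only one assumed way to model the operation.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical helper (not part of LLVM): models 'atomicrmw nand' with a
// compare-exchange loop. Like atomicrmw, it returns the old value.
inline std::uint32_t atomic_nand(std::atomic<std::uint32_t> &x, std::uint32_t y) {
  std::uint32_t old = x.load();
  // Intended semantics: new value = ~(old & y).
  while (!x.compare_exchange_weak(old, ~(old & y))) {
    // On failure 'old' is reloaded with the current value; retry.
  }
  return old;
}

// Hypothetical helper: the value the commit message describes as wrong,
// (~old) & y, shown only for comparison.
inline std::uint32_t buggy_nand_value(std::uint32_t old, std::uint32_t y) {
  return ~old & y;
}
```

With `old = 0xF0F0F0F0` and `y = 0xFF00FF00`, the two expressions produce different results, which is why the distinction matters.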
* Generalize r153635 to deal with TokenFactor chains; also clean up the logic and fix the tests.
  Author: Evan Cheng; Date: 2012-04-12; Files: 1; Lines: -41/+51
  rdar://11069732, rdar://11236106
  llvm-svn: 154604
* Fix 128-bit ptest intrinsics to take v2i64 instead of v4f32 since these are integer instructions.
  Author: Craig Topper; Date: 2012-04-12; Files: 1; Lines: -4/+4
  llvm-svn: 154580
* Remove unused argument.
  Author: Nadav Rotem; Date: 2012-04-11; Files: 1; Lines: -2/+2
  llvm-svn: 154494
* Reapply 154396 after fixing a test.
  Author: Nadav Rotem; Date: 2012-04-11; Files: 4; Lines: -36/+87
  Original message:
  Modify the code that lowers shuffles to blends from using blendvXX to vblendXX. blendv uses a register for the selection while vblend uses an immediate. On Sandybridge they still have the same latency and execute on the same execution ports.
  llvm-svn: 154483
* Add retw and lretw instructions. Also, fix Intel syntax parsing for all ret instructions.
  Author: Charles Davis; Date: 2012-04-11; Files: 1; Lines: -5/+10
  llvm-svn: 154468
* Whitespace.
  Author: Chad Rosier; Date: 2012-04-10; Files: 1; Lines: -1/+0
  llvm-svn: 154427
* Revert r154396, which looks to be the real culprit behind the bot failures.
  Author: Chad Rosier; Date: 2012-04-10; Files: 1; Lines: -0/+1
  llvm-svn: 154426
* Temporarily revert this patch to see if it brings the buildbots back.
  Author: Eric Christopher; Date: 2012-04-10; Files: 4; Lines: -87/+36
  llvm-svn: 154425
* Remove unused variable.
  Author: David Blaikie; Date: 2012-04-10; Files: 1; Lines: -1/+0
  llvm-svn: 154398
* Modify the code that lowers shuffles to blends from using blendvXX to vblendXX.
  Author: Nadav Rotem; Date: 2012-04-10; Files: 4; Lines: -35/+87
  blendv uses a register for the selection while vblend uses an immediate. On Sandybridge they still have the same latency and execute on the same execution ports.
  llvm-svn: 154396
* Fix a long-standing tail call optimization bug.
  Author: Evan Cheng; Date: 2012-04-10; Files: 2; Lines: -3/+9
  When a libcall is emitted, the legalizer always uses the DAG entry node. This is wrong when the libcall is emitted as a tail call, since it effectively folds the return node. If the return node's input chain is not the entry (i.e. call, load, or store), use that as the tail call input chain.
  PR12419
  rdar://9770785
  rdar://11195178
  llvm-svn: 154370
* This patch adds X86 instruction itineraries, which were missed by the original patch to add itineraries, to X86InstrArithmetic.td.
  Author: Preston Gurd; Date: 2012-04-09; Files: 1; Lines: -28/+30
  llvm-svn: 154320
* Lower some x86 shuffle sequences to the vblend family of instructions.
  Author: Nadav Rotem; Date: 2012-04-09; Files: 1; Lines: -0/+67
  llvm-svn: 154313
* Fix a bug in the lowering of broadcasts: ConstantPools need to use the target pointer type.
  Author: Nadav Rotem; Date: 2012-04-09; Files: 2; Lines: -19/+13
  Move NormalizeVectorShuffle and LowerVectorBroadcast into X86TargetLowering.
  llvm-svn: 154310
* Cleanup and relax a restriction on the matching of global offsets into x86 addressing modes.
  Author: Chandler Carruth; Date: 2012-04-09; Files: 1; Lines: -9/+10
  This allows PIE-based TLS offsets to fit directly into an addressing mode immediate offset, which is the last remaining code quality issue from PR12380. With this patch, that PR is completely fixed.

  To understand why this patch is correct to match these offsets into addressing mode immediates, break it down by cases:
  1) 32-bit is trivially correct, and unmodified here.
  2) 64-bit non-small mode is unchanged and never matches.
  3) 64-bit small PIC code which is RIP-relative is handled specially in the match to try to fit RIP into the base register. If it fails, it now early exits. This behavior is unchanged by the patch.
  4) 64-bit small non-PIC code which is not RIP-relative continues to work as it did before. The reason these immediates are safe is because the ABI ensures they fit in small mode. This behavior is unchanged.
  5) 64-bit small PIC code which is *not* using RIP-relative addressing. This is the only case changed by the patch, and the primary place you see it is in TLS, either the win64 section offset TLS or Linux local-exec TLS model in a PIC compilation. Here the ABI again ensures that the immediates fit because we are in small mode, and any other operations required due to the PIC relocation model have been handled externally to the Wrapper node (extra loads etc are made around the wrapper node in ISelLowering).

  I've tested this as much as I can comparing it with GCC's output, and everything appears safe. I discussed this with Anton and it made sense to him at least at face value. That said, if there are issues with PIC code after this patch, yell and we can revert it.
  llvm-svn: 154304
* Move the TLSModel information into the TargetMachine rather than hiding in TargetLowering.
  Author: Chandler Carruth; Date: 2012-04-08; Files: 1; Lines: -2/+1
  There was already a FIXME about this location being odd. The interface is simplified as a consequence. This will also make it easier to change TLS models when compiling with PIE.
  llvm-svn: 154292
* AVX2: Build splat vectors by broadcasting a scalar from the constant pool.
  Author: Nadav Rotem; Date: 2012-04-08; Files: 1; Lines: -28/+68
  Previously we used three instructions to broadcast an immediate value into a vector register. On Sandybridge we continue to load the broadcasted value from the constant pool.
  llvm-svn: 154284
* Turn avx2 vinserti128 intrinsic calls into INSERT_SUBVECTOR DAG nodes and remove patterns for selecting the intrinsic.
  Author: Craig Topper; Date: 2012-04-07; Files: 1; Lines: -5/+4
  The same was already done for avx1.
  llvm-svn: 154272
* Move vinsertf128 patterns near the instruction definitions. Add AddedComplexity to AVX2 vextracti128 patterns to give them priority over the integer versions of vextractf128 patterns.
  Author: Craig Topper; Date: 2012-04-07; Files: 1; Lines: -42/+41
  llvm-svn: 154268
* Target/X86/MCTargetDesc/X86MCAsmInfo.cpp: Enable DwarfCFI (aka DW2) on Cygming.
  Author: NAKAMURA Takumi; Date: 2012-04-07; Files: 1; Lines: -0/+3
  Cygwin-1.7 supports dw2. Some recent mingw distros support one, too. I have confirmed test-suite/SingleSource/Benchmarks/Shootout-C++/except.cpp can pass on Cygwin.
  llvm-svn: 154247
* Fix narrowing conversion.
  Author: Benjamin Kramer; Date: 2012-04-06; Files: 1; Lines: -1/+1
  llvm-svn: 154171
* Allow 256-bit shuffles to be split if a 128-bit lane contains elements from a single source.
  Author: Craig Topper; Date: 2012-04-06; Files: 1; Lines: -72/+55
  This is a rewrite of the 256-bit shuffle splitting code based on similar code from legalize types. Fixes PR12413.
  llvm-svn: 154166
* Always compute all the bits in ComputeMaskedBits.
  Author: Rafael Espindola; Date: 2012-04-04; Files: 4; Lines: -14/+9
  This allows us to keep passing reduced masks to SimplifyDemandedBits, but know about all the bits if SimplifyDemandedBits fails. This allows instcombine to simplify cases like the one in the included testcase.
  llvm-svn: 154011
* Add support for AVX enhanced comparison predicates. Patch from Kay Tiong Khoo.
  Author: Craig Topper; Date: 2012-04-03; Files: 5; Lines: -41/+49
  llvm-svn: 153935
* Move getOpcodeName from the various target InstPrinters into the superclass MCInstPrinter.
  Author: Benjamin Kramer; Date: 2012-04-02; Files: 4; Lines: -9/+0
  All implementations used the same code.
  llvm-svn: 153866
* Remove getInstructionName from MCInstPrinter implementations in favor of using the instruction name table from MCInstrInfo. Reduces static data in the InstPrinter implementations.
  Author: Craig Topper; Date: 2012-04-02; Files: 4; Lines: -8/+4
  llvm-svn: 153863
* Make MCInstrInfo available to the MCInstPrinter. This will be used to remove getInstructionName and the static data it contains since the same tables are already in MCInstrInfo.
  Author: Craig Topper; Date: 2012-04-02; Files: 3; Lines: -6/+9
  llvm-svn: 153860
* This commit contains a few changes that had to go in together.
  Author: Nadav Rotem; Date: 2012-04-01; Files: 1; Lines: -8/+7
  1. Simplify xor/and/or (bitcast(A), bitcast(B)) -> bitcast(op (A,B)) (and also scalar_to_vector).
  2. Xor/and/or are indifferent to the swizzle operation (shuffle of one src). Simplify xor/and/or (shuff(A), shuff(B)) -> shuff(op (A, B))
  3. Optimize swizzles of shuffles: shuff(shuff(x, y), undef) -> shuff(x, y).
  4. Fix an X86ISelLowering optimization which was very bitcast-sensitive.

  Code which was previously compiled to this:

      movd (%rsi), %xmm0
      movdqa .LCPI0_0(%rip), %xmm2
      pshufb %xmm2, %xmm0
      movd (%rdi), %xmm1
      pshufb %xmm2, %xmm1
      pxor %xmm0, %xmm1
      pshufb .LCPI0_1(%rip), %xmm1
      movd %xmm1, (%rdi)
      ret

  Now compiles to this:

      movl (%rsi), %eax
      xorl %eax, (%rdi)
      ret

  llvm-svn: 153848
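Item 1 of the commit above relies on bitwise operations being indifferent to how the bits are grouped into lanes. A minimal C++ sketch of that property follows, assuming a 4-byte value viewed either as four 8-bit lanes or as one 32-bit integer. `Bytes4`, `as_u32`, and `xor_lanes` are hypothetical names for illustration and do not appear in LLVM.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical 4-lane byte vector used only for this illustration.
using Bytes4 = std::array<std::uint8_t, 4>;

// Reinterpret the four bytes as one 32-bit integer (a "bitcast").
inline std::uint32_t as_u32(const Bytes4 &b) {
  std::uint32_t v;
  std::memcpy(&v, b.data(), sizeof v);
  return v;
}

// Lane-wise xor on the byte view.
inline Bytes4 xor_lanes(const Bytes4 &a, const Bytes4 &b) {
  Bytes4 r{};
  for (std::size_t i = 0; i < r.size(); ++i)
    r[i] = static_cast<std::uint8_t>(a[i] ^ b[i]);
  return r;
}
```

For any inputs `a` and `b`, `as_u32(xor_lanes(a, b))` equals `as_u32(a) ^ as_u32(b)`, so the xor can be done on either side of the reinterpretation. The same holds for `and` and `or`.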
* Rip out emission of the regIsInRegClass function for the asm printer.
  Author: Benjamin Kramer; Date: 2012-03-30; Files: 1; Lines: -0/+1
  It's slow, bloated and completely redundant with MCRegisterClass::contains.
  llvm-svn: 153782
* Add a note about a missed cmov -> sbb opportunity.
  Author: Benjamin Kramer; Date: 2012-03-30; Files: 1; Lines: -0/+18
  llvm-svn: 153741
* Make x86 REP_MOV* and REP_STO instructions use the correct operand sizes in 64-bit mode.
  Author: Lang Hames; Date: 2012-03-29; Files: 1; Lines: -23/+56
  llvm-svn: 153680
* Replace assert(0) with llvm_unreachable to avoid warnings about dropping off the end of a non-void function in Release builds.
  Author: Benjamin Kramer; Date: 2012-03-29; Files: 1; Lines: -6/+5
  llvm-svn: 153643
* Only allow symbolic names for (v)cmpss/sd/ps/pd encodings 8-31 to be used with 'v' version of instructions.
  Author: Craig Topper; Date: 2012-03-29; Files: 1; Lines: -12/+13
  llvm-svn: 153636
* For X86, change load/dec-or-inc/store into dec-or-inc, respectively.
  Author: Joel Jones; Date: 2012-03-29; Files: 1; Lines: -34/+94
  This is a code change to add support for changing instruction sequences of the form:

      load
      inc/dec of 8/16/32/64 bits
      store

  into the appropriate X86 inc/dec through memory instruction:

      inc[qlwb] / dec[qlwb]

  The checks that were in X86DAGToDAGISel::Select(SDNode *Node)>>ISD::STORE have been extracted to isLoadIncOrDecStore and reworked to use the better named wrappers for getOperand(unsigned) (e.g. getOffset()) and replaced Chain.getNode() with LoadNode. The comments have also been expanded.
  llvm-svn: 153635
* Reverted to revision 153616 to unblock build.
  Author: Joel Jones; Date: 2012-03-29; Files: 1; Lines: -94/+34
  llvm-svn: 153623
* For X86, change load/dec-or-inc/store into dec-or-inc, respectively.
  Author: Joel Jones; Date: 2012-03-29; Files: 1; Lines: -34/+94
  This is a code change to add support for changing instruction sequences of the form:

      load
      inc/dec of 8/16/32/64 bits
      store

  into the appropriate X86 inc/dec through memory instruction:

      inc[qlwb] / dec[qlwb]

  The checks that were in X86DAGToDAGISel::Select(SDNode *Node)>>ISD::STORE have been extracted to isLoadIncOrDecStore and reworked to use the better named wrappers for getOperand(unsigned) (e.g. getOffset()) and replaced Chain.getNode() with LoadNode. The comments have also been expanded.
  llvm-svn: 153617
* Prune some includes.
  Author: Craig Topper; Date: 2012-03-27; Files: 6; Lines: -6/+0
  llvm-svn: 153502
* Remove unnecessary llvm:: qualifications.
  Author: Craig Topper; Date: 2012-03-27; Files: 3; Lines: -3/+3
  llvm-svn: 153500
* Put Is64BitMemOperand into !defined(NDEBUG) for now.
  Author: Joerg Sonnenberger; Date: 2012-03-21; Files: 1; Lines: -0/+2
  llvm-svn: 153185
* Use a signed value for this enum to avoid spurious warnings from gcc.
  Author: Benjamin Kramer; Date: 2012-03-21; Files: 2; Lines: -2/+2
  llvm-svn: 153184
* Fix generation of the address size override prefix. Add assertions for the invalid cases.
  Author: Joerg Sonnenberger; Date: 2012-03-21; Files: 1; Lines: -5/+51
  At least a 16-bit operand in 64-bit mode is currently not rejected in the parser.
  llvm-svn: 153166
* Spacing fixes and using 'unsigned' instead of 'int' to index to select shuffle elements for consistency with other shuffle code in X86 backend.
  Author: Craig Topper; Date: 2012-03-21; Files: 1; Lines: -28/+29
  llvm-svn: 153154
* [avx] Add patterns for combining vextractf128 + vmovaps/vmovups/vmovdqu to vextractf128 with 128-bit mem dest.
  Author: Chad Rosier; Date: 2012-03-20; Files: 1; Lines: -0/+17
  Combines

      vextractf128 $0, %ymm0, %xmm0
      vmovaps %xmm0, (%rdi)

  to

      vextractf128 $0, %ymm0, (%rdi)

  rdar://11082570
  llvm-svn: 153139
* [avx] Add the AddedComplexity to the VINSERTI128 avx2 patterns to give precedence over the VINSERTF128 avx1 patterns.
  Author: Chad Rosier; Date: 2012-03-20; Files: 1; Lines: -1/+1
  llvm-svn: 153114
* Whitespace.
  Author: Chad Rosier; Date: 2012-03-20; Files: 1; Lines: -1/+1
  llvm-svn: 153105
* [avx] Move the vextractf128 patterns closer to the vextractf128 def. Remove whitespace from test case. No functional change intended.
  Author: Chad Rosier; Date: 2012-03-20; Files: 1; Lines: -28/+26
  llvm-svn: 153103
* [avx] Adjust the VINSERTF128rm pattern to allow for unaligned loads.
  Author: Chad Rosier; Date: 2012-03-20; Files: 1; Lines: -3/+3
  This results in things such as

      vmovups 16(%rdi), %xmm0
      vinsertf128 $1, %xmm0, %ymm0, %ymm0

  to be combined to

      vinsertf128 $1, 16(%rdi), %ymm0, %ymm0

  rdar://11076953
  llvm-svn: 153092
* Remove code that prevented lowering shuffles if they are used by load and themselves used by an extract_vector_elt.
  Author: Craig Topper; Date: 2012-03-20; Files: 1; Lines: -92/+111
  This was done to allow the DAG combiner to collapse to a single element load. Unfortunately, sometimes the extract_vector_elt would disappear before DAG combine could do the transformation, leaving a vector_shuffle that isel couldn't handle. New code lets the shuffle be converted to a target specific node, but then adds a combine routine that can convert target specific nodes back to vector_shuffles if the folding criteria are met.
  llvm-svn: 153080
* Factor out target shuffle mask decoding from getShuffleScalarElt and use a SmallVector of int instead of unsigned for shuffle mask in decode functions. Preparation for another change.
  Author: Craig Topper; Date: 2012-03-20; Files: 4; Lines: -96/+95
  llvm-svn: 153079