summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Target/X86
Commit message (Collapse)AuthorAgeFilesLines
* Move all of the x86 subtarget initialized variables down into the x86 subtargetEric Christopher2014-06-097-67/+95
| | | | | | from the x86 target machine. Should be no functional change. llvm-svn: 210479
* [X86] Add target combine rules for horizontal add/sub.Andrea Di Biagio2014-06-091-0/+85
| | | | | | | | | | | | | | | | | | | | This patch adds new target specific combine rules to identify horizontal add/sub idioms from BUILD_VECTOR dag nodes. This patch also teaches the DAGCombiner how to canonicalize sequences of insert_vector_elt dag nodes according to the following rule: (insert_vector_elt (insert_vector_elt A, I0), I1) -> (insert_vecto_elt (insert_vector_elt A, I1), I0) This new canonicalization rule only triggers if the inner insert_vector dag node has exactly one use; also, both indices must be known constants, and I1 < I0. This last rule made it possible to write a simpler algorithm to identify horizontal add/sub patterns because now we don't have to worry about the ordering of insert_vector_elt dag nodes. llvm-svn: 210477
* [X86] Avoid emitting unnecessary test instructions.Andrea Di Biagio2014-06-091-2/+19
| | | | | | | | | | | | | This patch teaches the backend how to check for the 'NoSignedWrap' flag on binary operations to improve the emission of 'test' instructions. If the result of a binary operation is known not to overflow we know that resetting the Overflow flag is unnecessary and so we can avoid emitting the test instruction. Patch by Marcello Maggioni. llvm-svn: 210468
* [X86] Use ADD/SUB instead of INC/DEC for SilvermontAlexey Volkov2014-06-096-15/+37
| | | | | | | | | | | | According to Intel Software Optimization Manual on Silvermont INC or DEC instructions require an additional uop to merge the flags. As a result, a branch instruction depending on an INC or a DEC instruction incurs a 1 cycle penalty. Differential Revision: http://reviews.llvm.org/D3990 llvm-svn: 210466
* [C++11] Use 'nullptr'.Craig Topper2014-06-082-2/+2
| | | | llvm-svn: 210442
* X86: simplify data layout calculationSaleem Abdulrasool2014-06-081-3/+2
| | | | | | | | X86Subtarget::isTargetCygMing || X86Subtarget::isTargetKnownWindowsMSVC is equivalent to all Windows environments. Simplify the check to isOSWindows. NFC. llvm-svn: 210431
* AsmMatchers: Use unique_ptr to manage ownership of MCParsedAsmOperandDavid Blaikie2014-06-084-183/+174
| | | | | | | | | | | | I saw at least a memory leak or two from inspection (on probably untested error paths) and r206991, which was the original inspiration for this change. I ran this idea by Jim Grosbach a few weeks ago & he was OK with it. Since it's a basically mechanical patch that seemed sufficient - usual post-commit review, revert, etc, as needed. llvm-svn: 210427
* Replace the use of TargetMachine with a tiny bool variable.Eric Christopher2014-06-063-8/+6
| | | | llvm-svn: 210386
* Remove all local variables from X86SelectionDAGInfo, the DAG hasEric Christopher2014-06-063-35/+29
| | | | | | all of the ones we were stashing away on startup. llvm-svn: 210385
* X86: Don't turn shifts into ands if there's another use that may not check ↵Benjamin Kramer2014-06-061-1/+1
| | | | | | | | for equality. Fixes PR19964. llvm-svn: 210371
* Have TargetSelectionDAGInfo take a DataLayout initializer rather thanEric Christopher2014-06-061-1/+1
| | | | | | a TargetMachine since the only thing it wants is DataLayout. llvm-svn: 210366
* Fixed a bug in lowering shuffle_vectors to insertpsFilipe Cabecinhas2014-06-061-9/+20
| | | | | | | | | | | | | | Summary: We were being too strict and not accounting for undefs. Added a test case and fixed another one where we improved codegen. Reviewers: grosbach, nadav, delena Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D4039 llvm-svn: 210361
* Remove X86Subtarget from the X86FrameLowering constructor sinceEric Christopher2014-06-052-15/+11
| | | | | | | we can just pass in the values we already know and we're not caching the subtarget anymore. llvm-svn: 210292
* Remove caching of the subtarget for X86FrameLowering.Eric Christopher2014-06-052-6/+9
| | | | llvm-svn: 210290
* Remove duplicate copy of InstrItineraryData from the TargetMachine,Eric Christopher2014-06-052-3/+1
| | | | | | it's already on the subtarget. llvm-svn: 210289
* Add a new attribute called 'jumptable' that creates jump-instruction tables ↵Tom Roeder2014-06-052-0/+17
| | | | | | | | | | | | for functions marked with this attribute. It includes a pass that rewrites all indirect calls to jumptable functions to pass through these tables. This also adds backend support for generating the jump-instruction tables on ARM and X86. Note that since the jumptable attribute creates a second function pointer for a function, any function marked with jumptable must also be marked with unnamed_addr. llvm-svn: 210280
* We've got a getSlotSize call already that we use everywhere else,Eric Christopher2014-06-051-2/+3
| | | | | | use it here too. llvm-svn: 210227
* 80-columns.Eric Christopher2014-06-051-1/+2
| | | | llvm-svn: 210224
* Remove uses of the TargetMachine from X86FrameLowering.Eric Christopher2014-06-053-19/+25
| | | | llvm-svn: 210223
* Two small enhancements for the JIT.Yaron Keren2014-06-041-1/+6
| | | | | | | | | | When JITting a large project such as Boost it's quite hard to figure out the problematic inline asm without debug location. This patch provides debug location printout before the JIT aborts due to inline asm. printDebugLoc() was exposed from MachineInstr.cpp and reused here. If the JIT run with debug info, don't bomb on DBG_VALUE but ignore them. http://reviews.llvm.org/D3416 llvm-svn: 210201
* Fix a use of uninitialized value. OldCC is set when IsCmpZero || IsSwapped ↵Nick Lewycky2014-06-041-1/+1
| | | | | | and read when ShouldUpdateCC || IsSwapped, and ShouldUpdateCC is independent. Fixes PR19932, but no test since I wasn't able to get any symptoms to appear, not even with valgrind and the testcase from the PR. It's clear what happened from inspection of the code. llvm-svn: 210168
* Revert r209381 as it isn't a local variable. Add a testcase so thatEric Christopher2014-06-031-0/+1
| | | | | | we know next time this happens. llvm-svn: 210127
* Fixup formatting in the pass.Eric Christopher2014-06-031-86/+86
| | | | llvm-svn: 210126
* [X86] Fix checked arithmetic for i8 on X86.Andrea Di Biagio2014-06-021-2/+3
| | | | | | | | | | | When lowering a ISD::BRCOND into a test+branch, make sure that we always use the correct condition code to emit the test operation. This fixes PR19858: "i8 checked mul is wrong on x86". Patch by Keno Fisher! llvm-svn: 210032
* Have the TLOF creation take a Triple rather than needing a subtarget.Eric Christopher2014-05-311-11/+8
| | | | llvm-svn: 209937
* [X86] Add two combine rules to simplify dag nodes introduced during type ↵Andrea Di Biagio2014-05-301-0/+53
| | | | | | | | | | | | | | | | | | | | | | | legalization when promoting nodes with illegal vector type. This patch teaches the backend how to simplify/canonicalize dag node sequences normally introduced by the backend when promoting certain dag nodes with illegal vector type. This patch adds two new combine rules: 1) fold (shuffle (bitcast (BINOP A, B)), Undef, <Mask>) -> (shuffle (BINOP (bitcast A), (bitcast B)), Undef, <Mask>) 2) fold (BINOP (shuffle (A, Undef, <Mask>)), (shuffle (B, Undef, <Mask>))) -> (shuffle (BINOP A, B), Undef, <Mask>). Both rules are only triggered on the type-legalized DAG. In particular, rule 1. is a target specific combine rule that attempts to sink a bitconvert into the operands of a binary operation. Rule 2. is a target independet rule that attempts to move a shuffle immediately after a binary operation. llvm-svn: 209930
* Separate the check for blend shuffle_vector masksFilipe Cabecinhas2014-05-301-25/+42
| | | | | | | | | | | | | | | | | Summary: Separate the check for blend shuffle_vector masks into isBlendMask. This function will also be used to check if a vector shuffle is legal. No change in functionality was intended, but we ended up improving codegen on two tests, which were being (more) optimized only if the resulting shuffle was legal. Reviewers: nadav, delena, andreadb Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D3964 llvm-svn: 209923
* [X86] Remove AVX1 vbroadcast intrinsicsAdam Nemet2014-05-291-15/+17
| | | | | | | | | | | | | | | | | | | | | The corresponding CFE patch replaces these intrinsics with vector initializers in avxintrin.h. This patch removes the LLVM intrinsics from the backend. We now stop lowering at X86ISD::VBROADCAST custom node rather than lowering that further to the intrinsics. The patch only changes VBROADCASTS* and leaves VBROADCAST[FI]128 to continue to use intrinsics. As explained in the CFE patch, the reason is that we currently don't generate as good code for them without the intrinsics. CodeGen/X86/avx-vbroadcast.ll already provides coverage for this change. It checks that for a series of insertelements we generate the appropriate vbroadcast instruction. Also verified that there was no assembly change in the test-suite before and after this patch. llvm-svn: 209864
* [pr19844] Add thread local mode to aliases.Rafael Espindola2014-05-281-11/+2
| | | | | | | | | | This matches gcc's behavior. It also seems natural given that aliases contain other properties that govern how it is accessed (linkage, visibility, dll storage). Clang still has to be updated to expose this feature to C. llvm-svn: 209759
* Emit data or code export directives based on the type.Rafael Espindola2014-05-251-7/+3
| | | | | | | | | | | | | | | | | | | | | | Currently we look at the Aliasee to decide what type of export directive to use. It seems better to use the type of the alias directly. This is similar to how we handle the alias having the same address but other attributes (linkage, visibility) from the aliasee. With this patch it is now possible to do things like target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128" target triple = "x86_64-pc-windows-msvc" @foo = global [6 x i8] c"\B8*\00\00\00\C3", section ".text", align 16 @f = dllexport alias i32 (), [6 x i8]* @foo !llvm.module.flags = !{!0} !0 = metadata !{i32 6, metadata !"Linker Options", metadata !1} !1 = metadata !{metadata !2, metadata !3} !2 = metadata !{metadata !"/DEFAULTLIB:libcmt.lib"} !3 = metadata !{metadata !"/DEFAULTLIB:oldnames.lib"} llvm-svn: 209600
* Delete dead code.Rafael Espindola2014-05-231-4/+0
| | | | | | GV is never used past this point. This was probably a copy and paste error. llvm-svn: 209518
* [X86] Improve the lowering of BITCAST from MVT::f64 to MVT::v4i16/MVT::v8i8.Andrea Di Biagio2014-05-221-18/+38
| | | | | | | | | | | | | This patch teaches the x86 backend how to efficiently lower ISD::BITCAST dag nodes from MVT::f64 to MVT::v4i16 (and vice versa), and from MVT::f64 to MVT::v8i8 (and vice versa). This patch extends the logic from revision 208107 to also handle MVT::v4i16 and MVT::v8i8. Also, this patch correctly propagates Undef values when performing the widening of a vector (example: when widening from v2i32 to v4i32, the upper 64bits of the resulting vector are 'undef'). llvm-svn: 209451
* Segmented stacks: omit __morestack call when there's no frame.Tim Northover2014-05-221-5/+9
| | | | | | Patch by Florian Zeitz llvm-svn: 209436
* Override runOnMachineFunction for X86ISelDAGToDAG so that we canEric Christopher2014-05-221-0/+7
| | | | | | reset the subtarget on each function. llvm-svn: 209384
* Avoid using subtarget features when adding X86 specific passes toEric Christopher2014-05-225-14/+17
| | | | | | the pass pipeline. llvm-svn: 209382
* Remove extra local variable.Eric Christopher2014-05-221-2/+1
| | | | llvm-svn: 209381
* Rename createGlobalBaseRegPass -> createX86GlobalBaseRegPass to makeEric Christopher2014-05-223-4/+4
| | | | | | it obvious that it's a target specific pass. llvm-svn: 209380
* Fix typo.Eric Christopher2014-05-221-1/+1
| | | | llvm-svn: 209377
* Fix compilation issues.Eric Christopher2014-05-211-2/+3
| | | | llvm-svn: 209342
* Make early if conversion dependent upon the subtarget and addEric Christopher2014-05-213-11/+16
| | | | | | | a subtarget hook to enable. Unconditionally add to the pass pipeline for targets that might want to use it. No functional change. llvm-svn: 209340
* [X86] Fix a bug in the lowering of BLENDI introduced in r209043.Quentin Colombet2014-05-211-3/+7
| | | | | | | | | | | | | | | | | | | ISD::VSELECT mask uses 1 to identify the first argument and 0 to identify the second argument. On the other hand, BLENDI uses 0 to identify the first argument and 1 to identify the second argument. Fix the generation of the blend mask to account for this difference. The bug did not show up with r209043, because we were not checking for the actual arguments of the blend instruction! This commit also fixes the test cases. Note: The same mask works for the BLENDr variant because the arguments are swapped during instruction selection (see the BLENDXXrr patterns). <rdar://problem/16975435> llvm-svn: 209324
* [asan] Fix x86-32 asm instrumentation to preserve flags.Evgeniy Stepanov2014-05-211-2/+1
| | | | | | Patch by Yuri Gorshenin. llvm-svn: 209280
* Add parentheses to suppress the gcc warning '-Wparentheses'.Simon Atanasyan2014-05-201-2/+2
| | | | | | No functional changes. llvm-svn: 209203
* [X86] Tune LEA usage for SilvermontAlexey Volkov2014-05-207-14/+102
| | | | | | | | | | | | | According to Intel Software Optimization Manual on Silvermont in some cases LEA is better to be replaced with ADD instructions: "The rule of thumb for ADDs and LEAs is that it is justified to use LEA with a valid index and/or displacement for non-destructive destination purposes (especially useful for stack offset cases), or to use a SCALE. Otherwise, ADD(s) are preferable." Differential Revision: http://reviews.llvm.org/D3826 llvm-svn: 209198
* [ConstantHoisting][X86] Change the cost model to never hoist constants for ↵Juergen Ributzka2014-05-191-2/+13
| | | | | | | | | | | | | | | types larger than i128. Currently the X86 backend doesn't support types larger than i128 very well. For example an i192 multiply will assert in codegen when the 2nd argument is a constant and the constant got hoisted. This fix changes the cost model to never hoist constants for types larger than i128. Once the codegen issues have been resolved, the cost model can be updated to allow also larger types. This is related to <rdar://problem/16954938> llvm-svn: 209162
* [X86] Add ISel patterns to improve the selection of TZCNT and LZCNT.Andrea Di Biagio2014-05-191-0/+81
| | | | | | | | | | Instructions TZCNT (requires BMI1) and LZCNT (requires LZCNT), always provide the operand size as output if the input operand is zero. We can take advantage of this knowledge during instruction selection stage in order to simplify a few corner case. llvm-svn: 209159
* Added more insertps optimizationsFilipe Cabecinhas2014-05-192-11/+72
| | | | | | | | | | | | | | | | | | | | Summary: When inserting an element that's coming from a vector load or a broadcast of a vector (or scalar) load, combine the load into the insertps instruction. Added PerformINSERTPSCombine for the case where we need to fix the load (load of a vector + insertps with a non-zero CountS). Added patterns for the broadcasts. Also added tests for SSE4.1, AVX, and AVX2. Reviewers: delena, nadav, craig.topper Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D3581 llvm-svn: 209156
* SDAG: Legalize vector BSWAP into a shuffle if the shuffle is legal but the ↵Benjamin Kramer2014-05-191-1/+17
| | | | | | | | | | bswap not. - On ARM/ARM64 we get a vrev because the shuffle matching code is really smart. We still unroll anything that's not v4i32 though. - On X86 we get a pshufb with SSSE3. Required more cleverness in isShuffleMaskLegal. - On PPC we get a vperm for v8i16 and v4i32. v2i64 is unrolled. llvm-svn: 209123
* Target: remove old constructors for CallLoweringInfoSaleem Abdulrasool2014-05-172-22/+20
| | | | | | | | | | This is mostly a mechanical change changing all the call sites to the newer chained-function construction pattern. This removes the horrible 15-parameter constructor for the CallLoweringInfo in favour of setting properties of the call via chained functions. No functional change beyond the removal of the old constructors are intended. llvm-svn: 209082
* [x86] Fix a bad predicate I spotted by inspection -- pshufhw and pshuflwChandler Carruth2014-05-171-2/+2
| | | | | | | | | | were added in SSE2, no SSSE3. Found this while auditing all uses of SSSE3 in the X86 target. I don't actually expect this to make a significant difference on anything and I don't have any detailed test cases but I updated the existing test cases that already covered some of this code path. llvm-svn: 209056
OpenPOWER on IntegriCloud