path: root/llvm/lib/Target/X86
Commit message | Author | Age | Files | Lines
* [X86] Fix checked arithmetic for i8 on X86. | Andrea Di Biagio | 2014-06-02 | 1 | -2/+3
  When lowering an ISD::BRCOND into a test+branch, make sure that we always use the correct condition code to emit the test operation.
  This fixes PR19858: "i8 checked mul is wrong on x86".
  Patch by Keno Fisher!
  llvm-svn: 210032
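As a reference point for what a checked i8 multiply is expected to report, here is a minimal sketch of the semantics only. It is not the backend lowering, and the function name `checked_mul_i8` is a hypothetical chosen for illustration.

```python
def checked_mul_i8(a, b, signed=True):
    """Return (wrapped_result, overflow) for an 8-bit multiply."""
    full = a * b  # exact product, computed without truncation
    lo, hi = (-128, 127) if signed else (0, 255)
    overflow = not (lo <= full <= hi)
    wrapped = full & 0xFF  # keep the low 8 bits
    if signed and wrapped >= 0x80:
        wrapped -= 0x100  # reinterpret the low 8 bits as a signed value
    return wrapped, overflow
```

A branch on the overflow bit has to observe exactly this flag, whichever condition code the backend picks to test it.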
* Have the TLOF creation take a Triple rather than needing a subtarget. | Eric Christopher | 2014-05-31 | 1 | -11/+8
  llvm-svn: 209937
* [X86] Add two combine rules to simplify dag nodes introduced during type legalization when promoting nodes with illegal vector type. | Andrea Di Biagio | 2014-05-30 | 1 | -0/+53
  This patch teaches the backend how to simplify/canonicalize dag node sequences normally introduced by the backend when promoting certain dag nodes with illegal vector type.
  This patch adds two new combine rules:
  1) fold (shuffle (bitcast (BINOP A, B)), Undef, <Mask>) -> (shuffle (BINOP (bitcast A), (bitcast B)), Undef, <Mask>)
  2) fold (BINOP (shuffle (A, Undef, <Mask>)), (shuffle (B, Undef, <Mask>))) -> (shuffle (BINOP A, B), Undef, <Mask>).
  Both rules are only triggered on the type-legalized DAG. In particular, rule 1 is a target specific combine rule that attempts to sink a bitconvert into the operands of a binary operation. Rule 2 is a target independent rule that attempts to move a shuffle immediately after a binary operation.
  llvm-svn: 209930
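Rule 2 relies on a lane-wise binary operation commuting with a shuffle when both operands use the same mask. The sketch below models that on plain Python lists. It is an illustration under assumptions, not LLVM code: `UNDEF`, `shuffle`, `binop`, `before_combine` and `after_combine` are hypothetical names, and a negative mask index stands in for an undefined lane.

```python
import operator

UNDEF = None  # stands in for an undefined lane


def shuffle(vec, mask):
    """One-input shuffle: lane i takes vec[mask[i]], or UNDEF if mask[i] < 0."""
    return [UNDEF if m < 0 else vec[m] for m in mask]


def binop(op, a, b):
    """Lane-wise binary operation; an UNDEF input lane gives an UNDEF result."""
    return [UNDEF if x is UNDEF or y is UNDEF else op(x, y)
            for x, y in zip(a, b)]


def before_combine(op, a, b, mask):
    # BINOP (shuffle A, Undef, Mask), (shuffle B, Undef, Mask)
    return binop(op, shuffle(a, mask), shuffle(b, mask))


def after_combine(op, a, b, mask):
    # shuffle (BINOP A, B), Undef, Mask
    return shuffle(binop(op, a, b), mask)
```

Both forms produce the same lanes, but the second needs one shuffle instead of two.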
* Separate the check for blend shuffle_vector masks | Filipe Cabecinhas | 2014-05-30 | 1 | -25/+42
  Summary: Separate the check for blend shuffle_vector masks into isBlendMask. This function will also be used to check if a vector shuffle is legal. No change in functionality was intended, but we ended up improving codegen on two tests, which were being (more) optimized only if the resulting shuffle was legal.
  Reviewers: nadav, delena, andreadb
  Subscribers: llvm-commits
  Differential Revision: http://reviews.llvm.org/D3964
  llvm-svn: 209923
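The lane pattern that makes a two-input shuffle mask a blend can be sketched as follows. This models only the mask shape. The real isBlendMask presumably also checks the vector type and subtarget, which is not shown here, and `is_blend_mask` is a hypothetical Python name. A negative index is assumed to mean an undefined lane.

```python
def is_blend_mask(mask):
    """True if every lane i reads lane i of the first input (index i),
    lane i of the second input (index i + n), or is undefined (negative)."""
    n = len(mask)
    return all(m < 0 or m == i or m == i + n for i, m in enumerate(mask))
```

For a 4-lane vector, `[0, 5, 2, 7]` is a blend because no element changes lane, while `[1, 5, 2, 7]` is not because lane 0 reads lane 1.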
* [X86] Remove AVX1 vbroadcast intrinsics | Adam Nemet | 2014-05-29 | 1 | -15/+17
  The corresponding CFE patch replaces these intrinsics with vector initializers in avxintrin.h. This patch removes the LLVM intrinsics from the backend. We now stop lowering at X86ISD::VBROADCAST custom node rather than lowering that further to the intrinsics.
  The patch only changes VBROADCASTS* and leaves VBROADCAST[FI]128 to continue to use intrinsics. As explained in the CFE patch, the reason is that we currently don't generate as good code for them without the intrinsics.
  CodeGen/X86/avx-vbroadcast.ll already provides coverage for this change. It checks that for a series of insertelements we generate the appropriate vbroadcast instruction.
  Also verified that there was no assembly change in the test-suite before and after this patch.
  llvm-svn: 209864
* [pr19844] Add thread local mode to aliases. | Rafael Espindola | 2014-05-28 | 1 | -11/+2
  This matches gcc's behavior. It also seems natural given that aliases contain other properties that govern how they are accessed (linkage, visibility, dll storage).
  Clang still has to be updated to expose this feature to C.
  llvm-svn: 209759
* Emit data or code export directives based on the type. | Rafael Espindola | 2014-05-25 | 1 | -7/+3
  Currently we look at the Aliasee to decide what type of export directive to use. It seems better to use the type of the alias directly. This is similar to how we handle the alias having the same address but other attributes (linkage, visibility) from the aliasee.
  With this patch it is now possible to do things like

    target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
    target triple = "x86_64-pc-windows-msvc"
    @foo = global [6 x i8] c"\B8*\00\00\00\C3", section ".text", align 16
    @f = dllexport alias i32 (), [6 x i8]* @foo
    !llvm.module.flags = !{!0}
    !0 = metadata !{i32 6, metadata !"Linker Options", metadata !1}
    !1 = metadata !{metadata !2, metadata !3}
    !2 = metadata !{metadata !"/DEFAULTLIB:libcmt.lib"}
    !3 = metadata !{metadata !"/DEFAULTLIB:oldnames.lib"}

  llvm-svn: 209600
* Delete dead code. | Rafael Espindola | 2014-05-23 | 1 | -4/+0
  GV is never used past this point. This was probably a copy and paste error.
  llvm-svn: 209518
* [X86] Improve the lowering of BITCAST from MVT::f64 to MVT::v4i16/MVT::v8i8. | Andrea Di Biagio | 2014-05-22 | 1 | -18/+38
  This patch teaches the x86 backend how to efficiently lower ISD::BITCAST dag nodes from MVT::f64 to MVT::v4i16 (and vice versa), and from MVT::f64 to MVT::v8i8 (and vice versa).
  This patch extends the logic from revision 208107 to also handle MVT::v4i16 and MVT::v8i8.
  Also, this patch correctly propagates Undef values when performing the widening of a vector (example: when widening from v2i32 to v4i32, the upper 64 bits of the resulting vector are 'undef').
  llvm-svn: 209451
* Segmented stacks: omit __morestack call when there's no frame. | Tim Northover | 2014-05-22 | 1 | -5/+9
  Patch by Florian Zeitz
  llvm-svn: 209436
* Override runOnMachineFunction for X86ISelDAGToDAG so that we can | Eric Christopher | 2014-05-22 | 1 | -0/+7
  reset the subtarget on each function.
  llvm-svn: 209384
* Avoid using subtarget features when adding X86 specific passes to | Eric Christopher | 2014-05-22 | 5 | -14/+17
  the pass pipeline.
  llvm-svn: 209382
* Remove extra local variable. | Eric Christopher | 2014-05-22 | 1 | -2/+1
  llvm-svn: 209381
* Rename createGlobalBaseRegPass -> createX86GlobalBaseRegPass to make | Eric Christopher | 2014-05-22 | 3 | -4/+4
  it obvious that it's a target specific pass.
  llvm-svn: 209380
* Fix typo. | Eric Christopher | 2014-05-22 | 1 | -1/+1
  llvm-svn: 209377
* Fix compilation issues. | Eric Christopher | 2014-05-21 | 1 | -2/+3
  llvm-svn: 209342
* Make early if conversion dependent upon the subtarget and add | Eric Christopher | 2014-05-21 | 3 | -11/+16
  a subtarget hook to enable. Unconditionally add to the pass pipeline for targets that might want to use it. No functional change.
  llvm-svn: 209340
* [X86] Fix a bug in the lowering of BLENDI introduced in r209043. | Quentin Colombet | 2014-05-21 | 1 | -3/+7
  ISD::VSELECT mask uses 1 to identify the first argument and 0 to identify the second argument. On the other hand, BLENDI uses 0 to identify the first argument and 1 to identify the second argument. Fix the generation of the blend mask to account for this difference.
  The bug did not show up with r209043, because we were not checking for the actual arguments of the blend instruction!
  This commit also fixes the test cases.
  Note: The same mask works for the BLENDr variant because the arguments are swapped during instruction selection (see the BLENDXXrr patterns).
  <rdar://problem/16975435>
  llvm-svn: 209324
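The polarity difference described above can be made concrete with a small model. This is a sketch on Python lists, not the backend code. `vselect`, `blendi` and `vselect_mask_to_blend_imm` are hypothetical names, and lane i is assumed to correspond to bit i of the immediate.

```python
def vselect(mask, first, second):
    """ISD::VSELECT-style: a 1 in the mask picks the first argument."""
    return [f if m else s for m, f, s in zip(mask, first, second)]


def blendi(first, second, imm):
    """BLENDI-style: bit i of imm clear picks the first argument, set picks the second."""
    return [s if (imm >> i) & 1 else f
            for i, (f, s) in enumerate(zip(first, second))]


def vselect_mask_to_blend_imm(mask):
    """Invert the polarity: set bit i of the immediate where the VSELECT mask is 0."""
    imm = 0
    for i, m in enumerate(mask):
        if not m:
            imm |= 1 << i
    return imm
```

Copying the VSELECT mask bits into the immediate unchanged would swap the two operands in every lane, which is the kind of bug the commit describes.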
* [asan] Fix x86-32 asm instrumentation to preserve flags. | Evgeniy Stepanov | 2014-05-21 | 1 | -2/+1
  Patch by Yuri Gorshenin.
  llvm-svn: 209280
* Add parentheses to suppress the gcc warning '-Wparentheses'. | Simon Atanasyan | 2014-05-20 | 1 | -2/+2
  No functional changes.
  llvm-svn: 209203
* [X86] Tune LEA usage for Silvermont | Alexey Volkov | 2014-05-20 | 7 | -14/+102
  According to the Intel Software Optimization Manual, on Silvermont it is in some cases better to replace LEA with ADD instructions:
  "The rule of thumb for ADDs and LEAs is that it is justified to use LEA with a valid index and/or displacement for non-destructive destination purposes (especially useful for stack offset cases), or to use a SCALE. Otherwise, ADD(s) are preferable."
  Differential Revision: http://reviews.llvm.org/D3826
  llvm-svn: 209198
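Read literally, the quoted rule of thumb is a small decision procedure. The sketch below encodes only that literal reading. It is not the logic of the pass this commit touches, and `prefer_lea` and its parameters are hypothetical names introduced for illustration.

```python
def prefer_lea(has_index, has_disp, scale, dest_is_a_source):
    """Literal reading of the quoted rule of thumb.

    has_index / has_disp: the address uses an index register / a displacement.
    scale: the index scale factor (1 means no scaling).
    dest_is_a_source: the destination register is also an input, so a
    destructive ADD would not lose a value that is still needed.
    """
    if scale > 1:
        return True  # ADD cannot express a scaled index
    if (has_index or has_disp) and not dest_is_a_source:
        return True  # LEA writes a fresh register without clobbering an input
    return False  # otherwise ADD(s) are preferable
```

For example, `prefer_lea(True, False, 1, True)` is false: base plus index written back into the base is just an ADD.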
* [ConstantHoisting][X86] Change the cost model to never hoist constants for types larger than i128. | Juergen Ributzka | 2014-05-19 | 1 | -2/+13
  Currently the X86 backend doesn't support types larger than i128 very well. For example an i192 multiply will assert in codegen when the 2nd argument is a constant and the constant got hoisted.
  This fix changes the cost model to never hoist constants for types larger than i128. Once the codegen issues have been resolved, the cost model can be updated to also allow larger types.
  This is related to <rdar://problem/16954938>
  llvm-svn: 209162
* [X86] Add ISel patterns to improve the selection of TZCNT and LZCNT. | Andrea Di Biagio | 2014-05-19 | 1 | -0/+81
  Instructions TZCNT (requires BMI1) and LZCNT (requires LZCNT) always provide the operand size as output if the input operand is zero. We can take advantage of this knowledge during the instruction selection stage in order to simplify a few corner cases.
  llvm-svn: 209159
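The property being exploited can be checked with a small model. This is a sketch of the instruction semantics in Python, not the ISel patterns themselves. `tzcnt`, `lzcnt`, `cttz_zero_undef` and `guarded_cttz` are hypothetical helper names, and the "undefined at zero" count is modelled by raising an exception.

```python
def tzcnt(x, width=32):
    """Count trailing zeros; returns the operand size when the input is zero."""
    x &= (1 << width) - 1
    if x == 0:
        return width
    return (x & -x).bit_length() - 1


def lzcnt(x, width=32):
    """Count leading zeros; returns the operand size when the input is zero."""
    x &= (1 << width) - 1
    return width - x.bit_length()


def cttz_zero_undef(x):
    """A trailing-zero count that is only defined for non-zero inputs."""
    if x == 0:
        raise ValueError("result is undefined for a zero input")
    return (x & -x).bit_length() - 1


def guarded_cttz(x, width=32):
    """The guarded form: (x == 0) ? width : cttz_zero_undef(x)."""
    return width if x == 0 else cttz_zero_undef(x)
```

Because `tzcnt` already yields the operand size for a zero input, the compare-and-select in `guarded_cttz` is redundant and the whole expression can be selected as a single TZCNT.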
* Added more insertps optimizations | Filipe Cabecinhas | 2014-05-19 | 2 | -11/+72
  Summary: When inserting an element that's coming from a vector load or a broadcast of a vector (or scalar) load, combine the load into the insertps instruction.
  Added PerformINSERTPSCombine for the case where we need to fix the load (load of a vector + insertps with a non-zero CountS).
  Added patterns for the broadcasts.
  Also added tests for SSE4.1, AVX, and AVX2.
  Reviewers: delena, nadav, craig.topper
  Subscribers: llvm-commits
  Differential Revision: http://reviews.llvm.org/D3581
  llvm-svn: 209156
* SDAG: Legalize vector BSWAP into a shuffle if the shuffle is legal but the bswap not. | Benjamin Kramer | 2014-05-19 | 1 | -1/+17
  - On ARM/ARM64 we get a vrev because the shuffle matching code is really smart. We still unroll anything that's not v4i32 though.
  - On X86 we get a pshufb with SSSE3. Required more cleverness in isShuffleMaskLegal.
  - On PPC we get a vperm for v8i16 and v4i32. v2i64 is unrolled.
  llvm-svn: 209123
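A vector BSWAP is a byte permutation, which is why it can be expressed as a shuffle. The sketch below builds such a byte mask and checks it against a direct per-lane byte reversal. It is a model on Python `bytes`, not the legalizer code, and the function names are hypothetical.

```python
def bswap_shuffle_mask(num_elems, elem_bytes):
    """Byte-shuffle mask that reverses the bytes inside every element."""
    return [e * elem_bytes + (elem_bytes - 1 - b)
            for e in range(num_elems)
            for b in range(elem_bytes)]


def apply_byte_shuffle(data, mask):
    """Result byte i is input byte mask[i]."""
    return bytes(data[i] for i in mask)


def bswap_lanes(data, elem_bytes):
    """Reference: reverse the bytes of every lane directly."""
    return b"".join(data[i:i + elem_bytes][::-1]
                    for i in range(0, len(data), elem_bytes))
```

For v4i32 the mask is `[3, 2, 1, 0, 7, 6, 5, 4, ...]`, the kind of fixed byte permutation that a single byte-shuffle instruction can perform.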
* Target: remove old constructors for CallLoweringInfo | Saleem Abdulrasool | 2014-05-17 | 2 | -22/+20
  This is mostly a mechanical change changing all the call sites to the newer chained-function construction pattern. This removes the horrible 15-parameter constructor for the CallLoweringInfo in favour of setting properties of the call via chained functions. No functional change beyond the removal of the old constructors is intended.
  llvm-svn: 209082
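The chained-function construction pattern mentioned above is a general technique: each setter returns the object itself so calls can be strung together. The sketch below illustrates the pattern in Python only. The class `CallInfo` and its methods are hypothetical and do not reproduce the CallLoweringInfo interface.

```python
class CallInfo:
    """Toy call description built through chained setters."""

    def __init__(self):
        # Every property starts from a sensible default, so callers only
        # mention what differs.
        self.chain = None
        self.callee = None
        self.args = []
        self.is_tail_call = False

    def set_chain(self, chain):
        self.chain = chain
        return self  # returning self is what enables chaining

    def set_callee(self, callee, args):
        self.callee = callee
        self.args = list(args)
        return self

    def set_tail_call(self, value=True):
        self.is_tail_call = value
        return self


info = (CallInfo()
        .set_chain("entry")
        .set_callee("memcpy", ["dst", "src", "n"]))
```

Compared with a constructor taking many positional parameters, each call site names the properties it sets, and adding a property does not require touching existing call sites.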
* [x86] Fix a bad predicate I spotted by inspection -- pshufhw and pshuflw | Chandler Carruth | 2014-05-17 | 1 | -2/+2
  were added in SSE2, not SSSE3.
  Found this while auditing all uses of SSSE3 in the X86 target. I don't actually expect this to make a significant difference on anything and I don't have any detailed test cases but I updated the existing test cases that already covered some of this code path.
  llvm-svn: 209056
* Implemented special cases for PerformVSELECTCombine. | Filipe Cabecinhas | 2014-05-16 | 1 | -0/+62
  vselects with constant masks, after legalization, will get turned into specialized shuffle_vectors so they can be matched to blend+imm instructions.
  Fixed some tests.
  llvm-svn: 209044
* Lower vselects into X86ISD::BLENDI when appropriate. | Filipe Cabecinhas | 2014-05-16 | 1 | -1/+83
  LowerVSELECT will, if possible, generate a X86ISD::BLENDI DAG node if the condition is constant and we can emit that instruction, given the subtarget.
  This is not enough for all cases. An additional SELECTCombine optimization will be committed.
  Fixed tests that were expecting variable blends but where a blend+imm can be generated.
  Added test where we can't emit blend+immediate.
  Added avx2 blend+imm tests.
  llvm-svn: 209043
* Implemented LowerVSELECT to custom lower some instructions. | Filipe Cabecinhas | 2014-05-16 | 2 | -16/+46
  No functionality change intended. The types that previously were set to lower as Expand or Legal are doing the same thing with this lowering function.
  llvm-svn: 209042
* Delete getAliasedGlobal. | Rafael Espindola | 2014-05-16 | 3 | -3/+3
  llvm-svn: 209040
* X86: disable printing of bare "mov" aliases | Tim Northover | 2014-05-16 | 1 | -3/+3
  In AT&T syntax, we should probably print the full "movl" or "movw". TableGen used to ignore these aliases because it was miscounting the number of operands. This fixes the issue.
  This will be tested when the TableGen "should I print this Alias" heuristic is fixed (very soon).
  llvm-svn: 208963
* [X86] Teach the backend how to fold SSE4.1/AVX/AVX2 blend intrinsics. | Andrea Di Biagio | 2014-05-15 | 1 | -2/+54
  Added target specific combine rules to fold blend intrinsics according to the following rules:
  1) fold(blend A, A, Mask) -> A;
  2) fold(blend A, B, <allZeros>) -> A;
  3) fold(blend A, B, <allOnes>) -> B.
  Added two new tests to verify that the new folding rules work for all the optimized blend intrinsics.
  llvm-svn: 208895
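The three folding rules can be checked against a simple per-lane model of a blend. This is a sketch only: `blend` and `fold_blend` are hypothetical names, the mask is modelled as one boolean per lane, and list equality stands in for "both operands are the same node".

```python
def blend(a, b, mask):
    """Lane i comes from b if mask[i] is set, otherwise from a."""
    return [y if m else x for x, y, m in zip(a, b, mask)]


def fold_blend(a, b, mask):
    """Return the simplified operand if one of the three rules applies, else None."""
    if a == b:
        return a  # rule 1: blend A, A, Mask -> A
    if not any(mask):
        return a  # rule 2: blend A, B, <allZeros> -> A
    if all(mask):
        return b  # rule 3: blend A, B, <allOnes> -> B
    return None  # no rule applies; keep the blend
```

In each case where `fold_blend` returns an operand, that operand equals the full `blend` result, so the blend can be dropped.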
* TableGen: use correct MIOperand when printing aliases | Tim Northover | 2014-05-15 | 2 | -20/+20
  Previously, TableGen assumed that every aliased operand consumed precisely 1 MachineInstr slot (this was reasonable because until a couple of days ago, nothing more complicated was eligible for printing).
  This allows a couple more ARM64 aliases to print so we can remove the special code.
  On the X86 side, I've gone for explicit AT&T size specifiers as the default, so turned off a few of the aliases that would have just started printing.
  llvm-svn: 208880
* TableGen/ARM64: print aliases even if they have syntax variants. | Tim Northover | 2014-05-15 | 2 | -22/+34
  To get at least one use of the change (and some actual tests) in with its commit, I've enabled the AArch64 & ARM64 NEON mov aliases.
  llvm-svn: 208867
* Fix typos | Alp Toker | 2014-05-15 | 1 | -1/+1
  llvm-svn: 208839
* Rename ComputeMaskedBits to computeKnownBits. "Masked" has been | Jay Foad | 2014-05-14 | 4 | -16/+16
  inappropriate since it lost its Mask parameter in r154011.
  llvm-svn: 208811
* X86: If we have an instruction that sets a flag and a zero test on the input of that instruction try to eliminate the test. | Benjamin Kramer | 2014-05-14 | 1 | -3/+63
  For example

    tzcntl %edi, %ebx
    testl %edi, %edi
    je .label

  can be rewritten into

    tzcntl %edi, %ebx
    jb .label

  A minor complication is that tzcnt sets CF instead of ZF when the input is zero, so we have to rewrite users of the flags from ZF to CF. Currently we recognize patterns using lzcnt, tzcnt and popcnt.
  Differential Revision: http://reviews.llvm.org/D3454
  llvm-svn: 208788
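Why the branch condition has to change from `je` to `jb` can be seen in a small model of the flags. This is a sketch in Python, not the peephole itself. The function names are hypothetical, and the model assumes TZCNT sets CF when the source is zero and ZF when the result is zero.

```python
def tzcnt_with_flags(src, width=32):
    """Return (result, CF, ZF) as modelled for TZCNT."""
    src &= (1 << width) - 1
    result = width if src == 0 else (src & -src).bit_length() - 1
    cf = src == 0      # CF reports a zero input
    zf = result == 0   # ZF reports a zero result, i.e. bit 0 of the input is set
    return result, cf, zf


def branch_taken_with_test(src, width=32):
    """tzcnt; test src, src; je: taken iff the TEST sets ZF, i.e. src == 0."""
    return (src & ((1 << width) - 1)) == 0


def branch_taken_without_test(src, width=32):
    """tzcnt; jb: taken iff TZCNT set CF."""
    _, cf, _ = tzcnt_with_flags(src, width)
    return cf
```

For `src = 1` the TZCNT result is 0, so ZF is set even though the input is non-zero. Reusing `je` after dropping the `test` would therefore branch on the wrong condition.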
* Try to fix an SDAG dependence issue with sret | Reid Kleckner | 2014-05-12 | 1 | -15/+18
  r208453 added support for having sret on the second parameter. In that change, the code for copying sret into a virtual register was hoisted into the loop that lowers formal parameters. This caused a "Wrong topological sorting" assertion failure during scheduling when a parameter is passed in memory. This change undoes that by creating a second loop that deals with sret.
  I'm worried that this fix is incomplete. I don't fully understand the dependence issues. However, with this change we produce the same DAGs we used to produce, so if they are broken, they are just as broken as they have always been.
  llvm-svn: 208637
* TableGen: use PrintMethods to print more aliases | Tim Northover | 2014-05-12 | 1 | -0/+2
  llvm-svn: 208607
* Silencing an MSVC warning about not all control paths returning a value (even though the switch is fully covered). | Aaron Ballman | 2014-05-12 | 1 | -0/+1
  No functional change.
  llvm-svn: 208565
* Remove an always true argument. | Rafael Espindola | 2014-05-12 | 1 | -1/+1
  llvm-svn: 208557
* X86: Make sure that we have SSE4.1 before we generate insertps nodes. | Benjamin Kramer | 2014-05-12 | 1 | -1/+1
  PR19721.
  llvm-svn: 208552
* X86ISelLowering.cpp:LowerINTRINSIC_W_CHAIN(): Prune impossible "default:" [-Wcovered-switch-default] | NAKAMURA Takumi | 2014-05-12 | 1 | -3/+0
  llvm-svn: 208533
* Fixed compilation issue | Elena Demikhovsky | 2014-05-12 | 1 | -0/+1
  llvm-svn: 208524
* AVX-512: changes in intrinsics | Elena Demikhovsky | 2014-05-12 | 4 | -192/+231
  1) Changed gather and scatter intrinsics. Now they are aligned with GCC built-ins. There is no more non-masked form. Masked intrinsic receives -1 if all lanes are executed.
  2) I changed the function that works with intrinsics inside X86ISelLowering.cpp. I put all intrinsics in one table. I did it for INTRINSICS_W_CHAIN and plan to put all intrinsics from WO_CHAIN set to the same table in order to avoid the long-long "switch". (I wanted to use static map initialization that is allowed by C++11 but I wasn't able to compile it on VS2012).
  3) I added gather/scatter prefetch intrinsics.
  4) I fixed MRMm encoding for masked instructions.
  llvm-svn: 208522
* Pass the value type to TLI::getRegisterByName | Hal Finkel | 2014-05-11 | 2 | -2/+3
  We must validate the value type in TLI::getRegisterByName, because if we don't and the wrong type was used with the IR intrinsic, then we'll assert (because we won't be able to find a valid register class with which to construct the requested copy operation). For PPC64, additionally, the type information is necessary to decide between the 64-bit register and the 32-bit subregister.
  No functionality change.
  llvm-svn: 208508
* Add 'override' to getRegisterByName in *ISelLowering.h | Hal Finkel | 2014-05-11 | 1 | -1/+1
  No functionality change intended.
  llvm-svn: 208507
* Fixed a bug when lowering build_vector (PR19694) | Filipe Cabecinhas | 2014-05-11 | 1 | -3/+8
  When lowering build_vector to an insertps, we would still lower it, even if the source vectors weren't v4x32. This would break on avx if the source was a v8x32. We now check the type of the source vectors.
  llvm-svn: 208487
* Revert "[ms-cxxabi] Add a new calling convention that swaps 'this' and 'sret'" | Reid Kleckner | 2014-05-09 | 2 | -37/+0
  This reverts commit r200561.
  This calling convention was an attempt to match the MSVC C++ ABI for methods that return structures by value. This solution didn't scale, because it would have required splitting every CC available on Windows into two: one for methods and one for free functions.
  Now that we can put sret on the second arg (r208453), and Clang does that (r208458), revert this hack.
  llvm-svn: 208459