path: root/llvm/lib/Target/X86
Commit message | Author | Age | Files | Lines
* more space; NFC | Sanjay Patel | 2015-03-30 | 1 | -1/+1
    llvm-svn: 233554
* AVX-512: blank lines, duplicated tests, no functional changes | Elena Demikhovsky | 2015-03-30 | 1 | -21/+27
    See comments: http://reviews.llvm.org/D6835
    llvm-svn: 233528
* AVX-512: added intrinsics for VPAND, VPOR and VPXOR | Elena Demikhovsky | 2015-03-30 | 1 | -0/+6
    By Asaf Badouh (asaf.badouh@intel.com)
    llvm-svn: 233525
* [X86] Remove FeatureAES for 'corei7' CPU. 'corei7' should match 'nehalem' which doesn't have AES. | Craig Topper | 2015-03-30 | 1 | -11/+9
    Having AES and not PCLMUL makes 'corei7' halfway between Nehalem and Westmere.
    llvm-svn: 233517
* AVX-512: Fixed the "commutative" property flag in VPANDN instruction | Elena Demikhovsky | 2015-03-29 | 1 | -1/+1
    By Asaf Badouh (asaf.badouh@intel.com)
    llvm-svn: 233489
* [X86] Read the feature bits from the subtarget that is passed to printInst instead of from MCInstPrinter::AvailableFeatures. | Akira Hatanaka | 2015-03-28 | 2 | -5/+2
    llvm-svn: 233485
* Partially revert the changes I made in r233473 to keep the code concise. | Akira Hatanaka | 2015-03-28 | 1 | -137/+47
    llvm-svn: 233474
* clang-format X86ATTInstPrinter.{h,cpp} before I make changes to these files. | Akira Hatanaka | 2015-03-28 | 2 | -76/+156
    llvm-svn: 233473
* [MCInstPrinter] Enable MCInstPrinter to change its behavior based on the per-function subtarget. | Akira Hatanaka | 2015-03-27 | 4 | -4/+7
    Currently, code-gen passes the default or generic subtarget to the
    constructors of MCInstPrinter subclasses (see
    LLVMTargetMachine::addPassesToEmitFile), which enables some targets
    (AArch64, ARM, and X86) to change their instprinter's behavior based on
    the subtarget feature bits. Since the backend can now use different
    subtargets for each function, instprinter has to be changed to use the
    per-function subtarget rather than the default subtarget.

    This patch takes the first step towards enabling instprinter to change
    its behavior based on the per-function subtarget. It adds a bit
    "PassSubtarget" to AsmWriter which tells table-gen to pass a reference to
    MCSubtargetInfo to the various print methods table-gen auto-generates.

    I will follow up with changes to instprinters of AArch64, ARM, and X86.
    llvm-svn: 233411
* Remove superfluous .str() and replace std::string concatenation with Twine. | Yaron Keren | 2015-03-27 | 1 | -1/+1
    llvm-svn: 233392
* comment cleanup; NFC | Sanjay Patel | 2015-03-26 | 1 | -5/+5
    llvm-svn: 233293
* Remove outdated README-SSE.txt entries. | Benjamin Kramer | 2015-03-26 | 1 | -78/+0
    llvm-svn: 233292
* Use SDValue bool checks; NFC intended | Sanjay Patel | 2015-03-26 | 1 | -20/+13
    llvm-svn: 233289
* [X86][FastIsel] Teach how to select vector load instructions. | Andrea Di Biagio | 2015-03-26 | 1 | -3/+34
    This patch teaches fast-isel how to select 128-bit vector load
    instructions.
    Added test CodeGen/X86/fast-isel-vecload.ll
    Differential Revision: http://reviews.llvm.org/D8605
    llvm-svn: 233270
* [X86, AVX] improve insertion into zero element of 256-bit vector | Sanjay Patel | 2015-03-25 | 1 | -0/+14
    This patch allows AVX blend instructions to handle insertion into the low
    element of a 256-bit vector for the appropriate data types.

    For f32, instead of:
        vblendps $1, %xmm1, %xmm0, %xmm1 ## xmm1 = xmm1[0],xmm0[1,2,3]
        vblendps $15, %ymm1, %ymm0, %ymm0 ## ymm0 = ymm1[0,1,2,3],ymm0[4,5,6,7]
    we get:
        vblendps $1, %ymm1, %ymm0, %ymm0 ## ymm0 = ymm1[0],ymm0[1,2,3,4,5,6,7]

    For f64, instead of:
        vmovsd %xmm1, %xmm0, %xmm1 ## xmm1 = xmm1[0],xmm0[1]
        vblendpd $3, %ymm1, %ymm0, %ymm0 ## ymm0 = ymm1[0,1],ymm0[2,3]
    we get:
        vblendpd $1, %ymm1, %ymm0, %ymm0 ## ymm0 = ymm1[0],ymm0[1,2,3]

    For the hardware-neglected integer data types, I left a TODO comment in
    the code and added regression tests for a follow-on patch.
    Differential Revision: http://reviews.llvm.org/D8609
    llvm-svn: 233199
* [X86] Remove GetCpuIDAndInfo, GetCpuIDAndInfoEx and DetectFamilyModel functions from X86 MC layer. | Craig Topper | 2015-03-25 | 2 | -149/+0
    They haven't been used since CPU autodetection was removed from
    X86Subtarget.cpp.
    llvm-svn: 233170
* X86: Fix frameescape when not using an FP | Reid Kleckner | 2015-03-24 | 1 | -5/+5
    We can't use TargetFrameLowering::getFrameIndexOffset directly, because
    Win64 really wants the offset from the stack pointer at the end of the
    prologue. Instead, use X86FrameLowering::getFrameIndexOffsetFromSP(),
    which is a pretty close approximation of that. It fails to handle cases
    with interestingly large stack alignments, which are pretty uncommon on
    Win64; handling them is left as a TODO.
    llvm-svn: 233137
* [X86, AVX] recognize shufflevector with zero input as a vperm2 (PR22984) | Sanjay Patel | 2015-03-24 | 1 | -20/+56
    vperm2x128 instructions have the special ability (aka free hardware
    capability) to shuffle zero values into a vector. This patch recognizes
    that type of shuffle and generates the appropriate control byte.
    https://llvm.org/bugs/show_bug.cgi?id=22984
    Differential Revision: http://reviews.llvm.org/D8563
    llvm-svn: 233100
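The control byte mentioned in the entry above can be illustrated with a small standalone sketch. It follows the documented immediate layout of VPERM2F128/VPERM2I128 (bits 1:0 and 5:4 select a 128-bit source lane for the low and high halves of the result, bits 3 and 7 zero the corresponding half). The names `HalfSelect` and `makePerm2x128Imm` are hypothetical and are not taken from the patch; this is not the code the commit added.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical description of one 128-bit half of the 256-bit result.
// Lane: 0 = src1 low, 1 = src1 high, 2 = src2 low, 3 = src2 high.
// Zero: when true, the half is filled with zeros and Lane is ignored.
struct HalfSelect {
  bool Zero;
  unsigned Lane;
};

// Encode one half as a 4-bit field: bit 3 is the zeroing bit,
// bits 1:0 are the lane selector.
static uint8_t encodeHalf(HalfSelect H) {
  assert(H.Lane < 4 && "lane selector is two bits wide");
  return H.Zero ? 0x8 : static_cast<uint8_t>(H.Lane);
}

// Build the imm8 for a vperm2x128-style shuffle: the low half occupies
// bits 3:0 of the immediate and the high half occupies bits 7:4.
uint8_t makePerm2x128Imm(HalfSelect Lo, HalfSelect Hi) {
  return static_cast<uint8_t>(encodeHalf(Lo) | (encodeHalf(Hi) << 4));
}
```

For instance, keeping the low lane of the first source and zeroing the upper half corresponds to `makePerm2x128Imm({false, 0}, {true, 0})`, which yields `0x80`.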
* Revert "Use std::bitset for SubtargetFeatures" | Michael Kuperstein | 2015-03-24 | 5 | -31/+31
    This reverts commit r233055.
    It still causes buildbot failures (gcc running out of memory on several
    platforms, and a self-host failure on arm), although fewer than the
    previous time.
    llvm-svn: 233068
* Use std::bitset for SubtargetFeatures | Michael Kuperstein | 2015-03-24 | 5 | -31/+31
    Previously, subtarget features were a bitfield with the underlying type
    being uint64_t. Since several targets (X86 and ARM, in particular) have
    hit or were very close to hitting this bound, this switches the features
    to use a bitset. No functional change.

    The first time this was committed (r229831), it caused several buildbot
    failures. At least some of the ARM ones were due to gcc/binutils issues,
    and should now be fixed.
    Differential Revision: http://reviews.llvm.org/D8542
    llvm-svn: 233055
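The motivation in the entry above (a uint64_t caps the feature count at 64) can be shown with a minimal sketch. The alias `FeatureBits`, the bound `MaxFeatures` and the two helpers are hypothetical names chosen for illustration and do not reflect the actual LLVM types introduced by the patch.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>

// Hypothetical upper bound on the number of features; any value above 64
// is already out of reach for a single uint64_t mask.
constexpr std::size_t MaxFeatures = 128;
using FeatureBits = std::bitset<MaxFeatures>;

// Return a copy of Bits with the feature at Index enabled.
FeatureBits withFeature(FeatureBits Bits, std::size_t Index) {
  assert(Index < MaxFeatures && "feature index out of range");
  Bits.set(Index);
  return Bits;
}

// Query whether the feature at Index is enabled.
bool hasFeature(const FeatureBits &Bits, std::size_t Index) {
  assert(Index < MaxFeatures && "feature index out of range");
  return Bits.test(Index);
}
```

With a plain 64-bit mask, a feature numbered 70 could not be represented at all; here `withFeature(FeatureBits(), 70)` sets exactly one bit and leaves every other index clear.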
* Refactor: Simplify boolean expressions in x86 target | David Blaikie | 2015-03-23 | 3 | -8/+5
    Simplify boolean expressions with `true` and `false` with `clang-tidy`
    Patch by Richard Thomson.
    Differential Revision: http://reviews.llvm.org/D8519
    llvm-svn: 233002
* Re-sort includes with sort-includes.py and insert raw_ostream.h where it's used. | Benjamin Kramer | 2015-03-23 | 1 | -1/+1
    llvm-svn: 232998
* Silence a GCC warning | David Majnemer | 2015-03-22 | 1 | -2/+2
    llvm-svn: 232923
* Fixed MSVC compile warning issue introduced in r232837 | Simon Pilgrim | 2015-03-22 | 1 | -1/+2
    - was reporting 'warning C4715: 'getType32' : not all control paths return a value'
    llvm-svn: 232913
* Cache the Function dependent subtarget on the MachineFunction. | Eric Christopher | 2015-03-21 | 1 | -1/+0
    As preparation for removing the getSubtargetImpl() call from
    TargetMachine, go ahead and flip the switch on caching the function
    dependent subtarget and remove the bare getSubtargetImpl call from the
    X86 port. As part of this, add a few tests that show we can generate code
    and assemble on X86 based on features/cpu on the Function.
    llvm-svn: 232879
* [X86] Prefer blendps over insertps codegen for one special case | Sanjay Patel | 2015-03-20 | 1 | -9/+22
    With this patch, for this one exact case, we'll generate:
        blendps %xmm0, %xmm1, $1
    instead of:
        insertps %xmm0, %xmm1, $0

    If there's a memory operand available for load folding and we're
    optimizing for size, we'll still generate the insertps.

    The detailed performance data motivation for this may be found in D7866;
    in summary, blendps has 2-3x throughput vs. insertps on widely used chips.
    Differential Revision: http://reviews.llvm.org/D8332
    llvm-svn: 232850
* X86: Make helper functions static. NFC. | Benjamin Kramer | 2015-03-20 | 1 | -4/+4
    llvm-svn: 232848
* Reorganize the x86 ELF relocation selection logic. | Rafael Espindola | 2015-03-20 | 1 | -176/+198
    The main differences are:
      * Split in 32 and 64 bit functions.
      * First switch on the Modifier so that we have only one non fully
        covered switch.
      * Map the fixup kind first to a x86_64 (or i386) specific enum, to make
        it easy to handle cases like X86::reloc_riprel_4byte_movq_load.
      * Switch on IsPCRel last, which reduces code duplication.
    Fixes pr22308.
    llvm-svn: 232837
* Stripped trailing whitespace. NFC. | Simon Pilgrim | 2015-03-20 | 1 | -15/+15
    llvm-svn: 232822
* Reduce indentation after return. NFC. | Rafael Espindola | 2015-03-20 | 1 | -138/+125
    llvm-svn: 232814
* Use early returns. NFC. | Rafael Espindola | 2015-03-20 | 1 | -104/+50
    llvm-svn: 232813
* Fold a llvm_unreachable into an assert. NFC. | Rafael Espindola | 2015-03-20 | 1 | -3/+3
    llvm-svn: 232811
* clang-format a function. NFC. | Rafael Espindola | 2015-03-20 | 1 | -12/+32
    llvm-svn: 232810
* move insert, extract, concat helper functions closer to related helper functions; NFCI | Sanjay Patel | 2015-03-19 | 1 | -156/+156
    llvm-svn: 232781
* [X86, AVX] use blends instead of insert128 with index 0 | Sanjay Patel | 2015-03-19 | 1 | -1/+44
    Another case of x86-specific shuffle strength reduction: avoid generating
    insert*128 instructions with index 0 because they are slower than their
    non-lane-changing blend equivalents.

    Shuffle lowering already catches most of these cases, but the zero vector
    case and some other paths such as in the modified test in
    vector-shuffle-256-v32.ll were getting through.
    Differential Revision: http://reviews.llvm.org/D8366
    llvm-svn: 232773
* Split the object streamer callback in one per file format. | Rafael Espindola | 2015-03-19 | 3 | -25/+7
    There are two main advantages to doing this:
      * Targets that only need to handle one of the formats specially don't
        have to worry about the others. For example, x86 now only registers a
        constructor for the COFF streamer.
      * Changes to the arguments passed to one format constructor will not
        impact the other formats.
    llvm-svn: 232699
* two or more, use a for. | Rafael Espindola | 2015-03-18 | 1 | -51/+32
    llvm-svn: 232688
* [X86][SSE] Avoid scalarization of v2i64 vector shifts (REAPPLIED) | Simon Pilgrim | 2015-03-18 | 1 | -13/+24
    Fixed broken tests.
    Differential Revision: http://reviews.llvm.org/D8416
    llvm-svn: 232682
* Revert "[X86][SSE] Avoid scalarization of v2i64 vector shifts" as it appears to have broken tests/bots. | Eric Christopher | 2015-03-18 | 1 | -24/+13
    This reverts commit r232660.
    llvm-svn: 232670
* [X86][SSE] Avoid scalarization of v2i64 vector shifts | Simon Pilgrim | 2015-03-18 | 1 | -13/+24
    Currently v2i64 vector shifts (non-equal shift amounts) are scalarized,
    costing 4 x extract, 2 x x86-shifts and 2 x insert instructions - and it
    gets even more awkward on 32-bit targets.

    This patch separately shifts the vector by both shift amounts and then
    shuffles the partial results back together, costing 2 x shuffles and
    2 x sse-shift instructions (+ 2 movs on pre-AVX hardware).

    Note - this patch only improves the SHL / LSHR logical shifts as only
    these are supported in SSE hardware.
    Differential Revision: http://reviews.llvm.org/D8416
    llvm-svn: 232660
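The shift-then-shuffle idea described in the entry above can be modeled in scalar C++. This is only a sketch of the arithmetic, assuming SSE-style semantics where a uniform shift by 64 or more produces zero; the names `V2i64`, `shlUniform` and `shlPerLane` are hypothetical, and the real lowering operates on SelectionDAG nodes rather than arrays.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

using V2i64 = std::array<uint64_t, 2>;

// Model of a uniform vector shift: both lanes are shifted left by the same
// amount. Amounts of 64 or more give zero, which also avoids the undefined
// behavior of an oversized shift in C++.
V2i64 shlUniform(V2i64 V, uint64_t Amt) {
  if (Amt >= 64)
    return V2i64{{0, 0}};
  return V2i64{{V[0] << Amt, V[1] << Amt}};
}

// Per-lane shift built from two uniform shifts plus a "shuffle":
// lane 0 is taken from the vector shifted by Amt[0], lane 1 from the
// vector shifted by Amt[1].
V2i64 shlPerLane(V2i64 V, V2i64 Amt) {
  V2i64 ShiftedBy0 = shlUniform(V, Amt[0]);
  V2i64 ShiftedBy1 = shlUniform(V, Amt[1]);
  return V2i64{{ShiftedBy0[0], ShiftedBy1[1]}};
}
```

Each lane of the result equals the corresponding input lane shifted by its own amount, so no lane has to be extracted, shifted as a scalar and reinserted.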
* Handle X86::reloc_riprel_4byte in 32 bits mode. | Rafael Espindola | 2015-03-18 | 1 | -0/+1
    We can get there with .code64. Fixes pr22349.
    llvm-svn: 232651
* Make EmitFunctionHeader a private helper. | Rafael Espindola | 2015-03-17 | 1 | -3/+0
    llvm-svn: 232481
* Pass in a "const Triple &T" instead of a raw StringRef. | Rafael Espindola | 2015-03-16 | 1 | -5/+3
    llvm-svn: 232429
* Remove unused argument. NFC. | Rafael Espindola | 2015-03-16 | 1 | -3/+3
    llvm-svn: 232428
* Fix uses of reserved identifiers starting with an underscore followed by an uppercase letter | David Blaikie | 2015-03-16 | 2 | -8/+8
    This covers essentially all of llvm's headers and libs. One or two weird
    cases I wasn't sure were worth/appropriate to fix.
    llvm-svn: 232394
* fix comments to match code; NFC | Sanjay Patel | 2015-03-16 | 1 | -3/+3
    llvm-svn: 232385
* Use the i8 immediate cmp instructions when possible. | Rafael Espindola | 2015-03-16 | 1 | -1/+8
    llvm-svn: 232378
* Don't repeat names in comments and clang-format this function. | Rafael Espindola | 2015-03-16 | 1 | -7/+10
    llvm-svn: 232375
* Make each target map all inline assembly memory constraints to InlineAsm::Constraint_m. NFC. | Daniel Sanders | 2015-03-16 | 1 | -0/+6
    Summary:
    This is instead of doing this in target independent code and is the last
    non-functional change before targets begin to distinguish between
    different memory constraints when selecting code for the ISD::INLINEASM
    node.

    Next, each target will individually move away from the idea that all
    memory constraints behave like 'm'.

    Subscribers: jholewinski, llvm-commits
    Differential Revision: http://reviews.llvm.org/D8173
    llvm-svn: 232373
* [llvm] Replacing asserts with static_asserts where appropriate | Gabor Horvath | 2015-03-16 | 2 | -2/+3
    Summary:
    This patch consists of the suggestions of clang-tidy/misc-static-assert
    check.

    Reviewers: alexfh
    Reviewed By: alexfh
    Subscribers: xazax.hun, llvm-commits
    Differential Revision: http://reviews.llvm.org/D8343
    llvm-svn: 232366