path: root/llvm/lib/Target/X86
Commit message | Author | Age | Files | Lines
* [X86][SSE] Update register class during MOVSD/MOVSS - BLENDPD/BLENDPS ↵Simon Pilgrim2016-10-071-0/+11
| | | | | | | | | | | | | | commutation MOVSD/MOVSS take a 128-bit register and a FR32/FR64 register input, the commutation code wasn't taking this into account leading to verification errors. This patch inserts a vreg copy mi to ensure that the registers are correct. Fix for PR30607 Differential Revision: https://reviews.llvm.org/D25280 llvm-svn: 283539
* [X86] Fix patterns for VPMULLD and VPCMPEQQ to not require aligned loads.Craig Topper2016-10-071-2/+2
| | | | llvm-svn: 283524
* [X86] Remove unused PatFrags. NFCCraig Topper2016-10-071-5/+0
| | | | llvm-svn: 283523
* Target: Remove unused patterns and transforms. NFC.Peter Collingbourne2016-10-071-4/+0
| | | | llvm-svn: 283515
* [X86] Preserve BasePtr for LEA64_32rMichael Kuperstein2016-10-061-3/+5
| | | | | | | | | | | | When replacing FrameIndex with BasePtr, we must preserve BasePtr for LEA64_32r since BasePtr is used later for stack adjustment if it is the same as StackPtr. Patch by H.J Lu <hjl.tools@gmail.com> Differential Revision: https://reviews.llvm.org/D23575 llvm-svn: 283486
* [X86] Fix intel syntax push parsing bugNirav Dave2016-10-061-2/+29
| | | | | | | | | | | | | | | Change erroneous parsing of push immediate instructions in intel syntax to default to pointer size by rewriting into the ATT style for matching. This fixes PR22028. Reviewers: majnemer, rnk Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D25288 llvm-svn: 283457
* [Triple] Add triple for FuchsiaPetr Hosek2016-10-063-0/+14
| | | | | | | | Fuchsia is a new operating system. Differential Revision: https://reviews.llvm.org/D25116 llvm-svn: 283419
* Modify df_iterator to support post-order actionsDavid Callahan2016-10-051-1/+1
| | | | | | | | | | | | Summary: This changes the state used to maintain visited information for the depth-first iterator. We now assume a method "completed(...)" which is called after all children of a node have been visited. In all existing cases, this method does nothing, so this patch has no functional changes. It will, however, allow a client to distinguish back edges from cross edges in a DFS tree. Reviewers: nadav, mehdi_amini, dberlin Subscribers: MatzeB, mzolotukhin, twoh, freik, llvm-commits Differential Revision: https://reviews.llvm.org/D25191 llvm-svn: 283391
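A minimal sketch of why a "completed" hook helps, assuming a hand-written DFS rather than LLVM's actual df_iterator. The names VisitState, isActive, EdgeKind, ClassifiedEdge and classifyEdges are hypothetical and exist only for this illustration. A node that has been inserted but not yet completed is still on the DFS stack, so an edge into it is a back edge; an edge into a completed node is a forward or cross edge.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <utility>
#include <vector>

// Hypothetical visited-set with a post-order hook. Not LLVM's real storage type.
struct VisitState {
  std::set<int> Visited; // every node discovered so far
  std::set<int> Active;  // discovered, but children not yet fully visited

  // Returns true the first time a node is seen.
  bool insert(int N) {
    bool IsNew = Visited.insert(N).second;
    if (IsNew)
      Active.insert(N);
    return IsNew;
  }
  // Called once all children of N have been visited.
  void completed(int N) { Active.erase(N); }
  bool isActive(int N) const { return Active.count(N) != 0; }
};

enum class EdgeKind { Tree, Back, ForwardOrCross };

struct ClassifiedEdge {
  int From;
  int To;
  EdgeKind Kind;
};

// Iterative DFS over an adjacency list, classifying each edge as it is seen.
std::vector<ClassifiedEdge>
classifyEdges(const std::vector<std::vector<int>> &Adj, int Root) {
  assert(Root >= 0 && static_cast<std::size_t>(Root) < Adj.size());
  VisitState State;
  std::vector<ClassifiedEdge> Result;
  std::vector<std::pair<int, std::size_t>> Stack; // (node, next child index)

  State.insert(Root);
  Stack.emplace_back(Root, std::size_t{0});
  while (!Stack.empty()) {
    int Node = Stack.back().first;
    std::size_t ChildIdx = Stack.back().second;
    if (ChildIdx == Adj[Node].size()) {
      State.completed(Node); // post-order action
      Stack.pop_back();
      continue;
    }
    Stack.back().second = ChildIdx + 1;
    int Child = Adj[Node][ChildIdx];
    if (State.insert(Child)) {
      Result.push_back({Node, Child, EdgeKind::Tree});
      Stack.emplace_back(Child, std::size_t{0});
    } else if (State.isActive(Child)) {
      Result.push_back({Node, Child, EdgeKind::Back});
    } else {
      Result.push_back({Node, Child, EdgeKind::ForwardOrCross});
    }
  }
  return Result;
}
```

With a no-op completed(), as in the existing clients the commit mentions, the Active set would never shrink and the traversal order would be unchanged, which is consistent with the patch being non-functional on its own.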
* Revert r282920 "X86: Allow conditional tail calls in Win64 "leaf" functions ↵Hans Wennborg2016-10-052-6/+6
| | | | | | | | | (PR26302)" This is suspected to cause a miscompile in Chromium. Reverting while investigating. llvm-svn: 283329
* [X86] Don't randomly encode %rip where illegalDouglas Katzman2016-10-052-4/+27
| | | | | | Differential Revision: https://reviews.llvm.org/D25112 llvm-svn: 283326
* [Target] move reciprocal estimate settings from TargetOptions to TargetLoweringSanjay Patel2016-10-042-12/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | The motivation for the change is that we can't have pseudo-global settings for codegen living in TargetOptions because that doesn't work with LTO. Ideally, these reciprocal attributes will be moved to the instruction-level via FMF, metadata, or something else. But making them function attributes is at least an improvement over the current state. The ingredients of this patch are: Remove the reciprocal estimate command-line debug option. Add TargetRecip to TargetLowering. Remove TargetRecip from TargetOptions. Clean up the TargetRecip implementation to work with this new scheme. Set the default reciprocal settings in TargetLoweringBase (everything is off). Update the PowerPC defaults, users, and tests. Update the x86 defaults, users, and tests. Note that if this patch needs to be reverted, the related clang patch checked in at r283251 should be reverted too. Differential Revision: https://reviews.llvm.org/D24816 llvm-svn: 283252
* [X86] Add MOV8rm_NOREX to switch in isReallyTriviallyReMaterializable to ↵Craig Topper2016-10-041-0/+1
| | | | | | match MOV8rm. llvm-svn: 283184
* [x86, SSE/AVX] allow 128/256-bit lowering for copysign vector intrinsics ↵Sanjay Patel2016-10-031-17/+27
| | | | | | | | | | | | | | | | (PR30433) This should fix: https://llvm.org/bugs/show_bug.cgi?id=30433 There are a couple of open questions about the codegen: 1. Should we let scalar ops be scalars and avoid vector constant loads/splats? 2. Should we have a pass to combine constants such as the inverted pair that we have here? Differential Revision: https://reviews.llvm.org/D25165 llvm-svn: 283119
* [AVX-512] Remove isCheapAsAMove flag from VMOVAPSZ128rm_NOVLX and friends.Craig Topper2016-10-031-1/+1
| | | | | | This was accidentally copy and pasted from other Pseudos in the file. llvm-svn: 283084
* [X86] Mark all sizes of (V)MOVUPD as trivially rematerializable.Craig Topper2016-10-033-24/+23
| | | | | | I don't know for sure that we truly need this, but it's the only vector load that isn't rematerializable. Making it consistent allows it to not be a special case in the td files. llvm-svn: 283083
* [X86][AVX2] Add support for combining target shuffles to VPERMD/VPERMPSSimon Pilgrim2016-10-021-3/+23
| | | | llvm-svn: 283080
* [X86][AVX] Ensure broadcast loads respect dependenciesSimon Pilgrim2016-10-021-0/+11
| | | | | | | | | | | | To allow broadcast loads of a non-zero'th vector element, lowerVectorShuffleAsBroadcast can replace a load with a new load with an adjusted address, but unfortunately we weren't ensuring that the new load respected the same dependencies. This patch adds a TokenFactor and updates all dependencies of the old load to reference the new load instead. Bug found during internal testing. Differential Revision: https://reviews.llvm.org/D25039 llvm-svn: 283070
* [X86] Don't set i64 ADDC/ADDE/SUBC/SUBE as Custom if the target isn't ↵Craig Topper2016-10-021-7/+4
| | | | | | 64-bit. This way we don't have to catch them and do nothing with them in ReplaceNodeResults. llvm-svn: 283066
* [X86] Fix indentation. NFCCraig Topper2016-10-021-1/+1
| | | | llvm-svn: 283065
* [X86][SSE] Cleaned up shuffle decode assertion messagesSimon Pilgrim2016-10-011-7/+11
| | | | llvm-svn: 283050
* Fix signed/unsigned warningSimon Pilgrim2016-10-011-2/+2
| | | | llvm-svn: 283041
* [X86][SSE] Add support for combining target shuffles to binary BLENDSimon Pilgrim2016-10-011-4/+30
| | | | | | We already had support for 1-input BLEND with zero - this adds support for 2-input BLEND as well. llvm-svn: 283040
* [X86][SSE] Always combine target shuffles to MOVSD/MOVSSSimon Pilgrim2016-10-013-10/+19
| | | | | | | | Now we can commute to BLENDPD/BLENDPS on SSE41+ targets if necessary, so simplify the combine matching where we can. This required me to add a couple of scalar math movsd/movss fold patterns that hadn't been needed in the past. llvm-svn: 283038
* [X86][SSE] Enable commutation from MOVSD/MOVSS to BLENDPD/BLENDPS on SSE41+ ↵Simon Pilgrim2016-10-012-0/+31
| | | | | | | | | | | | targets Instead of selecting between MOVSD/MOVSS and BLENDPD/BLENDPS at shuffle lowering by subtarget, this will help us select the instruction based on actual commutation requirements. We could possibly add BLENDPD/BLENDPS -> MOVSD/MOVSS commutation and MOVSD/MOVSS memory folding using a similar approach if it proves useful. I avoided adding AVX512 handling as I'm not sure when we should be making use of VBLENDPD/VBLENDPS on EVEX targets. llvm-svn: 283037
* [X86] Cleanup patterns for using VMOVDDUP for broadcasts.Craig Topper2016-10-011-6/+6
| | | | | | | | -Remove OptForSize. Not all of the backend follows the same rules for creating broadcasts and there is no conflicting pattern. -Don't stop selecting VEX VMOVDDUP when AVX512 is supported. We need VLX for EVEX VMOVDDUP. -Only use VMOVDDUP for v2i64 broadcasts if AVX2 is not supported. llvm-svn: 283020
* Use StringRef instead of raw pointers in MCAsmInfo/MCInstrInfo APIs (NFC)Mehdi Amini2016-10-013-8/+8
| | | | llvm-svn: 283018
* [AVX-512] Add EVEX versions of VPBROADCASTW patterns with truncated i32 loads.Craig Topper2016-10-012-1/+18
| | | | llvm-svn: 283015
* Use StringRef in Pass/PassManager APIs (NFC)Mehdi Amini2016-10-0114-17/+15
| | | | llvm-svn: 283004
* X86: Allow conditional tail calls in Win64 "leaf" functions (PR26302)Hans Wennborg2016-09-302-6/+6
| | | | | | | | | | | We can't use Jcc to leave a Win64 function in general, because that confuses the unwinder. However, for "leaf" functions, that is, functions where the return address is always on top of the stack and which don't have unwind info, it's OK. Differential Revision: https://reviews.llvm.org/D24836 llvm-svn: 282920
* [AVX-512] Store address operand should be an input operand for the special ↵Craig Topper2016-09-301-4/+4
| | | | | | stack spilling pseudos for XMM16-31 and YMM16-31 without VLX. llvm-svn: 282843
* [AVX-512] Add the special stack spilling pseudos for XMM16-31 and YMM16-31 ↵Craig Topper2016-09-301-0/+8
| | | | | | without VLX to the isFrameLoadOpcode and isFrameStoreOpcode. llvm-svn: 282842
* Revert r282835 "[AVX-512] Always use the full 32 register vector classes for ↵Craig Topper2016-09-301-15/+30
| | | | | | | | addRegisterClass regardless of whether AVX512/VLX is enabled or not." Turns out this doesn't pass verify-machineinstrs. llvm-svn: 282841
* [X86] Add AVX-512 VTs to findRepresentativeClass as well as v16i16 which was ↵Craig Topper2016-09-301-3/+5
| | | | | | | | also missing. Change register class to include the extra 16 AVX512 registers. I'm not completely sure what this method does or why all the 256-bit VTs returned VR128RegClass when the comments on the method definition say it should return the largest super register class. I just figured AVX-512 should be similar. llvm-svn: 282836
* [AVX-512] Always use the full 32 register vector classes for ↵Craig Topper2016-09-301-30/+15
| | | | | | | | | | addRegisterClass regardless of whether AVX512/VLX is enabled or not. If AVX512 is disabled, the registers should already be marked reserved. Pattern predicates and register classes on instructions should take care of most of the rest. Loads/stores and physical register copies for XMM16-31 and YMM16-31 without VLX have already been taken care of. I'm a little unclear why this changed the register allocation of the SSE2 run of the sad.ll test, but the registers selected appear to be valid after this change. llvm-svn: 282835
* [X86] Don't preserve Win64 SSE CSRs when SSE is disabledReid Kleckner2016-09-302-2/+9
| | | | | | | | | Code that doesn't use floating point and doesn't use SSE (kernel code) shouldn't save and restore SSE registers. Fixes PR30503 llvm-svn: 282819
* [X86] Avoid "unused" warnings if no assertsDouglas Katzman2016-09-291-2/+4
| | | | llvm-svn: 282732
* [X86][SSE] Added common helper for shuffle mask constant pool decodes.Simon Pilgrim2016-09-291-164/+136
| | | | | | | | | | The shuffle mask decodes have a large amount of repeated code extracting/splitting mask values from Constant data. This patch pulls all of this duplicated code into a single helper function to identify undef elements and combine/split constant integer data into the requested shuffle mask elements. Updated PSHUFB/VPERMIL/VPERMIL2/VPPERM decoders to use it (VPERMV/VPERMV3 could be converted as well in the future). llvm-svn: 282720
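A rough sketch of the "combine/split constant integer data into the requested shuffle mask elements" idea, under simplifying assumptions: element widths are whole bytes, data is little-endian, and an output element counts as undef only when every byte feeding it is undef (undef bytes in a partially defined element are treated as zero). MaskElt and regroupMask are hypothetical names; this is not the helper from the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical mask element: a raw integer value plus an undef flag.
struct MaskElt {
  std::uint64_t Value;
  bool IsUndef;
};

// Re-slices constant data from SrcBits-wide elements into DstBits-wide ones.
// Splitting (SrcBits > DstBits) and combining (SrcBits < DstBits) share one path:
// flatten to bytes, then regroup.
std::vector<MaskElt> regroupMask(const std::vector<MaskElt> &Src,
                                 unsigned SrcBits, unsigned DstBits) {
  assert(SrcBits > 0 && DstBits > 0 && SrcBits <= 64 && DstBits <= 64);
  assert(SrcBits % 8 == 0 && DstBits % 8 == 0);

  // Flatten every source element into little-endian bytes.
  std::vector<MaskElt> Bytes;
  for (const MaskElt &E : Src)
    for (unsigned B = 0; B < SrcBits / 8; ++B)
      Bytes.push_back({(E.Value >> (8 * B)) & 0xFF, E.IsUndef});

  const std::size_t BytesPerElt = DstBits / 8;
  assert(Bytes.size() % BytesPerElt == 0 && "total width must divide evenly");

  // Regroup the bytes into destination elements.
  std::vector<MaskElt> Dst;
  for (std::size_t I = 0; I < Bytes.size(); I += BytesPerElt) {
    MaskElt Out{0, true};
    for (std::size_t B = 0; B < BytesPerElt; ++B) {
      const MaskElt &Byte = Bytes[I + B];
      if (Byte.IsUndef)
        continue; // contributes nothing; element stays undef unless another byte is defined
      Out.IsUndef = false;
      Out.Value |= Byte.Value << (8 * B);
    }
    Dst.push_back(Out);
  }
  return Dst;
}
```

Each decoder would then only need to interpret the regrouped elements for its own instruction, which is the kind of duplication the commit describes removing.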
* [AVX-512] Support spills of XMM16-31 and YMM16-31 when VLX isn't available.Craig Topper2016-09-292-8/+137
| | | | | | | | This adds new pseudo instructions that can be selected during register allocation to represent loads and stores of XMM/YMM registers when AVX512F is available, but VLX isn't. They will be converted to VEX encoded moves if the register turns out to be XMM0-15/YMM0-15. Otherwise either an EVEX VEXTRACT(store) or VBROADCAST(load) will be used. Fixes one of the cases from PR29112. llvm-svn: 282690
* [AVX-512] Replicate pattern from AVX to select VMOVDDUP for (v2f64 ↵Craig Topper2016-09-291-2/+6
| | | | | | (X86VBroadcast f64:)). Add AVX512VL to command line of existing AVX2 test that hits this condition. llvm-svn: 282688
* [X86] Add EVEX encoded VBROADCASTSS/SD and VPBROADCASTD/Q to execution ↵Craig Topper2016-09-291-0/+10
| | | | | | domain fixing table. llvm-svn: 282687
* [X86] Remove AddedComplexity adjustments that don't seem to be needed.Craig Topper2016-09-291-6/+4
| | | | llvm-svn: 282686
* [X86] Add VBROADCASTF128/VBROADCASTI128 to execution domain fixing tables.Craig Topper2016-09-291-1/+2
| | | | llvm-svn: 282684
* [x86] Accept 'retn' as an alias to 'ret[lqw]'\'ret' (At&t\Intel)Marina Yatsina2016-09-281-0/+6
| | | | | | | | | | Implement 'retn' simply by aliasing it to the relevant 'ret' instruction. Commit on behalf of coby. Differential Revision: https://reviews.llvm.org/D24346 llvm-svn: 282601
* [X86][FastISel] Use a COPY from K register to a GPR instead of a K operationGuy Blank2016-09-281-27/+31
| | | | | | | | | | | The KORTEST was introduced due to a bug where a TEST instruction used a K register. It turns out that the opposite case, KORTEST using a GPR, is now happening. The change removes the KORTEST flow and adds a COPY instruction from the K reg to a GPR. Differential Revision: https://reviews.llvm.org/D24953 llvm-svn: 282580
* Strip trailing whitespaceSimon Pilgrim2016-09-281-1/+1
| | | | llvm-svn: 282579
* [x86] add folds for FP logic with vector zerosSanjay Patel2016-09-271-17/+34
| | | | | | | | | | | | The 'or' case shows up in copysign. The copysign code also had redundant checking for a scalar zero operand with 'and', so I removed that. I'm not sure how to test vector 'and', 'andn', and 'xor' yet, but it seems better to just include all of the logic ops since we're fixing 'or' anyway. llvm-svn: 282546
* [x86] use isNullFPConstant(); NFCISanjay Patel2016-09-271-40/+35
| | | | | | Also, put the related FP logic functions together to see the similarities. llvm-svn: 282522
* [X86] Use std::max to calculate alignment instead of assuming RC->getSize() ↵Craig Topper2016-09-271-2/+2
| | | | | | will not return a value greater than 32. I think it theoretically could be 64 for AVX-512. llvm-svn: 282471
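A small sketch of the difference this makes, assuming the alignment has a floor of 16 bytes (the floor value and both function names are assumptions for illustration; RegClassSizeInBytes stands in for RC->getSize()). A check that only special-cases 32 would under-align a 64-byte register class, while std::max scales with the size.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical "before" shape: only a 32-byte class gets more than 16-byte alignment.
unsigned spillAlignmentSpecialCased(unsigned RegClassSizeInBytes) {
  return RegClassSizeInBytes == 32 ? 32u : 16u;
}

// Hypothetical "after" shape: alignment follows the register class size,
// but never drops below the assumed 16-byte floor.
unsigned spillAlignment(unsigned RegClassSizeInBytes) {
  assert(RegClassSizeInBytes > 0 && "register class must have a size");
  return std::max(RegClassSizeInBytes, 16u);
}
```

For sizes up to 32 bytes the two agree; they diverge only for a 64-byte class, which matches the commit's note that the larger size is a theoretical AVX-512 case.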
* [CodeGen] Add support for emitting .init_array instead of .ctors on FreeBSD.Davide Italiano2016-09-263-0/+15
| | | | | PR: 30494 llvm-svn: 282451
* Add support for Code16GCCNirav Dave2016-09-261-20/+42
| | | | | | | | | | | | | [X86] The .code16gcc directive parses X86 assembly input in 32-bit mode and outputs in 16-bit mode. Teach parser to switch modes appropriately. Reviewers: dwmw2, craig.topper Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D20109 llvm-svn: 282430