summaryrefslogtreecommitdiffstats
path: root/llvm/lib/Target/X86
Commit message (Collapse)AuthorAgeFilesLines
...
* AVX512: Enabled VPBROADCASTB lowering for v64i8 vectors.Igor Breger2015-10-261-0/+1
| | | | | | Differential Revision: http://reviews.llvm.org/D13896 llvm-svn: 251287
* AVX-512: Use correct extract vector length.Igor Breger2015-10-261-1/+1
| | | | | | | | Bug https://llvm.org/bugs/show_bug.cgi?id=25318 Differential Revision: http://reviews.llvm.org/D14062 llvm-svn: 251285
* AVX512: Add AVX-512 not materializable instructions. Igor Breger2015-10-261-1/+29
| | | | | | | | | | Otherwise value can be reused , despite its value could be changed - produces incorrect assembler. https://llvm.org/bugs/show_bug.cgi?id=25270 Differential Revision: http://reviews.llvm.org/D14057 llvm-svn: 251275
* [MC] Don't crash when .word is given bogus valuesDavid Majnemer2015-10-261-1/+10
| | | | | | | | | | We didn't validate that the .word directive was given a sane value, leading to crashes when we attempt to write out the object file. Instead, perform some validation and issue a diagnostic pointing at the start of the diagnostic. llvm-svn: 251270
* Convert assert(false) into llvm_unreachable where it makes sense.Benjamin Kramer2015-10-251-2/+1
| | | | llvm-svn: 251266
* [X86][SSE4A] Fix for EXTRQI shuffle lowering.Simon Pilgrim2015-10-251-2/+2
| | | | | | Incorrect range test - found during fuzz testing. llvm-svn: 251245
* Scalarizer for masked.gather and masked.scatter intrinsics.Elena Demikhovsky2015-10-252-0/+29
| | | | | | | | | | When the target does not support these intrinsics they should be converted to a chain of scalar load or store operations. If the mask is not constant, the scalarizer will build a chain of conditional basic blocks. I added isLegalMaskedGather() isLegalMaskedScatter() APIs. Differential Revision: http://reviews.llvm.org/D13722 llvm-svn: 251237
* [X86] Use correct calling convention for MCU psABI libcallsMichael Kuperstein2015-10-252-0/+27
| | | | | | | | | | | | When using the MCU psABI, compiler-generated library calls should pass some parameters in-register. However, since inreg marking for x86 is currently done by the front end, it will not be applied to backend-generated calls. This is a workaround for PR3997, which describes a similar issue for -mregparm. Differential Revision: http://reviews.llvm.org/D13977 llvm-svn: 251223
* [X86] Add support for elfiamcu tripleMichael Kuperstein2015-10-251-0/+1
| | | | | | | | This adds support for the i?86-*-elfiamcu triple, which indicates the IAMCU psABI is used. Differential Revision: http://reviews.llvm.org/D13977 llvm-svn: 251222
* Remove two unnecessary conversions from MVT to EVT. NFCCraig Topper2015-10-251-2/+2
| | | | llvm-svn: 251219
* [X86][SSE] Use lowerVectorShuffleWithUNPCK instead of custom matches.Simon Pilgrim2015-10-241-104/+40
| | | | | | Most 128-bit and 256-bit shuffles were manually matching UNPCK patterns - use lowerVectorShuffleWithUNPCK to be more thorough. llvm-svn: 251211
* [X86][SSE] lowerVectorShuffleWithUNPCK - use equivalent shuffle mask test.Simon Pilgrim2015-10-241-25/+14
| | | | | | Use isShuffleEquivalent to match UNPCK shuffles - better support for build vector inputs. llvm-svn: 251207
* X86ISelLowering: Support tail calls to/from callee pop functionsHans Wennborg2015-10-241-22/+33
| | | | | | | | | This enables tail calls with thiscall, stdcall, vectorcall and fastcall functions. Differential Revision: http://reviews.llvm.org/D13999 llvm-svn: 251190
* Fix unused variable warning. NFC.Simon Pilgrim2015-10-241-2/+1
| | | | llvm-svn: 251189
* [X86][XOP] Add support for lowering vector rotationsSimon Pilgrim2015-10-241-0/+47
| | | | | | | | | | This patch adds support for lowering to the XOP VPROT / VPROTI vector bit rotation instructions. This has required changes to the DAGCombiner rotation pattern matching to support vector types - so far I've only changed it to support splat vectors, but generalising this further is feasible in the future. Differential Revision: http://reviews.llvm.org/D13851 llvm-svn: 251188
* [X86] Clean up the tail call eligibility logicReid Kleckner2015-10-231-32/+38
| | | | | | | | | | | | | | | | | | | Summary: The logic here isn't straightforward because our support for TargetOptions::GuaranteedTailCallOpt. Also fix a bug where we were allowing tail calls to cdecl functions from fastcall and vectorcall functions. We were special casing thiscall and stdcall callers rather than checking for any convention that requires clearing stack arguments before returning. Reviewers: hans Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D14024 llvm-svn: 251137
* [CodeGen] Mark setjmp/catchret MBBs address-takenJoseph Tremoulet2015-10-232-1/+5
| | | | | | | | | | | | | | | | | | | | | | | | Summary: This ensures that BranchFolding (and similar) won't remove these blocks. Also allow AsmPrinter::EmitBasicBlockStart to process MBBs which are address-taken but do not have BBs that are address-taken, since otherwise its call to getAddrLabelSymbolTableToEmit would fail an assertion on such blocks. I audited the other callers of getAddrLabelSymbolTableToEmit (and getAddrLabelSymbol); they all have BBs known to be address-taken except for the call through getAddrLabelSymbol from WinException::create32bitRef; that call is actually now unreachable, so I've removed it and updated the signature of create32bitRef. This fixes PR25168. Reviewers: majnemer, andrew.w.kaylor, rnk Subscribers: pgavlin, llvm-commits Differential Revision: http://reviews.llvm.org/D13774 llvm-svn: 251113
* Remove the last traces of X86CompilationCallback as it is completelyEric Christopher2015-10-222-78/+0
| | | | | | unused. llvm-svn: 251035
* [X86][AVX512] extend vcvtph2ps to support xmm/ymm and sae versionsAsaf Badouh2015-10-223-14/+45
| | | | | | Differential Revision: http://reviews.llvm.org/D13945 llvm-svn: 251018
* AVX-512: Fixed a bug in select_cc for i1 typeElena Demikhovsky2015-10-221-0/+1
| | | | | | | | | | | Fixed faiure: LLVM ERROR: Cannot select: t33: i1 = select_cc t25, Constant:i32<0>, t45, t42, seteq:ch added a test Differential Revision: http://reviews.llvm.org/D13943 llvm-svn: 250996
* Partially reverted changes from r250686Elena Demikhovsky2015-10-221-2/+4
| | | | | | | | Clang runtime failure was reported. Assertion failed: (isExtended() && "Type is not extended!"), function getTypeForEVT I'll need to add a proper handling for PointerType in masked load/store intrinsics. llvm-svn: 250995
* [x86] move recursive add match for LEA to helper function; NFCISanjay Patel2015-10-211-30/+37
| | | | llvm-svn: 250926
* [X86] Add AMD mwaitx, monitorx, and clzero instructions to the assembly ↵Craig Topper2015-10-211-0/+26
| | | | | | parser and disassembler. llvm-svn: 250911
* Do not use `dyn_cast<X>` after `isa<X>` (NFC)Mehdi Amini2015-10-211-1/+1
| | | | | From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 250883
* AVX512: Implemented encoding and intrinsics for VPBROADCASTB/W/D/Q instructions.Igor Breger2015-10-203-106/+109
| | | | | | Differential Revision: http://reviews.llvm.org/D13884 llvm-svn: 250819
* X86: Remove implicit ilist iterator conversions, NFCDuncan P. N. Exon Smith2015-10-198-31/+24
| | | | llvm-svn: 250741
* Remove CRLF newlines. NFC.Benjamin Kramer2015-10-191-6/+6
| | | | llvm-svn: 250698
* Removed parameter "Consecutive" from isLegalMaskedLoad() / isLegalMaskedStore().Elena Demikhovsky2015-10-192-14/+11
| | | | | | | | | | Originally I planned to use the same interface for masked gather/scatter and set isConsecutive to "false" in this case. Now I'm implementing masked gather/scatter and see that the interface is inconvenient. I want to add interfaces isLegalMaskedGather() / isLegalMaskedScatter() instead of using the "Consecutive" parameter in the existing interfaces. Differential Revision: http://reviews.llvm.org/D13850 llvm-svn: 250686
* [X86][AVX512DQ] add scalar fpclassAsaf Badouh2015-10-184-6/+61
| | | | | | Differential Revision: http://reviews.llvm.org/D13769 llvm-svn: 250650
* AVX512: Lowering i8/i16 vector CTLZ using the dword LZCNT vector instructionIgor Breger2015-10-182-23/+123
| | | | | | Differential Revision: http://reviews.llvm.org/D13632 llvm-svn: 250649
* [X86][XOP] Add VPROT instruction opcodesSimon Pilgrim2015-10-175-33/+32
| | | | | | Added X86ISD opcodes for VPROT vector rotate by variable and by immediate. llvm-svn: 250620
* Replace a custom table sort check with std::is_sorted. Change a function to ↵Craig Topper2015-10-171-17/+8
| | | | | | take ArrayRef instead of pointer and length. NFC llvm-svn: 250615
* [CostModel] Fixed AVX integer shift costsSimon Pilgrim2015-10-171-12/+36
| | | | | | Targets with AVX but without AVX2 were incorrectly reporting costs of 256-bit integer shifts. llvm-svn: 250611
* [X86][FastISel] Teach how to select SSE4A nontemporal stores.Simon Pilgrim2015-10-171-4/+15
| | | | | | | | | | Add FastISel support for SSE4A scalar float / double non-temporal stores Follow up to D13698 Differential Revision: http://reviews.llvm.org/D13773 llvm-svn: 250610
* [WinEH] Fix stack alignment in funclets and ParentFrameOffset calculationReid Kleckner2015-10-162-3/+35
| | | | | | | | | | | | Our previous value of "16 + 8 + MaxCallFrameSize" for ParentFrameOffset is incorrect when CSRs are involved. We were supposed to have a test case to catch this, but it wasn't very rigorous. The main effect here is that calling _CxxThrowException inside a catchpad doesn't immediately crash on MOVAPS when you have an odd number of CSRs. llvm-svn: 250583
* [x86] promote 'add nsw' to a wider type to allow more combinesSanjay Patel2015-10-161-0/+54
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The motivation for this patch starts with PR20134: https://llvm.org/bugs/show_bug.cgi?id=20134 void foo(int *a, int i) { a[i] = a[i+1] + a[i+2]; } It seems better to produce this (14 bytes): movslq %esi, %rsi movl 0x4(%rdi,%rsi,4), %eax addl 0x8(%rdi,%rsi,4), %eax movl %eax, (%rdi,%rsi,4) Rather than this (22 bytes): leal 0x1(%rsi), %eax cltq leal 0x2(%rsi), %ecx movslq %ecx, %rcx movl (%rdi,%rcx,4), %ecx addl (%rdi,%rax,4), %ecx movslq %esi, %rax movl %ecx, (%rdi,%rax,4) The most basic problem (the first test case in the patch combines constants) should also be fixed in InstCombine, but it gets more complicated after that because we need to consider architecture and micro-architecture. For example, AArch64 may not see any benefit from the more general transform because the ISA solves the sexting in hardware. Some x86 chips may not want to replace 2 ADD insts with 1 LEA, and there's an attribute for that: FeatureSlowLEA. But I suspect that doesn't go far enough or maybe it's not getting used when it should; I'm also not sure if FeatureSlowLEA should also mean "slow complex addressing mode". I see no perf differences on test-suite with this change running on AMD Jaguar, but I see small code size improvements when building clang and the LLVM tools with the patched compiler. A more general solution to the sext(add nsw(x, C)) problem that works for multiple targets is available in CodeGenPrepare, but it may take quite a bit more work to get that to fire on all of the test cases that this patch takes care of. Differential Revision: http://reviews.llvm.org/D13757 llvm-svn: 250560
* Fix assertion failure with fp128 to unsigned i64 conversionAndrew Kaylor2015-10-161-9/+5
| | | | | | | | Patch by Mitch Bodart Differential Revision: http://reviews.llvm.org/D13780 llvm-svn: 250550
* [X86] Add fxsr feature flag for fxsave/fxrestore instructions.Craig Topper2015-10-165-42/+74
| | | | llvm-svn: 250497
* Revert "[safestack] Fast access to the unsafe stack pointer on AArch64/Android."Evgeniy Stepanov2015-10-152-11/+9
| | | | | | Breaks the hexagon buildbot. llvm-svn: 250461
* [safestack] Fast access to the unsafe stack pointer on AArch64/Android.Evgeniy Stepanov2015-10-152-9/+11
| | | | | | | | | | | | | | | | Android libc provides a fixed TLS slot for the unsafe stack pointer, and this change implements direct access to that slot on AArch64 via __builtin_thread_pointer() + offset. This change also moves more code into TargetLowering and its target-specific subclasses to get rid of target-specific codegen in SafeStackPass. This change does not touch the ARM backend because ARM lowers builting_thread_pointer as aeabi_read_tp, which is not available on Android. llvm-svn: 250456
* x86: preserve flags when folding atomic operationsJF Bastien2015-10-151-15/+20
| | | | | | | | | | D4796 taught LLVM to fold some atomic integer operations into a single instruction. The pattern was unaware that the instructions clobbered flags. I fixed some of this issue in D13680 but had missed INC/DEC. This patch adds the missing EFLAGS definition. llvm-svn: 250438
* x86 FP atomic codegen: don't drop globals, stackJF Bastien2015-10-151-5/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | Summary: x86 codegen is clever about generating good code for relaxed floating-point operations, but it was being silly when globals and immediates were involved, forgetting where the global was and loading/storing from/to the wrong place. The same applied to hard-coded address immediates. Don't let it forget about the displacement. This fixes https://llvm.org/bugs/show_bug.cgi?id=25171 A very similar bug when doing floating-points atomics to the stack is also fixed by this patch. This fixes https://llvm.org/bugs/show_bug.cgi?id=25144 Reviewers: pete Subscribers: llvm-commits, majnemer, rsmith Differential Revision: http://reviews.llvm.org/D13749 llvm-svn: 250429
* [X86] Rip out orphaned method declarations and other dead code. NFC.Benjamin Kramer2015-10-157-53/+0
| | | | llvm-svn: 250406
* AVX512: Implemented DAG lowering for shuff62x2/shufi62x2 instructions ( ↵Igor Breger2015-10-154-1/+133
| | | | | | | | shuffle packed values at 128-bit granularity ) Differential Revision: http://reviews.llvm.org/D13648 llvm-svn: 250400
* AVX512: Implemented encoding and intrinsics for vpternlogd/q.Igor Breger2015-10-155-4/+98
| | | | | | Differential Revision: http://reviews.llvm.org/D13768 llvm-svn: 250396
* AVX-512: Fixed a bug in shuffle lowering 32-bit modeElena Demikhovsky2015-10-151-6/+35
| | | | | | | | | AVX-512 bit shuffle fails on 32 bit since we create a vector of 64-bit constants. I split 8x64-bit const vector to 16x32 on 32-bit mode. Differential Revision: http://reviews.llvm.org/D13644 llvm-svn: 250390
* Add XSAVE/XSAVEOPT to KNL processor.Craig Topper2015-10-151-0/+2
| | | | llvm-svn: 250362
* [x86][FastISel] Teach how to select nontemporal stores.Andrea Di Biagio2015-10-141-16/+34
| | | | | | | | | | | | | | | | | | | | | | This patch teaches x86 fast-isel how to select nontemporal stores. On x86, we can use MOVNTI for nontemporal stores of doublewords/quadwords. Instructions (V)MOVNTPS/PD/DQ can be used for SSE2/AVX aligned nontemporal vector stores. Before this patch, fast-isel always selected 'movd/movq' instead of 'movnti' for doubleword/quadword nontemporal stores. In the case of nontemporal stores of aligned vectors, fast-isel always selected movaps/movapd/movdqa instead of movntps/movntpd/movntdq. With this patch, if we use SSE2/AVX intrinsics for nontemporal stores we now always get the expected (V)MOVNT instructions. The lack of fast-isel support for nontemporal stores was spotted when analyzing the -O0 codegen for nontemporal stores. Differential Revision: http://reviews.llvm.org/D13698 llvm-svn: 250285
* [X86] Add XSAVE feature flags to their various processors.Craig Topper2015-10-141-3/+23
| | | | llvm-svn: 250268
* function names should start with a lower case letter; NFCSanjay Patel2015-10-134-116/+116
| | | | llvm-svn: 250174
OpenPOWER on IntegriCloud